Dr. Fan Dong is a Research Fellow at the Centre for Artificial Intelligence, University of Technology Sydney, NSW, Australia, and is the recipient of a CSIRO SIEF STEM+ Business Fellowship. He received dual Ph.D. degrees in computer science from Beijing Institute of Technology, China, and University of Technology Sydney, Australia, in 2018.
Can supervise: YES
Current research interests
- Text classification
- Natural language processing
- Data mining and knowledge discovery
Past research interests
- Concept drift detection
- Learning under concept drift
- Data stream mining under non-stationary environment
© IEEE. Concept drift describes unforeseeable changes in the underlying distribution of streaming data over time. Concept drift research involves the development of methodologies and techniques for drift detection, understanding and adaptation. Data analysis has revealed that machine learning in a concept drift environment will result in poor learning results if the drift is not addressed. To help researchers identify which research topics are significant and how to apply related techniques in data analysis tasks, it is necessary that a high-quality, instructive review of current research developments and trends in the concept drift field is conducted. In addition, due to the rapid development of concept drift research in recent years, the methodologies of learning under concept drift have become noticeably systematic, unveiling a framework which has not been mentioned in the literature. This paper reviews over 130 high-quality publications in concept drift related research areas, analyzes up-to-date developments in methodologies and techniques, and establishes a framework of learning under concept drift including three main components: concept drift detection, concept drift understanding, and concept drift adaptation. This paper lists and discusses 10 popular synthetic datasets and 14 publicly available benchmark datasets used for evaluating the performance of learning algorithms aiming at handling concept drift. Concept drift related research directions are also covered and discussed. By providing state-of-the-art knowledge, this survey will directly support researchers in their understanding of research developments in the field of learning under concept drift.
Dong, F, Lu, J, Zhang, G & Li, K 2018, 'Active fuzzy weighting ensemble for dealing with concept drift', International Journal of Computational Intelligence Systems, vol. 11, no. 1, pp. 438-450.
© 2018, the Authors. The concept drift problem is a pervasive phenomenon in real-world data stream applications. It makes well-trained static learning models lose accuracy and become outdated as time goes by. The existence of different types of concept drift makes it more difficult for learning algorithms to track. This paper proposes a novel adaptive ensemble algorithm, the Active Fuzzy Weighting Ensemble, to handle data streams involving concept drift. While processing the data instances in a data stream, our algorithm first identifies whether or not a drift has occurred. Once a drift is confirmed, it uses the data instances accumulated by the drift detection method to create a new base classifier. Then, it applies fuzzy instance weighting and a dynamic voting strategy to organize all the existing base classifiers into an ensemble learning model. Experimental evaluations on seven datasets show that our proposed algorithm can shorten the time needed to recover from an accuracy drop when concept drift occurs, adapt to different types of concept drift, and obtain better performance at a lower computation cost than the other adaptive ensembles.
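The detect, retrain and vote loop described in this abstract can be illustrated with a minimal sketch. It is not the Active Fuzzy Weighting Ensemble: every name below (CentroidClassifier, DriftTriggeredEnsemble, x_at, label) is hypothetical, the drift detector is a plain sliding-window error rate with assumed warning and drift thresholds, the base learner is a toy nearest-class-mean rule on a single feature, and the fuzzy instance weighting step is left out. The only part that mirrors the abstract is the order of operations: confirm a drift, build a new base classifier from the instances accumulated by the detector, then combine all members by weighted voting.

```python
from collections import deque


class CentroidClassifier:
    """Toy base learner (an assumption): nearest class mean on one feature."""

    def __init__(self, batch):
        sums, counts = {}, {}
        for x, y in batch:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: sums[y] / counts[y] for y in sums}

    def predict(self, x):
        return min(self.centroids, key=lambda y: abs(x - self.centroids[y]))


class DriftTriggeredEnsemble:
    """Hypothetical detect-then-extend ensemble; not the published algorithm."""

    def __init__(self, initial_batch, window=20, warn=0.2, drift=0.5, decay=0.9):
        self.members = [[CentroidClassifier(initial_batch), 1.0]]  # [model, weight]
        self.errors = deque(maxlen=window)  # 1 = the ensemble misclassified
        self.buffer = []                    # instances accumulated under warning
        self.warn, self.drift, self.decay = warn, drift, decay
        self.drift_points = []              # stream positions where drift fired
        self.seen = 0

    def predict(self, x):
        votes = {}
        for model, weight in self.members:
            vote = model.predict(x)
            votes[vote] = votes.get(vote, 0.0) + weight
        return max(votes, key=votes.get)

    def learn_one(self, x, y):
        self.errors.append(int(self.predict(x) != y))
        # Voting weights: exponentially smoothed accuracy of each member.
        for member in self.members:
            hit = 1.0 if member[0].predict(x) == y else 0.0
            member[1] = self.decay * member[1] + (1 - self.decay) * hit
        full = len(self.errors) == self.errors.maxlen
        rate = sum(self.errors) / len(self.errors)
        if full and rate > self.warn:
            self.buffer.append((x, y))      # keep instances while under warning
        else:
            self.buffer.clear()
        if full and rate > self.drift:
            # Drift confirmed: train a new member on the accumulated instances.
            self.members.append([CentroidClassifier(self.buffer), 1.0])
            self.drift_points.append(self.seen)
            self.errors.clear()
            self.buffer.clear()
        self.seen += 1


def x_at(i):
    """Deterministic toy feature cycling through 0.05, 0.15, ..., 0.95."""
    return ((i * 7) % 10) / 10 + 0.05


def label(x, flipped):
    """Toy concept: class 1 when x >= 0.5; the drifted concept swaps the classes."""
    return int((x >= 0.5) != flipped)


# Train on 20 instances of the first concept, then stream 180 more,
# with an abrupt drift at stream index 100.
initial = [(x_at(i), label(x_at(i), False)) for i in range(20)]
model = DriftTriggeredEnsemble(initial)
for i in range(20, 200):
    x = x_at(i)
    model.learn_one(x, label(x, i >= 100))
print(len(model.members), model.drift_points)
```

Because the outdated member keeps being wrong after the drift, its smoothed weight decays and the newly trained member dominates the vote, which is the effect a dynamic voting strategy is meant to achieve.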
Dong, F, Zhang, G, Lu, J & Li, K 2018, 'Fuzzy competence model drift detection for data-driven decision support systems', Knowledge-Based Systems, vol. 143, pp. 284-294.
© 2017 Elsevier B.V. This paper focuses on concept drift in business intelligence and data-driven decision support systems (DSSs). The assumption of a fixed distribution in the data renders conventional static DSSs inaccurate and unable to make correct decisions when concept drift occurs. It is therefore important to know when, how, and where concept drift occurs so that a DSS can adjust its decision processing knowledge to adapt to an ever-changing environment at the appropriate time. This paper presents a data distribution-based concept drift detection method called fuzzy competence model drift detection (FCM-DD). By introducing fuzzy set theory and replacing crisp boundaries with fuzzy ones, we have improved the competence model to provide a better, more refined empirical distribution of the data stream. FCM-DD requires no prior knowledge of the underlying distribution and provides a statistical guarantee of the reliability of the detected drift, based on the theory of bootstrapping. A series of experiments shows that our proposed FCM-DD method can detect drift more accurately, has good sensitivity, and is robust.
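The general idea of a distribution-based detector with a resampling-based significance level can be sketched as follows. This is not FCM-DD: the fuzzy competence model is replaced by a plain difference in means between two windows, and the bootstrap is replaced by a simple permutation test. The function names drift_p_value and detect_drift, the window contents and the default alpha are all assumptions made for illustration.

```python
import random
from statistics import mean


def drift_p_value(reference, recent, n_resamples=2000, seed=0):
    """Permutation-test p-value for a shift in mean between two windows."""
    rng = random.Random(seed)
    observed = abs(mean(reference) - mean(recent))
    pooled = list(reference) + list(recent)
    n_ref = len(reference)
    hits = 0
    for _ in range(n_resamples):
        # Under the no-drift hypothesis the window assignment is arbitrary,
        # so reshuffling the pooled data simulates the null distribution.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_ref]) - mean(pooled[n_ref:]))
        if stat >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


def detect_drift(reference, recent, alpha=0.05, **kwargs):
    """Flag a drift when the resampled p-value falls below alpha."""
    return drift_p_value(reference, recent, **kwargs) < alpha


# Hypothetical usage: a reference window and a recent window shifted by 5.
reference = [i % 10 for i in range(100)]
shifted = [value + 5 for value in reference]
print(detect_drift(reference, shifted, n_resamples=500))
```

Like the method in the abstract, this kind of test needs no prior knowledge of the underlying distribution; the significance level alpha bounds the rate of false alarms when the two windows really come from the same distribution.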
Dong, F, Lu, J, Li, K & Zhang, G 2017, 'Concept Drift Region Identification via Competence-based Discrepancy Distribution Estimation', 12th International Conference on Intelligent Systems and Knowledge Engineering, Nanjing, China.
Dong, F, Lu, J, Zhang, G & Li, K 2014, 'A modified Learn++.NSE algorithm for dealing with concept drift', Decision Making and Soft Computing, International Fuzzy Logic and Intelligent technologies in Nuclear Science Conference, World Scientific, Brazil, pp. 556-561.
Concept drift is a pervasive phenomenon in real-world applications. Because concept drift can take many different forms, it is difficult for a learning algorithm to track it closely. Learn++.NSE is an incremental ensemble learner that makes no assumption about the type of drift. Although it handles concept drift well, it has a high computation cost and needs more time to recover from an accuracy drop. This paper proposes a modified Learn++.NSE algorithm. While learning from the instances in a data stream, our algorithm first identifies where and when a drift has occurred, then uses the instances accumulated by the drift detection method to create a new base classifier, and finally organizes all existing classifiers with the Learn++.NSE weighting mechanism to update the ensemble learner. The modified algorithm reduces the computation cost without any drop in performance and speeds up the recovery of accuracy after a drift occurs.