
Dr Wei Liu

Biography

Dr Wei Liu is a Data Science Research Program Leader and a Lecturer at the Advanced Analytics Institute, Faculty of Engineering and IT, the University of Technology Sydney (UTS). Before joining UTS, he was a Data Mining Research Fellow at the University of Melbourne, and then an Industry-focused Machine Learning Researcher and Project Manager working in the transportation industry at National ICT Australia (NICTA). He obtained his PhD degree in Data Mining Research from the University of Sydney (USYD).

His research outputs are mostly published in journals and conferences ranked A* or A (i.e., top-prestige) by the ARC ERA 2010 ranking and the CORE academic ranking. He has received three Best Paper Awards.

Dr Liu is interested in industry-driven data analytics research that makes real-world impact. He has led a number of significant research projects funded by government agencies and industrial organisations, spanning the internet security, insurance, trading, transportation, and infrastructure sectors. He has developed advanced data mining models and software tools for the transport industry that accurately identify the causes of road incidents. He has also designed cutting-edge predictive models for problems including rare event prediction, fraud and intrusion detection, and emerging trend detection. Details of some of his research projects are listed below:

  • "Advanced Data Analytics Platforms without Data", industry partner: National ICT Australia; March 2016 – December 2018.
  • "Data Analytics Models for Stock Market Surveillance", industry partner: NASDAQ OMX; March 2016 – December 2018.
  • "Analytics Model to Support Strategic Planning in a Regulatory Environment", industry partner: NSW Fair Trading; April – July 2015.
  • "Transport Data Science and Advanced Analytics", industry partner: National ICT Australia; July 2015 – June 2017.
  • "Traffic Watch for Transport Control Service", industry partner: Transport Management Centre; May 2013 – June 2014.
  • "Congestion Propagation and Hotspot Detection in Sydney CBD", industry partner: NSW RMS; Aug – Dec 2013.
  • "Data Fusion Technologies for Comprehensive Transport Data Analysis in Melbourne", industry partner: VicRoads; Jun – Sep 2013.
  • "Time of Arrival Estimations using HD Vehicle Trajectories", industry partner: TomTom; Jan – Mar 2013.
  • "Early Detection of Road Traffic Incidents using Social Media", industry partner: Transport Management Centre; Oct – Dec 2012.
  • "Causal Inference for Sequential Traffic Congestion", industry partner: Microsoft Research Asia; Nov 2010 – Mar 2011.
  • "Abnormal Claim Detection from Worker's Compensations", industry partner: CGU Insurance; Mar 2010 – Jun 2011.
  • "Data Integration for Cross-Market Capital Trading Systems", industry partner: the SMARTS Group (since acquired by Nasdaq); Jun 2008 – Dec 2009.

Lecturer, A/DRsch Advanced Analytics Institute
Core Member, AAI - Advanced Analytics Institute
Member, Institute of Electrical and Electronics Engineers
Member, Association for Computing Machinery
 
Phone
+61 2 9514 3782

Research Interests

Main Research Interests:

  • Graph mining, dynamic network analysis, tensor factorization
  • Causal inference, Granger causality
  • Game theoretical modeling, adversarial learning 
  • Data imbalance, cost-sensitive learning
  • Anomaly (outlier) detection
Can supervise: Yes

Competitive PhD scholarships are available for prospective local and international research students.

Data Mining and Knowledge Discovery; Data Analytics.

Conferences

Chen, Q., Hu, L., Xu, J., Liu, W. & Cao, L. 2015, 'Document Similarity Analysis via Involving Both Explicit and Implicit Semantic Couplings', 2015 International Conference on Data Science and Advanced Analytics, Paris.
Jiang, X., Liu, W., Cao, L. & Long, G. 2015, 'Coupled Collaborative Filtering for Context-aware Recommendation', AAAI Publications, Twenty-Ninth AAAI Conference on Artificial Intelligence, Student Abstracts, AAAI 2015, AAAI, Austin, Texas, USA, pp. 4172-4173.
Context-aware features have been widely recognized as important factors in recommender systems. However, as a major technique in recommender systems, traditional Collaborative Filtering (CF) does not provide a straightforward way of integrating context-aware information into personal recommendation. We propose a Coupled Collaborative Filtering (CCF) model to measure the contextual information and use it to improve recommendations. In the proposed approach, coupled similarity is computed from inter-item, intra-context and inter-context interactions among item, user and context-aware factors. Experiments based on different types of CF models demonstrate the effectiveness of our design.
Shao, J., Yin, J., Liu, W. & Cao, L. 2015, 'Mining Actionable Combined Patterns of High Utility and Frequency', Proceedings of the IEEE International Conference on Data Science and Advanced Analytics, IEEE International Conference on Data Science and Advanced Analytics, IEEE, Paris, pp. 1-10.
In recent years, the importance of identifying actionable patterns has become increasingly recognized, so that decision-support actions can be inspired by the resultant patterns. A typical shift is towards identifying high utility rather than highly frequent patterns. Accordingly, High Utility Itemset (HUI) Mining methods have become quite popular as well as faster and more reliable than before. However, the current research focus has been on improving efficiency, while the coupling relationships between items are ignored. It is important to study the item and itemset couplings inbuilt in the data. For example, the utility of one itemset might be lower than a user-specified threshold until one additional itemset joins it; conversely, an item's utility might be high until another one joins it. In this way, even though some absolutely high utility itemsets can be discovered, it is often easy to find that many redundant itemsets sharing the same item are mined (e.g., if the utility of a diamond is high enough, all its supersets are proved to be HUIs). Such itemsets are not actionable, and sellers cannot make higher profit if marketing strategies are created on top of such findings. To this end, we introduce a new framework for mining actionable high utility association rules, called Combined Utility-Association Rules (CUAR), which aims to find high utility and strong association of itemset combinations incorporating item/itemset relations. The algorithm is shown to be efficient in experiments on both real and synthetic datasets.
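To make the utility notion in the abstract above concrete, the following is a minimal Python sketch of itemset utility over a toy transaction database. The transactions, unit profits, threshold, and function names are all invented for illustration. The utility definition used (quantity times unit profit, summed over the transactions that contain the whole itemset) is the standard HUI formulation; this is not the CUAR algorithm itself.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"diamond": 1, "box": 1},
    {"diamond": 1, "ribbon": 2},
    {"box": 3, "ribbon": 1},
    {"diamond": 1, "box": 1, "ribbon": 1},
]
# Hypothetical unit profits per item.
PROFIT = {"diamond": 100, "box": 2, "ribbon": 1}


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(min_utility, transactions=TRANSACTIONS, profit=PROFIT):
    """Brute-force enumeration; fine for toy data, exponential in general."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u >= min_utility:
                result[itemset] = u
    return result
```

With a threshold of 100 on this toy data, every itemset returned by `high_utility_itemsets(100)` contains `diamond`, which is the kind of redundancy the abstract describes.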
Luo, L., Liu, W., Koprinska, I. & Chen, F. 2015, 'Discovering causal structures from time series data via enhanced granger causality', AI 2015: Advances in Artificial Intelligence (LNCS), 28th Australasian Joint Conference on Artificial Intelligence, Springer, Canberra, Australia, pp. 365-378.
Granger causality has been applied to explore predictive causal relations among multiple time series in various fields. However, the existence of nonstationary distributional changes among the time series variables poses significant challenges. By analyzing a real dataset, we observe that factors such as noise, distribution changes and shifts increase the complexity of the modelling, and large errors often occur when the underlying distribution shifts with time. Motivated by this challenge, we propose a new regression model for causal structure discovery – a Linear Model with Weighted Distribution Shift (linear WDS), which improves the prediction accuracy of the Granger causality model by taking into account the weights of the distribution-shift samples and by optimizing a quadratic-mean based objective function. The linear WDS is integrated in the Granger causality model to improve the inference of the predictive causal structure. The performance of the enhanced Granger causality model is evaluated on synthetic datasets and real traffic datasets, and the proposed model is compared with three different regression-based Granger causality models (standard linear regression, robust regression and quadratic-mean-based regression). The results show that the enhanced Granger causality model outperforms the other models especially when there are distribution shifts in the data.
Shao, J., Yin, J., Liu, W. & Cao, L. 2015, 'Actionable Combined High Utility Itemset Mining', AAAI'15 Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence Pages, pp. 4206-4207.
Luo, L., Liu, W., Koprinska, I. & Chen, F. 2015, 'Discrimination-aware association rule mining for unbiased data analytics', Big Data Analytics and Knowledge Discovery: 17th International Conference, DaWaK 2015, Valencia, Spain, September 1-4, 2015, Proceedings, International Conference on Big Data Analytics and Knowledge Discovery, Springer International Publishing, Valencia, Spain, pp. 108-120.
A discriminatory dataset refers to a dataset with undesirable correlation between sensitive attributes and the class label, which often leads to biased decision making in data analytics processes. This paper investigates how to build discrimination-aware models even when the available training set is intrinsically discriminating based on some sensitive attributes, such as race, gender or personal status. We propose a new classification method called Discrimination-Aware Association Rule classifier (DAAR), which integrates a new discrimination-aware measure and an association rule mining algorithm. We evaluate the performance of DAAR on three real datasets from different domains and compare it with two non-discrimination-aware classifiers (a standard association rule classification algorithm and the state-of-the-art association rule algorithm SPARCCC), and also with a recently proposed discrimination-aware decision tree method. The results show that DAAR is able to effectively filter out the discriminatory rules and decrease the discrimination on all datasets with insignificant impact on the predictive accuracy.
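The "undesirable correlation between sensitive attributes and the class label" mentioned above can be illustrated with a simple score: the positive-label rate of the non-protected group minus that of the protected group. The sketch below is an assumption for illustration only; the abstract does not specify DAAR's own measure, and the records, keys, and function names are hypothetical.

```python
def positive_rate(records, label_key, positive_label):
    """Fraction of records carrying the positive class label."""
    return sum(r[label_key] == positive_label for r in records) / len(records)


def discrimination_score(records, sensitive_key, protected_value,
                         label_key="label", positive_label="approve"):
    """Positive rate of the non-protected group minus that of the protected group."""
    protected = [r for r in records if r[sensitive_key] == protected_value]
    others = [r for r in records if r[sensitive_key] != protected_value]
    if not protected or not others:
        raise ValueError("both groups must be non-empty")
    return (positive_rate(others, label_key, positive_label)
            - positive_rate(protected, label_key, positive_label))


# Hypothetical toy dataset: 1 of 4 "F" records and 3 of 4 "M" records are approved.
RECORDS = [
    {"gender": "F", "label": "approve"},
    {"gender": "F", "label": "reject"},
    {"gender": "F", "label": "reject"},
    {"gender": "F", "label": "reject"},
    {"gender": "M", "label": "approve"},
    {"gender": "M", "label": "approve"},
    {"gender": "M", "label": "approve"},
    {"gender": "M", "label": "reject"},
]
```

On these toy records, `discrimination_score(RECORDS, "gender", "F")` evaluates to 0.5; a score of 0 would mean both groups receive the positive label at the same rate.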
Khoa, N.L.D., Zhang, B., Wang, Y., Liu, W., Chen, F., Mustapha, S. & Runcie, P. 2015, 'On Damage Identification in Civil Structures Using Tensor Analysis', Advances in Knowledge Discovery and Data Mining: 19th Pacific-Asia Conference Proceedings, Part 1, Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Springer International Publishing, Ho Chi Minh City, Vietnam, pp. 459-471.
Wang, F., Liu, W. & Chawla, S. 2015, 'On Sparse Feature Attacks in Adversarial Learning', Proceedings - IEEE International Conference on Data Mining, ICDM, IEEE International Conference on Data Mining, IEEE, Shenzhen, China, pp. 1013-1018.
Adversarial learning is the study of machine learning techniques deployed in non-benign environments. Example applications include spam email detection, network intrusion detection and credit card scoring. In fact, as the gamut of application domains of machine learning grows, the possibility and opportunity for adversarial behavior will only increase. Until now, the standard assumption in modeling adversarial behavior has been to empower an adversary to change all features of the classifier at will. The adversary pays a cost proportional to the size of the 'attack'. We refer to this form of adversarial behavior as a dense feature attack. However, the aim of an adversary is not just to subvert a classifier but to carry out data transformation in such a way that spam continues to appear like spam to the user as much as possible. We demonstrate that an adversary achieves this objective by carrying out a sparse feature attack. We design an algorithm to show how a classifier should be designed to be robust against sparse adversarial attacks. Our main insight is that sparse feature attacks are best defended by designing classifiers which use l1 regularizers.
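As background on the l1 regularizers mentioned above: an l1 penalty is commonly optimized with a proximal gradient update whose core is soft-thresholding, which shrinks every weight towards zero and sets small weights exactly to zero. The Python sketch below shows only that generic update step. It is not the algorithm from the paper, and the function names and parameter values are hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the l1 norm: shrink towards zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_step(weights, gradient, learning_rate, l1_strength):
    """One proximal gradient update for loss(w) + l1_strength * ||w||_1.

    `gradient` is the gradient of the smooth loss at `weights`, supplied
    by the caller; this sketch does not assume any particular loss.
    """
    shrink = learning_rate * l1_strength
    return [
        soft_threshold(w - learning_rate * g, shrink)
        for w, g in zip(weights, gradient)
    ]
```

For instance, with a zero gradient, `proximal_step([1.0, 0.25, -2.0], [0.0, 0.0, 0.0], 0.5, 1.0)` shrinks each weight by 0.5 and zeroes out the small middle weight.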