Dr Long joined UTS in 2010 and obtained his PhD from UTS in 2014. Before joining UTS, he had more than six years of industry R&D experience.
He is currently leading a research group to conduct application-driven research on machine learning and data mining. Particularly, his research interests focus on several application domains, such as NLP, Healthcare, Smart Home, Education and Social Media.
He has assessed more than 20 ARC (Australian Research Council) proposals, including DP, LP and DECRA.
He serves as a reviewer for a few top AI conferences, e.g. IJCAI. He was the Job Match Chair for KDD 2015 and IJCAI 2017. The Job Match program aims to provide a face-to-face recruitment opportunity for conference attendees and sponsors.
Dr. Long has published 30+ papers in ERA Rank A conferences (e.g. AAAI, ICDM, and CIKM) and journals (e.g. TKDE, TKDD, TCYB, WWW and Pattern Recognition). He currently focuses on application-driven research that aims to develop innovative ideas inspired by industry partners' real requirements.
Dr. Long currently focuses on training his PhD students and delivering industry training.
Dong, W, Li, W, Long, G, Tao, Z, Li, J & Wang, K 2019, 'Electrical resistivity and mechanical properties of cementitious composite incorporating conductive rubber fibres', Smart Materials and Structures, vol. 28, no. 8.
Zhang, Q, Wu, J, Zhang, P, Long, G & Zhang, C 2019, 'Salient Subsequence Learning for Time Series Clustering', IEEE Transactions on Pattern Analysis and Machine Intelligence.
Time series has been a popular research topic over the past decade. Salient subsequences of time series that can benefit the learning task, e.g. classification or clustering, are called shapelets. Shapelet-based time series learning extracts these types of salient subsequences with highly informative features from a time series. Most existing methods for shapelet discovery must scan a large pool of candidate subsequences, which is a time-consuming process. A recent work by Grabocka et al. (KDD 2014) uses regression learning to discover shapelets in a time series; however, it only considers learning shapelets from labeled time series data. This paper proposes an Unsupervised Salient Subsequence Learning (USSL) model that discovers shapelets without the effort of labeling. We develop this new learning function by integrating the strengths of shapelet learning, shapelet regularization, spectral analysis and pseudo-labels to simultaneously and automatically learn shapelets that help cluster unlabeled time series better. The optimization model is iteratively solved via a coordinate descent algorithm. Experiments show that our USSL can learn meaningful shapelets, with promising results on real-world and synthetic data that surpass current state-of-the-art unsupervised time series learning methods.
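As a rough illustration of what a shapelet measures, the sketch below computes the commonly used shapelet-to-series distance: the smallest mean squared difference between the shapelet and any window of the same length in the series. It is not the USSL model itself; the function name `shapelet_distance` and the toy data are assumptions made for this example.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest mean squared distance between the shapelet and any
    same-length window of the series (hypothetical helper)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet over every window of length m.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = sum((w - s) ** 2 for w, s in zip(window, shapelet)) / m
        best = min(best, dist)
    return best


series = [0.0, 0.1, 1.0, 2.0, 1.0, 0.0]
bump = [1.0, 2.0, 1.0]
print(shapelet_distance(series, bump))  # 0.0, the bump occurs exactly in the series
```

A small distance means the series contains a close match to the shapelet, so a vector of distances to several shapelets can serve as a feature representation for clustering or classification.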
Copyright © 2018 Shaoxiong Ji et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Early detection and treatment are regarded as the most effective ways to prevent suicidal ideation and potential suicide attempts, two critical risk factors resulting in successful suicides. Online communication channels are becoming a new way for people to express their suicidal tendencies. This paper presents an approach to understand suicidal ideation through online user-generated content with the goal of early detection via supervised learning. Analysing users' language preferences and topic descriptions reveals rich knowledge that can be used as an early warning system for detecting suicidal tendencies. Suicidal individuals express strong negative feelings, anxiety, and hopelessness. Suicidal thoughts may involve family and friends, and the topics they discuss cover both personal and social issues. To detect suicidal ideation, we extract several informative sets of features, including statistical, syntactic, linguistic, word embedding, and topic features, and we compare six classifiers, including four traditional supervised classifiers and two neural network models. An experimental study demonstrates the feasibility and practicability of the approach and provides benchmarks for suicidal ideation detection on two active online platforms: Reddit SuicideWatch and Twitter.
Jiang, X, Pan, S, Long, G, Xiong, F, Jiang, J & Zhang, C 2018, 'Cost-sensitive parallel learning framework for insurance intelligence operation', IEEE Transactions on Industrial Electronics.
Zhang, Q, Wu, J, Zhang, Q, Zhang, P, Long, G & Zhang, C 2018, 'Dual influence embedded social recommendation', World Wide Web, vol. 21, no. 4, pp. 849-874.
© 2017 Springer Science+Business Media, LLC. Recommender systems are designed to solve the information overload problem and have been widely studied for many years. Conventional recommender systems tend to take ratings of users on products into account. With the development of Web 2.0, Rating Networks in many online communities (e.g. Netflix and Douban) allow users not only to co-comment or co-rate their interests (e.g. movies and books), but also to build explicit social networks. Recent recommendation models use various social data, such as observable links, but models that incorporate this explicit social information normally adopt similarity measures (e.g. cosine similarity) to evaluate the explicit relationships in the network; they do not consider the latent and implicit relationships in the network, such as social influence. A target user's purchase behavior or interest, for instance, is not always determined by their directly connected relationships and may be significantly influenced by the high reputation of people they do not know in the network, or others who have expertise in specific domains (e.g. famous social communities). In this paper, based on the above observations, we first simulate the social influence diffusion in the network to find the global and local influence nodes and then embed this dual influence data into a traditional recommendation model to improve accuracy. Mathematically, we formulate the global and local influence data as new dual social influence regularization terms and embed them into a matrix factorization-based recommendation model. Experiments on real-world datasets demonstrate the effective performance of the proposed method.
Pan, S, Wu, J, Zhu, X, Long, G & Zhang, C 2017, 'Boosting for graph classification with universum', Knowledge and Information Systems, vol. 50, no. 1, pp. 53-77.
© 2016 Springer-Verlag London. Recent years have witnessed extensive studies of graph classification due to the rapid increase in applications involving structural data and complex relationships. To support graph classification, all existing methods require that training graphs should be relevant (or belong) to the target class, but cannot integrate graphs irrelevant to the class of interest into the learning process. In this paper, we study a new universum graph classification framework which leverages additional 'non-example' graphs to help improve the graph classification accuracy. We argue that although universum graphs do not belong to the target class, they may contain meaningful structure patterns to help enrich the feature space for graph representation and classification. To support universum graph classification, we propose a mathematical programming algorithm, ugBoost, which integrates discriminative subgraph selection and margin maximization into a unified framework to fully exploit the universum. Because informative subgraph exploration in a universum setting requires the search of a large space, we derive an upper bound discriminative score for each subgraph and employ a branch-and-bound scheme to prune the search space. By using the explored subgraphs, our graph classification model aims to maximize the margin between positive and negative graphs and minimize the loss on the universum graph examples simultaneously. The subgraph exploration and the learning are integrated and performed iteratively so that each can be beneficial to the other. Experimental results and comparisons on real-world datasets demonstrate the performance of our algorithm.
Pan, S, Wu, J, Zhu, X, Long, G & Zhang, C 2017, 'Task Sensitive Feature Exploration and Learning for Multitask Graph Classification', IEEE Transactions on Cybernetics, vol. 47, no. 3, pp. 744-758.
Multitask learning (MTL) is commonly used for jointly optimizing multiple learning tasks. To date, all existing MTL methods have been designed for tasks with feature-vector represented instances, but cannot be applied to structured data, such as graphs. More importantly, when carrying out MTL, existing methods mainly focus on exploring overall commonality or disparity between tasks for learning, but cannot explicitly capture task relationships in the feature space, so they are unable to answer important questions, such as what exactly is shared between tasks and what makes one task unique from the others. In this paper, we formulate a new multitask graph learning problem, and propose a task sensitive feature exploration and learning algorithm for multitask graph classification. Because graphs do not have features available, we advocate a task sensitive feature exploration and learning paradigm to jointly discover discriminative subgraph features across different tasks. In addition, a feature learning process is carried out to categorize each subgraph feature into one of three categories: 1) common feature; 2) task auxiliary feature; and 3) task specific feature, indicating whether the feature is shared by all tasks, by a subset of tasks, or by only one specific task, respectively. The feature learning and the multiple task learning are iteratively optimized to form a multitask graph classification model with a global optimization goal. Experiments on real-world functional brain analysis and chemical compound categorization demonstrate the algorithm's performance. Results confirm that our method can be used to explicitly capture task correlations and uniqueness in the feature space, and explicitly answer what is shared between tasks and what is unique to a specific task.
Wang, S, Li, X, Chang, X, Yao, L, Sheng, QZ & Long, G 2017, 'Learning multiple diagnosis codes for ICU patients with local disease correlation mining', ACM Transactions on Knowledge Discovery from Data, vol. 11, no. 3, pp. 1-21.
© 2017 ACM. In the era of big data, a mechanism that can automatically annotate disease codes to patients' records in the medical information system is in demand. The purpose of this work is to propose a framework that automatically annotates the disease labels of multi-source patient data in Intensive Care Units (ICUs). We extract features from two main sources, medical charts and notes. The Bag-of-Words model is used to encode the features. Unlike most of the existing multi-label learning algorithms that globally consider correlations between diseases, our model learns disease correlation locally in the patient data. To achieve this, we derive a local disease correlation representation to enrich the discriminant power of each patient data. This representation is embedded into a unified multi-label learning framework. We develop an alternating algorithm to iteratively optimize the objective function. Extensive experiments have been conducted on a real-world ICU database. We have compared our algorithm with representative multi-label learning algorithms. Evaluation results have shown that our proposed method has state-of-the-art performance in the annotation of multiple diagnostic codes for ICU patients. This study suggests that problems in the automated diagnosis code annotation can be reliably addressed by using a multi-label learning model that exploits disease correlation. The findings of this study will greatly benefit health care and management in ICU considering that the automated diagnosis code annotation can significantly improve the quality and management of health care for both patients and caregivers.
Zhang, Q, Wu, J, Zhang, P, Long, G & Zhang, C 2017, 'Collective Hyping Detection System for Identifying Online Spam Activities', IEEE Intelligent Systems, vol. 32, no. 5, pp. 53-63.
Online reviews are extensively utilized by potential buyers to make business decisions. Unfortunately, fraudsters offer to write spam reviews for product promotion or competitor defamation, which drives online business holders to adopt this type of vicious strategy to increase their profits. These fake reviews mislead users who shop online. Though existing anti-spam strategies have proved effective in detecting traditional spam activities, evolving spam schemes can successfully cheat conventional testing by buying the comments of a massive number of random but genuine users, which are sold by specific web markets, i.e., User Cloud. A more crucial problem is that such spam activities turn into a kind of 'advertising campaign' among business owners as they need to maintain their rank in the top few positions. In this paper, we propose a new Collaborative Marketing Hyping Detection solution, which aims to identify spam comments generated by the Spam Reviewer Cloud and to detect products which adopt an evolving spam strategy for promotion. Our experiments validate the existence of the Collaborative Marketing Hyping activities on a real-life e-commerce platform and also demonstrate that our model can effectively and accurately identify these advanced spam activities.
© 2014 Elsevier B.V. Feature selection improves the quality of the model by filtering out the noisy or redundant part. In the unsupervised scenarios, the selection is challenging due to the unavailability of the labels. To overcome that, the graphs which can unfold the geometry structure on the manifold are usually used to regularize the selection process. These graphs can be constructed either in the local view or the global view. As the local graph is more discriminative, previous methods tended to use the local graph rather than the global graph. But the global graph also has useful information. In light of this, in this paper, we propose a multiple graph unsupervised feature selection method to leverage the information from both local and global graphs. Besides that, we enforce the l2,p norm to achieve more flexible sparse learning. The experiments which inspect the effects of multiple graph and l2,p norm are conducted respectively on various datasets, and the comparisons to other mainstream methods are also presented in this paper. The results support that the multiple graph could be better than the single graph in the unsupervised feature selection, and the overall performance of the proposed method is higher than the other comparisons.
Wang, S, Chang, X, Li, X, Long, G, Yao, L & Sheng, Q 2016, 'Diagnosis Code Assignment Using Sparsity-based Disease Correlation Embedding', IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 12, pp. 3191-3202.
With the latest developments in database technologies, it has become easier to store the medical records of hospital patients from their first day of admission than was previously possible. In Intensive Care Units (ICUs), modern medical information systems can record patient events in relational databases every second. Knowledge mining from these huge volumes of medical data is beneficial to both caregivers and patients. Given a set of electronic patient records, a system that effectively assigns the disease labels can facilitate medical database management and also benefit other researchers, e.g. pathologists. In this paper, we propose a framework to achieve that goal. Medical chart and note data of a patient are used to extract distinctive features. To encode patient features, we apply a Bag-of-Words encoding method for both chart and note data. We also propose a model that takes into account both global information and local correlations between diseases. Correlated diseases are characterized by a graph structure that is embedded in our sparsity-based framework. Our algorithm captures the disease relevance when labeling disease codes rather than making individual decisions with respect to a specific disease. At the same time, the global optimal values are guaranteed by our proposed convex objective function. Extensive experiments have been conducted on a real-world large-scale ICU database. The evaluation results demonstrate that our method improves multi-label classification results by successfully incorporating disease correlations.
Wang, S, Pan, P, Long, G, Chen, W, Li, X & Sheng, QZ 2016, 'Compact representation for large-scale unconstrained video analysis', World Wide Web, vol. 19, no. 2, pp. 231-246.
Recently, newly invented features (e.g. Fisher vector, VLAD) have achieved state-of-the-art performance in large-scale video analysis systems that aim to understand the contents of videos, such as concept recognition and event detection. However, these features are high-dimensional representations, which remarkably increases computation costs and correspondingly deteriorates the performance of subsequent learning tasks. Notably, the situation becomes even worse when dealing with large-scale video data where the number of class labels is limited. To address this problem, we propose a novel algorithm to compactly represent huge amounts of unconstrained video data. Specifically, redundant feature dimensions are removed by using our proposed feature selection algorithm. Considering that unlabeled videos are easy to obtain on the web, we apply this feature selection algorithm in a semi-supervised framework to cope with a shortage of class information. Unlike most existing semi-supervised feature selection algorithms, our proposed algorithm does not rely on manifold approximation, i.e. graph Laplacian, which is quite expensive for large amounts of data. Thus, it is possible to apply the proposed algorithm to a real large-scale video analysis system. Besides, due to the difficulty of solving the non-smooth objective function, we develop an efficient iterative approach to seeking the global optimum. Extensive experiments are conducted on several real-world video datasets, including KTH, CCV, and HMDB. The experimental results demonstrate the effectiveness of the proposed algorithm.
Zhang, P, He, J, Long, G, Huang, G & Zhang, C 2016, 'Towards anomalous diffusion sources detection in a large network', ACM Transactions on Internet Technology, vol. 16, no. 1.
© 2016 ACM. Witnessing the wide spread of malicious information in large networks, we develop an efficient method to detect anomalous diffusion sources and thus protect networks from security and privacy attacks. To date, most existing work on diffusion source detection is based on the assumption that network snapshots that reflect information diffusion can be obtained continuously. However, obtaining snapshots of an entire network requires deploying detectors on all network nodes and thus is very expensive. Alternatively, in this article, we study the diffusion source locating problem by learning from information diffusion data collected from only a small subset of network nodes. Specifically, we present a new regression learning model that can detect anomalous diffusion sources by jointly solving five challenges, that is, unknown number of source nodes, few activated detectors, unknown initial propagation time, uncertain propagation path and uncertain propagation time delay. We theoretically analyze the strength of the model and derive performance bounds. We empirically test and compare the model using both synthetic and real-world networks to demonstrate its performance.
Zhang, Q, Zhang, P, Long, G, Ding, W, Zhang, C & Wu, X 2016, 'Online learning from trapezoidal data streams', IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 10, pp. 2709-2723.
© 1989-2012 IEEE. In this paper, we study a new problem of continuous learning from doubly-streaming data where both data volume and feature space increase over time. We refer to the doubly-streaming data as trapezoidal data streams and the corresponding learning problem as online learning from trapezoidal data streams. The problem is challenging because both data volume and data dimension increase over time, and existing online learning, online feature selection, and streaming feature selection algorithms are inapplicable. We propose a new Online Learning with Streaming Features algorithm (OLSF for short) and its two variants, which combine online learning and streaming feature selection to enable learning from trapezoidal data streams with infinite training instances and features. When a new training instance carrying new features arrives, a classifier updates the existing features by following the passive-aggressive update rule and updates the new features by following the structural risk minimization principle. Feature sparsity is then introduced by using the projected truncation technique. We derive performance bounds of the OLSF algorithm and its variants. We also conduct experiments on real-world data sets to show the performance of the proposed algorithms.
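To make the passive-aggressive update rule mentioned in this abstract concrete, here is a minimal sketch of the standard hinge-loss passive-aggressive step for binary labels in {-1, +1}. It is not the paper's algorithm: handling new features by zero-padding the weight vector is a simplifying assumption of this example, the structural risk minimization and projected truncation steps are omitted, and the name `pa_update` is hypothetical.

```python
def pa_update(weights, x, y):
    """One standard passive-aggressive step (hypothetical helper).

    weights: current weight list; x: feature list, at least as long as
    weights; y: label in {-1, +1}. New trailing features start at weight 0
    (an assumption of this sketch, not the paper's rule).
    """
    if len(x) < len(weights):
        raise ValueError("feature space is assumed to only grow over time")
    w = list(weights) + [0.0] * (len(x) - len(weights))
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)           # hinge loss
    norm_sq = sum(xi * xi for xi in x)
    if loss == 0.0 or norm_sq == 0.0:
        return w                             # passive: margin already satisfied
    tau = loss / norm_sq                     # aggressive: smallest step reaching margin 1
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


w = pa_update([], [1.0, 0.0], 1)         # first instance has two features
print(w)                                 # [1.0, 0.0]
w = pa_update(w, [1.0, 0.0, 2.0], -1)    # next instance carries a third feature
print(w)
```

After each aggressive step the updated weights classify the current instance with a margin of exactly 1, which is the defining property of the basic passive-aggressive rule.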
Chen, W & Long, G 2019, 'Deep learning for healthcare data processing', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 600-601.
© Springer Nature Switzerland AG 2019. Deep learning techniques have revolutionized many fields, including computer vision, natural language processing and speech recognition, and are fundamentally changing the healthcare industry. Various types of data have been emerging in modern healthcare research, including electronic health records (EHR), clinical imaging, and continuous monitoring data, which are noisy, complex, high-dimensional, multi-modal, poorly annotated and generally unstructured. Healthcare applications pose many significantly different challenges to existing deep learning models, for instance, interpretation of predictions, missing values, and privacy preservation. In this tutorial, we will discuss the challenges and solutions to the problems in healthcare applications, as well as data sets and demos.
Ji, S, Long, G, Pan, S, Zhu, T, Jiang, J & Wang, S 2019, 'Detecting Suicidal Ideation with Data Protection in Online Communities', Database Systems for Advanced Applications, 24th Int Conference on Database Systems for Advanced Applications / 6th Int Workshop on Big Data Management and Service / 4th Int Workshop on Big Data Quality Management / 3rd Int Workshop on Graph Data Management and Analysis, Springer International Publishing AG, Chiang Mai, Thailand, pp. 225-229.
Ji, S, Pan, S, Long, G, Li, X, Jiang, J & Huang, Z 2019, 'Learning Private Neural Language Modeling with Attentive Aggregation', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Liu, L, Zhou, T, Long, G, Yao, L, Jiang, J & Zhang, C 2019, 'Prototype Propagation Networks (PPN) for Weakly-supervised Few-shot Learning on Category Graph', The 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China.
Peng, X, Long, G, Pan, S, Jiang, J & Niu, Z 2019, 'Attentive Dual Embedding for Understanding Medical Concept in Electronic Health Record', The 2019 International Joint Conference on Neural Networks, Budapest, Hungary.
Electronic health records contain a wealth of information on a patient's healthcare over many visits, such as diagnoses, treatments, drugs administered, and so on. The untapped potential of these data in healthcare analytics is vast. However, given that much of medical information is a cause and effect science, new embedding methods are required to ensure the learned representations reflect the comprehensive interplay between medical concepts and their relationships over time. Unlike one-hot encoding, a distributed representation should preserve these complex interactions as high-quality inputs for machine learning-based healthcare analytics tasks. Therefore, we propose a novel attentive dual embedding method called MC2Vec. MC2Vec captures the proximity relationships between medical concepts through a two-step optimization framework that recursively refines the embedding for superior output. The framework comprises a Skip-gram model to generate the initial embedding and an attentive CBOW model to fine-tune the embedding with temporal information gleaned from sequences of patient visits. Experiments with two public datasets demonstrate that MC2Vec produces embeddings of higher quality than five state-of-the-art methods.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 2019, 'Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together', Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, USA.
Wang, C, Pan, S, Hu, R, Long, G, Jiang, J & Zhang, C 2019, 'Attributed Graph Clustering: A Deep Attentional Embedding Approach', The 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China.
Chen, F, Pan, S, Jiang, J, Huo, H & Long, G 2019, 'DAGCN: Dual Attention Graph Convolutional Networks', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Wu, D, Chen, J, Sharma, N, Pan, S, Long, G & Blumenstein, M 2019, 'Adversarial Action Data Augmentation for Similar Gesture Action Recognition', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Boroujeni, FR, Wang, S, Li, Z, West, N, Stantic, B, Yao, L & Long, G 2018, 'Trace Ratio Optimization with Feature Correlation Mining for Multiclass Discriminant Analysis', Thirty-Second AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 2746-2753.
Fisher's linear discriminant analysis is a widely accepted dimensionality reduction method, which aims to find a transformation matrix to convert feature space to a smaller space by maximising the between-class scatter matrix while minimising the within-class scatter matrix. Although the fast and easy process of finding the transformation matrix has made this method attractive, overemphasizing the large class distances makes the criterion of this method suboptimal. In this case, the close class pairs tend to overlap in the subspace. Although different weighting methods have been developed to overcome this problem, there is still room for improvement. In this work, we study a weighted trace ratio by maximising the harmonic mean of the multiple objective reciprocals. To further improve the performance, we enforce the ℓ2,1-norm on the developed objective function. Additionally, we propose an iterative algorithm to optimise this objective function. The proposed method avoids the domination problem of the largest objective, and guarantees that no objectives will be too small. This method can be more beneficial if the number of classes is large. Extensive experiments on different datasets show the effectiveness of our proposed method when compared with four state-of-the-art methods.
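The classical Fisher criterion that this abstract starts from can be sketched for a single projection direction: project the points onto the direction, then divide the between-class scatter by the within-class scatter. This is only the textbook ratio, not the weighted trace ratio or the iterative algorithm proposed in the paper; the function name `fisher_ratio` and the toy points are assumptions of this example.

```python
import math


def fisher_ratio(points, labels, direction):
    """Between-class over within-class scatter of the points after
    projecting them onto one direction (hypothetical helper)."""
    projected = [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]
    overall_mean = sum(projected) / len(projected)
    groups = {}
    for value, label in zip(projected, labels):
        groups.setdefault(label, []).append(value)
    between = 0.0
    within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        between += len(values) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in values)
    if within == 0.0:
        # Degenerate case: no spread inside any class along this direction.
        return math.inf if between > 0.0 else 0.0
    return between / within


points = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
print(fisher_ratio(points, labels, (1.0, 0.0)))  # 16.0, classes separate along x
print(fisher_ratio(points, labels, (0.0, 1.0)))  # 0.0, classes overlap along y
```

A direction with a larger ratio keeps the classes further apart relative to their internal spread, which is what the transformation matrix in linear discriminant analysis is chosen to achieve.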
Chen, W, Wang, S, Long, G, Yao, L, Sheng, QZ & Li, X 2018, 'Dynamic Illness Severity Prediction via Multi-task RNNs for Intensive Care Unit', 2018 IEEE International Conference on Data Mining (ICDM), International Conference on Data Mining, IEEE, Singapore, Singapore.
Jiang, X, Pan, S, Jiang, J & Long, G 2018, 'Cross-domain deep learning approach for multiple financial market prediction', 2018 International Joint Conference on Neural Networks (IJCNN), International Joint Conference on Neural Networks, IEEE, Rio de Janeiro, Brazil, pp. 1-8.
Over recent decades, globalization has resulted in a steady increase in cross-border financial flows around the world. To build an abstract representation of a real-world financial market situation, we structure the fundamental influences among homogeneous and heterogeneous markets with three types of correlations: the inner-domain correlation between homogeneous markets in various countries, the cross-domain correlation between heterogeneous markets, and the time-series correlation between current and past markets. Such types of correlations in global finance challenge traditional machine learning approaches due to model complexity and nonlinearity. In this paper, we propose a novel cross-domain deep learning approach (Cd-DLA) to learn real-world complex correlations for multiple financial market prediction. Based on recurrent neural networks, which capture the time-series interactions in financial data, our model utilizes the attention mechanism to analyze the inner-domain and cross-domain correlations, and then aggregates all of them for financial forecasting. Experimental results on ten years of financial data on currency and stock markets from three countries demonstrate that our approach outperforms other baselines.
Jiang, X, Pan, S, Long, G, Chang, J, Jiang, J & Zhang, C 2018, 'Cost-sensitive hybrid neural networks for heterogeneous and imbalanced data', 2018 International Joint Conference on Neural Networks (IJCNN), International Joint Conference on Neural Networks, IEEE, Rio de Janeiro, Brazil, pp. 1-8.
Analyzing accumulated data has recently attracted huge attention for its ability to generate value by identifying useful information and providing an edge in global business competition. However, heterogeneous data and imbalanced class distribution present two major challenges to machine learning with real-world business data. Traditional machine learning algorithms can typically only be applied to standard data sets, which are normally homogeneous and balanced. These algorithms narrow complex data into a homogeneous, balanced data space, an inefficient process that requires a significant amount of pre-processing. In this paper, we focus on an efficient solution to the challenges with heterogeneous and imbalanced data sets that does not require pre-processing. Our approach comprises a novel, unified, end-to-end cost-sensitive hybrid neural network that learns real-world heterogeneous data via a parallel network architecture. A specifically-designed cost-sensitive matrix then automatically generates a robust model for learning minority classifications. The parameters of both the cost-sensitive matrix and the hybrid neural network are alternately but jointly optimized during training. The results of comparative experiments on six real-world data sets reflecting actual business cases, including insurance fraud detection and mobile customer demographics, indicate that the proposed approach demonstrates superior performance over baseline procedures.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 2018, 'Bi-directional block self-attention for fast and memory-efficient sequence modeling', International Conference on Learning Representations, Vancouver, Canada.
Shen, T, Zhou, T, Long, G, Jiang, J, Pan, S & Zhang, C 2018, 'DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding', Proceedings of the 32nd AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 5446-5455.
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)," is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.
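As a rough illustration of the multi-dimensional (feature-wise) attention that compresses a sequence into a single vector, the NumPy sketch below computes a separate softmax over tokens for every feature dimension. The function name, the tanh scoring network, and the parameters `W1`, `b1`, `W2`, `b2` are assumptions made for this example. It omits the directional masks and is not the DiSAN implementation.

```python
import numpy as np

def multi_dim_attention(X, W1, b1, W2, b2):
    """Compress token vectors X of shape (n, d) into one vector of shape (d,).

    Every feature dimension gets its own attention distribution over the
    n tokens, instead of one scalar weight per token.
    """
    hidden = np.tanh(X @ W1 + b1)            # (n, d) hidden scores
    scores = hidden @ W2 + b2                # (n, d) one score per token and feature
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)        # softmax over tokens
    return (weights * X).sum(axis=0), weights

# Assumed toy input: 4 tokens with 3 features each, random parameters.
rng = np.random.default_rng(0)
n, d = 4, 3
X = rng.normal(size=(n, d))
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b1, b2 = np.zeros(d), np.zeros(d)
sentence_vec, attn = multi_dim_attention(X, W1, b1, W2, b2)
```

Each column of `attn` sums to one, so every feature of the output is a convex combination of that feature across the tokens.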
Shen, T, Zhou, T, Long, G, Jiang, J, Wang, S & Zhang, C 2018, 'Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling', IJCAI 2018, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 4345-4352.
Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both the Stanford Natural Language Inference (SNLI) and the Sentences Involving Compositional Knowledge (SICK) datasets.
Zhang, S, Yao, L, Sun, A, Wang, S, Long, G & Dong, M 2018, 'NeuRec: On nonlinear transformation for personalized ranking', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 3669-3675.
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved. Modeling user-item interaction patterns is an important task for personalized recommendations. Many recommender systems are based on the assumption that there exists a linear relationship between users and items while neglecting the intricacy and non-linearity of real-life historical interactions. In this paper, we propose a neural network based recommendation model (NeuRec) that untangles the complexity of user-item interactions and establishes an integrated network to combine non-linear transformation with latent factors. We further design two variants of NeuRec, user-based NeuRec and item-based NeuRec, by focusing on different aspects of the interaction matrix. Extensive experiments on four real-world datasets demonstrate their superior performance on the personalized ranking task.
Zhang, X, Yao, L, Huang, C, Wang, S, Tan, M, Long, G & Wang, C 2018, 'Multi-modality sensor data classification with selective attention', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 3111-3117.
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved. Multimodal wearable sensor data classification plays an important role in ubiquitous computing and has a wide range of applications in scenarios from healthcare to entertainment. However, most existing work in this field employs domain-specific approaches and is thus ineffective in complex situations where multi-modality sensor data are collected. Moreover, wearable sensor data are less informative than conventional data such as texts or images. In this paper, to improve the adaptability of such classification methods across different application domains, we turn this classification task into a game and apply a deep reinforcement learning scheme to deal with complex situations dynamically. Additionally, we introduce a selective attention mechanism into the reinforcement learning scheme to focus on the crucial dimensions of the data. This mechanism helps to capture extra information from the signal and is thus able to significantly improve the discriminative power of the classifier. We carry out several experiments on three wearable sensor datasets and demonstrate the competitive performance of the proposed approach compared to several state-of-the-art baselines.
Pan, S, Hu, R, Long, G, Jiang, J, Yao, L & Zhang, C 2018, 'Adversarially Regularized Graph Autoencoder for Graph Embedding', IJCAI 2018, International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence, Stockholm, Sweden, pp. 2609-2615.
Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
Hu, R, Yu, CP, Fung, SF, Pan, S, Wang, H & Long, G 2017, 'Universal network representation for heterogeneous information networks', Proceedings of the International Joint Conference on Neural Networks, 2017 International Joint Conference on Neural Networks, IEEE, Anchorage, AK, USA, pp. 388-395.
© 2017 IEEE. Network representation aims to represent the nodes in a network as continuous and compact vectors, and has attracted much attention in recent years due to its ability to capture complex structural relationships inside networks. However, existing network representation methods are commonly designed for homogeneous information networks where all the nodes (entities) of a network are of the same type, e.g., papers in a citation network. In this paper, we propose a universal network representation approach (UNRA) that represents different types of nodes in heterogeneous information networks in a continuous and common vector space. The UNRA is built on our latest mutually updated neural language module, which simultaneously captures the inter-relationships among homogeneous nodes and the node-content correlation. Relationships between different types of nodes are also assembled and learned in a unified framework. Experiments validate that the UNRA achieves outstanding performance, compared to six other state-of-the-art algorithms, in node representation, node classification, and network visualization. In node classification, the UNRA achieves a 3% to 132% performance improvement in terms of accuracy.
Wang, C, Pan, S, Long, G, Zhu, X & Jiang, J 2017, 'MGAE: Marginalized graph autoencoder for graph clustering', Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, ACM, pp. 889-898.
Hu, R, Pan, S, Jiang, J & Long, G 2017, 'Graph Ladder Networks for Network Classification', CIKM 2017: ACM International Conference on Information and Knowledge Management, ACM, pp. 2103-2106.
Bai, Y, Wang, H, Wu, J, Zhang, Y, Jiang, J & Long, G 2016, 'Evolutionary lazy learning for Naive Bayes classification', 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3124-3129.
Chang, X, Yang, Y, Long, G, Zhang, C & Hauptmann, AG 2016, 'Dynamic concept composition for zero-example event detection', Proceedings of 30th AAAI Conference on Artificial Intelligence, AAAI 2016, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Phoenix, Arizona, United States, pp. 3464-3470.
© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars. In principle, zero-shot learning makes it possible to train an event detection model based on the assumption that events (e.g. birthday party) can be described by multiple mid-level semantic concepts (e.g. "blowing candle", "birthday cake"). Towards this goal, we first pre-train a bundle of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept w.r.t. the event of interest and pick the relevant concept classifiers, which are applied to all test videos to get multiple prediction score vectors. While most existing systems combine the predictions of the concept classifiers with fixed weights, we propose to learn the optimal weights of the concept classifiers for each testing video by exploring a set of online available videos with free-form text descriptions of their content. To validate the effectiveness of the proposed approach, we have conducted extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets. The experimental results confirm the superiority of the proposed approach.
Yan, Y, Xu, Z, Tsang, W, Long, G & Yang, Y 2016, 'Robust Semi-supervised Learning through Label Aggregation', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, USA, pp. 2244-2250.
Semi-supervised learning is proposed to exploit both labeled and unlabeled data. However, as the scale of data in real world applications increases significantly, conventional semi-supervised algorithms usually lead to massive computational cost and cannot be applied to large scale datasets. In addition, label noise is usually present in the practical applications due to human annotation, which very likely results in remarkable degeneration of performance in semi-supervised methods. To address these two challenges, in this paper, we propose an efficient RObust Semi-Supervised Ensemble Learning (ROSSEL) method, which generates pseudo-labels for unlabeled data using a set of weak annotators, and combines them to approximate the ground-truth labels to assist semi-supervised learning. We formulate the weighted combination process as a multiple label kernel learning (MLKL) problem which can be solved efficiently. Compared with other semi-supervised learning algorithms, the proposed method has linear time complexity. Extensive experiments on five benchmark datasets demonstrate the superior effectiveness, efficiency and robustness of the proposed algorithm.
Zhang, Q, Wu, J, Yang, H, Lu, W, Long, G & Zhang, C 2016, 'Global and local influence-based social recommendation', International Conference on Information and Knowledge Management, Proceedings, ACM International Conference on Information and Knowledge Management, ACM, Indianapolis, USA, pp. 1917-1920.
© 2016 ACM. Social recommendation has been widely studied in recent years. Existing social recommendation models use various explicit pieces of social information as regularization terms, e.g., social links are considered as new constraints. However, social influence, an implicit source of information in social networks, is seldom considered, even though it often drives recommendations in social networks. In this paper, we introduce a new global and local influence-based social recommendation model. Based on the observation that user purchase behaviour is influenced by both global influential nodes and the local influential nodes of the user, we formulate the global and local influence as regularization terms, and incorporate them into a matrix factorization-based recommendation model. Experimental results on large data sets demonstrate the performance of the proposed method.
Zhang, Q, Wu, J, Zhang, P, Long, G, Tsang, IW & Zhang, C 2016, 'Inferring latent network from cascade data for dynamic social recommendation', Proceedings - IEEE International Conference on Data Mining, ICDM, IEEE International Conference on Data Mining, IEEE, Barcelona, Spain, pp. 669-678.
© 2016 IEEE. Social recommendation explores social information to improve the quality of a recommender system. It can be further divided into explicit and implicit social network recommendation. The former assumes the existence of explicit social connections between users in addition to the rating data. The latter assumes the availability of only the ratings but not the social connections between users, since explicit social information may not necessarily be available and usually consists of binary decision values (e.g., whether two people are friends), while the strength of the relationships is missing. Most works in this field use only rating data to infer the latent social networks. They ignore the dynamic nature of users, whose preferences drift distinctly over time. To this end, we propose a new Implicit Dynamic Social Recommendation (IDSR) model, which infers a latent social network from cascade data. It can sufficiently mine the temporal information contained in the cascade data and identify dynamic changes in users over time by using the latest updated social network to make recommendations. Experiments and comparisons on three real-world datasets show that the proposed model outperforms the state-of-the-art solutions in both explicit and implicit scenarios.
Zhang, Q, Zhang, P, Long, G, Ding, W, Zhang, C & Wu, X 2015, 'Towards mining trapezoidal data streams', Proceedings - IEEE International Conference on Data Mining, ICDM, IEEE International Conference on Data Mining, IEEE, Atlantic City, New Jersey, United States, pp. 1111-1116.
© 2015 IEEE. We study a new problem of learning from doubly-streaming data where both data volume and feature space increase over time. We refer to the problem as mining trapezoidal data streams. The problem is challenging because both data volume and feature space are increasing, to which existing online learning, online feature selection and streaming feature selection algorithms are inapplicable. We propose a new Sparse Trapezoidal Streaming Data mining algorithm (STSD) and its two variants, which combine online learning and online feature selection to enable learning trapezoidal data streams with infinite training instances and features. Specifically, when new training instances carrying new features arrive, the classifier updates the existing features by following the passive-aggressive update rule used in online learning and updates the new features with the structural risk minimization principle. Feature sparsity is also introduced using the projected truncation techniques. Extensive experiments on UCI data sets demonstrate the performance of the proposed algorithms.
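The update described above can be sketched roughly as follows. This is a simplified illustration, not the STSD algorithm from the paper: it zero-pads the weight vector when an instance arrives with new features, applies the standard passive-aggressive step to the whole vector, and then keeps only the largest-magnitude weights as a stand-in for the projected truncation. The function name and the `keep` parameter are assumptions.

```python
import numpy as np

def pa_trapezoidal_step(w, x, y, keep=None):
    """One passive-aggressive step where x may carry more features than w.

    w    : current weights (length <= len(x))
    x    : new instance; trailing entries are newly arrived features
    y    : label in {-1, +1}
    keep : if given, number of largest-magnitude weights to retain
    """
    w = np.array(w, dtype=float)   # copy so the caller's array is untouched
    x = np.asarray(x, dtype=float)
    if x.size < w.size:
        raise ValueError("feature space is assumed to grow, never shrink")
    # Newly arrived features start with zero weight.
    w = np.concatenate([w, np.zeros(x.size - w.size)])
    loss = max(0.0, 1.0 - y * float(w @ x))   # hinge loss
    norm_sq = float(x @ x)
    if loss > 0.0 and norm_sq > 0.0:
        tau = loss / norm_sq                  # PA step size
        w = w + tau * y * x
    if keep is not None and keep < w.size:
        # Crude sparsification: zero all but the `keep` largest weights.
        smallest = np.argsort(np.abs(w))[: w.size - keep]
        w[smallest] = 0.0
    return w

# Assumed toy stream: the second instance introduces a third feature.
w = pa_trapezoidal_step(np.zeros(2), [1.0, 0.0], +1)
w = pa_trapezoidal_step(w, [1.0, 0.0, 2.0], -1)
w_sparse = pa_trapezoidal_step([1.0, 0.0], [1.0, 0.0, 2.0], -1, keep=1)
```

Because padded weights start at zero, the passive-aggressive step sets each new feature's weight to `tau * y * x_new`, so new features get a weight as soon as they first appear.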
Zhang, Q, Zhang, Q, Long, G, Zhang, P & Zhang, C 2016, 'Exploring heterogeneous product networks for discovering collective marketing hyping behavior', Advances in Knowledge Discovery and Data Mining - LNCS, Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Auckland, New Zealand, pp. 40-51.
© Springer International Publishing Switzerland 2016. Online spam comments often misguide users during online shopping. Existing online spam detection methods rely on semantic clues, behavioral footprints, and relational connections between users in review systems. Although these methods can successfully identify spam activities, evolving fraud strategies can escape the detection rules by purchasing positive comments from massive numbers of random users, i.e., the user Cloud. In this paper, we study a new problem, Collective Marketing Hyping detection, for detecting spam comments generated from the user Cloud. It is defined as detecting a group of marketing hyping products with untrustworthy marketing promotion behaviour. We propose a new learning model that uses heterogeneous product networks extracted from product review systems. Our model aims to mine a group of hyping activities, which differs from existing models that only detect a single product with hyping activities. We show the existence of Collective Marketing Hyping behavior in real-life networks. Experimental results demonstrate that the product information network can effectively detect intentionally fraudulent product promotions.
Hu, R, Pan, S, Long, G, Zhu, X, Jiang, J & Zhang, C 2016, 'Co-clustering enterprise social networks', 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 107-114.
Jiang, X, Liu, W, Cao, L & Long, G 2015, 'Coupled Collaborative Filtering for Context-aware Recommendation', AAAI Publications, Twenty-Ninth AAAI Conference on Artificial Intelligence, Student Abstracts, AAAI Conference on Artificial Intelligence, AAAI, Austin, Texas, USA, pp. 4172-4173.
Context-aware features have been widely recognized as important factors in recommender systems. However, as a major technique in recommender systems, traditional Collaborative Filtering (CF) does not provide a straightforward way of integrating context-aware information into personal recommendation. We propose a Coupled Collaborative Filtering (CCF) model to measure the contextual information and use it to improve recommendations. In the proposed approach, coupled similarity is computed from inter-item, intra-context and inter-context interactions among item, user and context-aware factors. Experiments based on different types of CF models demonstrate the effectiveness of our design.
Jiang, X, Peng, X & Long, G 2015, 'Discovering sequential rental patterns by fleet tracking', Data Science (LNCS), International Conference on Data Science, Springer, Sydney, Australia, pp. 42-49.
© Springer International Publishing Switzerland 2015. As one of the most well-known methods for customer analysis, sequential pattern mining generally focuses on customer business transactions to discover their behaviors. However, in the real-world rental industry, behaviors are usually linked to other factors related to the actual circumstances of the equipment. Fleet tracking factors, such as location and usage, have been widely considered as important features to improve work performance and predict customer preferences. In this paper, we propose an innovative sequential pattern mining method to discover rental patterns by combining business transactions with the fleet tracking factors. A novel sequential pattern mining framework is designed to detect the effective items by utilizing both business transactions and fleet tracking information. Experimental results on real datasets demonstrate the effectiveness of our approach.
Unankard, S, Li, X & Long, G 2015, 'Invariant event tracking on social networks', Database Systems for Advanced Applications (LNCS), Database Systems for Advanced Applications, Springer, Hanoi, Vietnam, pp. 517-521.
© 2015, Springer International Publishing Switzerland. All rights reserved. When an event is emerging and actively discussed on social networks, its related issues may change from time to time. People may focus on different issues of an event at different times. An invariant event is an event with changing subsequent issues that last for a period of time. Examples of invariant events include government elections, natural disasters, and breaking news. This paper describes our demonstration system for tracking invariant events over social networks. Our system is able to summarize continuous invariant events and track their developments along a timeline. We propose invariant event detection using Clique Percolation Method (CPM) community mining. We also present an approach to event tracking based on the relationships between communities. Twitter messages related to the 2013 Australian Federal Election are used to demonstrate the effectiveness of our approach. As the first of its kind, our system provides a benchmark for further development of monitoring tools for social events.
Zhang, Q, Yu, L & Long, G 2015, 'SocialTrail: Recommending Social Trajectories from Location-Based Social Networks', Databases Theory and Applications (LNCS), Australasian Database Conference, Springer International Publishing, Melbourne, VIC, Australia, pp. 314-317.
Trajectory recommendation plays an important role in travel planning. Most existing systems are mainly designed for spot recommendation without an understanding of the overall trip, and tend to utilize homogeneous data only (e.g., geo-tagged images). Furthermore, they focus on the popularity of locations and fail to consider other important factors such as traveling time and sequence. In this paper, we propose a novel system that not only integrates geo-tagged images and check-in data to discover meaningful social trajectories that enrich the travel information, but also takes both temporal and spatial factors into consideration to make trajectory recommendation more accurate.
Jiang, J, Lu, J, Zhang, G & Long, G 2013, 'Optimal cloud resource auto-scaling for web applications', 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, IEEE, pp. 58-65.
Long, G & Jiang, J 2013, 'Graph based feature augmentation for short and sparse text classification', International Conference on Advanced Data Mining and Applications, Springer, Berlin, Heidelberg, pp. 456-467.
Long, G, Chen, L, Zhu, X & Zhang, C 2012, 'TCSST: transfer classification of short & sparse text using external data', Proc. of the 21st ACM Conference on Information and Knowledge Management (CIKM-12), ACM International Conference on Information and Knowledge Management, ACM, Maui, Hawaii, USA, pp. 764-772.
Short & sparse text, such as search snippets, micro-blogs and product reviews, is becoming more prevalent on the web. Accurately classifying short & sparse text has emerged as an important yet challenging task. Existing work has considered utilizing external data (e.g. Wikipedia) to alleviate data sparseness, by appending topics detected from external data as new features. However, training a classifier on features concatenated from different spaces is not easy, considering that the features have different physical meanings and different significance to the classification task. Moreover, it exacerbates the "curse of dimensionality" problem. In this study, we propose a transfer classification method, TCSST, to exploit the external data to tackle the data sparsity issue. The transfer classifier is learned in the original feature space. Considering that the labels of the external data may not be readily available or sufficient, TCSST further exploits the unlabeled external data to aid the transfer classification. We develop novel strategies to allow TCSST to iteratively select high-quality unlabeled external data to help with the classification. We evaluate the performance of TCSST on both benchmark and real-world data sets. Our experimental results demonstrate that the proposed method is effective in classifying very short & sparse text, consistently outperforming existing and baseline methods.
Dr. Long is managing more than $2m in external research grants, including two ARC Linkage projects and three research contract projects.
His industry partners include:
1) Australia Federal Department of Health (2016 - current)
2) Coates Hire Pty Ltd (the largest Australian rental company) (2012 - 2015)
3) Mission Australia Pty Ltd (2014 - current)
4) Australia Research Alliance for Children & Youth (2015 - current)
5) Global Business College Australia (2015 - current)
6) MakeMagic Australia Pty Ltd. (2016 - current)
7) Gubei Tech Co. Ltd. (China) (2016 - current)