Dr Jing Jiang is currently a Lecturer at the UTS Priority Research Centre for Artificial Intelligence (CAI), School of Software, Faculty of Engineering and Information Technology (FEIT) at the University of Technology Sydney (UTS), Australia. She received a PhD degree in Information Technology from UTS in March 2015. Her research interests lie in data mining and machine learning applications, with a focus on deep reinforcement learning and sequential decision-making.
Can supervise: YES
- Data mining and machine learning applications
- Deep reinforcement learning
- Sequential decision making
- Resource optimisation
Jiang, J, Zhang, H, Pi, D & Dai, C 2019, 'A novel multi-module neural network system for imbalanced heartbeats classification', Expert Systems with Applications, vol. 1.
Jiang, X, Pan, S, Long, G, Xiong, F, Jiang, J & Zhang, C 2019, 'Cost-sensitive parallel learning framework for insurance intelligence operation', IEEE Transactions on Industrial Electronics, vol. 66, no. 12, pp. 9713-9723.
Recent advancements in artificial intelligence are providing the insurance industry with new opportunities to create tailored solutions and services based on newfound knowledge of consumers, and to execute enhanced operations and business functions. However, insurance data are heterogeneous, with imbalanced class distributions, low frequency, and high dimensionality, which presents four major challenges to machine learning in real-world business. Traditional machine learning algorithms can typically be applied only to standard data sets, which are normally homogeneous and balanced. In this paper, we focus on an efficient cost-sensitive parallel learning framework (CPLF) to enhance insurance operations with a deep learning approach that does not require preprocessing. Our approach comprises a novel, unified, end-to-end cost-sensitive parallel neural network that learns real-world heterogeneous data. A specifically designed cost-sensitive matrix then automatically generates a robust model for learning minority classifications, and the parameters of both the cost-sensitive matrix and the hybrid neural network are alternately but jointly optimized during training. We also study the CPLF-based architecture for a real-world insurance intelligence operation system, and demonstrate fraud detection and policy renewal experiments on this system. The results of comparative experiments on real-world insurance data sets reflecting actual business cases demonstrate the effectiveness of our design.
Liu, L, Zhou, T, Long, G, Jiang, J & Zhang, C 2018, 'MahiNet: A Neural Network for Many-Class Few-Shot Learning with Class Hierarchy'.
Xu, YL, Jiang, J & Li, Z 2011, 'Cyclic optimisation for localisation in freeform surface inspection', International Journal of Production Research, vol. 49, no. 2, pp. 361-374.
Increasing demands on the precision manufacturing of parts with freeform surfaces have been observed over the last several years. Although significant progress has been made in the precision machining of freeform surfaces, the inspection of such surfaces remains difficult.
Many algorithms for Knowledge-Based Question Answering (KBQA) depend on semantic parsing, which translates a question to its logical form. When only weak supervision is provided, it is usually necessary to search valid logical forms for model training. However, a complex question typically involves a huge search space, which creates two main problems: 1) the solutions limited by computation time and memory usually reduce the success rate of the search, and 2) spurious logical forms in the search results degrade the quality of training data. These two problems lead to a poorly-trained semantic parsing model. In this work, we propose an effective search method for weakly supervised KBQA based on operator prediction for questions. With the search space constrained by predicted operators, sufficient search paths can be explored, more valid logical forms can be derived, and operators possibly causing spurious logical forms can be avoided. As a result, a larger proportion of questions in a weakly supervised training set are equipped with logical forms, and fewer spurious logical forms are generated. Such high-quality training data directly contributes to a better semantic parsing model. Experimental results on one of the largest KBQA datasets (i.e., CSQA) verify the effectiveness of our approach: improving the precision from 67% to 72% and the recall from 67% to 72% in terms of the overall score.
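The operator-constrained search described above can be illustrated with a toy sketch (pure Python; the operator inventory and the flat sequence-enumeration search are hypothetical simplifications, not the paper's actual logical-form language):

```python
from itertools import product

def enumerate_logical_forms(operators, max_len):
    """Enumerate every operator sequence up to max_len (a toy search space)."""
    forms = []
    for length in range(1, max_len + 1):
        forms.extend(product(operators, repeat=length))
    return forms

# Hypothetical operator inventory for a KBQA logical-form language.
ALL_OPS = ["select", "filter", "count", "union", "intersect", "diff"]

# Unconstrained search: the space grows exponentially with sequence length.
full_space = enumerate_logical_forms(ALL_OPS, max_len=3)

# Constraining the search to operators predicted from the question shrinks
# the space sharply and excludes operators likely to yield spurious forms.
predicted_ops = ["select", "filter", "count"]
pruned_space = enumerate_logical_forms(predicted_ops, max_len=3)

print(len(full_space), len(pruned_space))  # 258 39
```

Within the same search budget, the pruned space lets far more paths be explored per question, which is the source of the precision and recall gains reported above.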
For the time series classification task using a 1D-CNN, the selection of kernel size is critically important to ensure the model can capture the salient signal at the right scale from a long time series. Most existing work on 1D-CNNs treats the kernel size as a hyper-parameter and tries to find a proper kernel size through a grid search, which is time-consuming and inefficient. This paper theoretically analyses how kernel size impacts the performance of a 1D-CNN. Considering the importance of kernel size, we propose a novel Omni-Scale 1D-CNN (OS-CNN) architecture to capture the proper kernel size during the model learning period. A specific design for kernel size configuration is developed which enables us to assemble very few kernel-size options to represent more receptive fields. The proposed OS-CNN method is evaluated on the UCR archive with 85 datasets. The experimental results demonstrate that our method is a stronger baseline on multiple performance indicators, including the critical difference diagram, counts of wins, and average accuracy. We have also published the experimental source code at GitHub (https://github.com/Wensi-Tang/OS-CNN/).
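The multi-scale idea can be sketched in a few lines of NumPy (an illustrative toy with random kernels; the actual OS-CNN learns its kernels and derives the kernel-size set from a specific configuration scheme not reproduced here):

```python
import numpy as np

def conv1d_same(x, kernel):
    """'Same'-padded 1D convolution of a single-channel series."""
    pad = len(kernel) // 2
    padded = np.pad(x, (pad, len(kernel) - 1 - pad))
    return np.convolve(padded, kernel, mode="valid")

def omni_scale_features(x, kernel_sizes, seed=0):
    """Apply kernels of several sizes in parallel and stack the feature maps,
    so a single layer covers many receptive fields at once."""
    rng = np.random.default_rng(seed)
    maps = [conv1d_same(x, rng.standard_normal(k)) for k in kernel_sizes]
    return np.stack(maps)  # shape: (num_scales, len(x))

series = np.sin(np.linspace(0, 8 * np.pi, 128))
feats = omni_scale_features(series, kernel_sizes=[3, 5, 7, 11])
print(feats.shape)  # (4, 128)
```

Stacking feature maps from several kernel sizes is what removes the need to grid-search a single "right" kernel size in advance.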
Ji, S, Long, G, Pan, S, Zhu, T, Jiang, J, Wang, S & Li, X, 'Knowledge Transferring via Model Aggregation for Online Social Care'.
The Internet and the Web are being increasingly used in proactive social care to provide people, especially the vulnerable, with a better life and services, and the derived social services generate enormous amounts of data. However, the strict protection of privacy turns each user's data into an isolated island and limits the predictive performance of standalone clients. To enable effective proactive social care and knowledge sharing among intelligent agents, this paper develops a knowledge transferring framework via model aggregation. Under this framework, distributed clients perform on-device training, and a third-party server integrates multiple clients' models and redistributes them to the clients to transfer knowledge among users. To improve the generalizability of the knowledge sharing, we further propose a novel model aggregation algorithm, namely the average difference descent aggregation (AvgDiffAgg for short). To evaluate the effectiveness of the learning algorithm, we use a case study on the early detection and prevention of suicidal ideation; the experimental results on four datasets derived from social communities demonstrate the effectiveness of the proposed learning method.
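The server-side aggregation step can be sketched as follows (a minimal NumPy sketch using plain coordinate-wise averaging, FedAvg-style; the paper's AvgDiffAgg algorithm refines this update and is not reproduced here):

```python
import numpy as np

def aggregate(client_weights):
    """Server-side step: average each parameter tensor across clients.
    (Plain averaging; AvgDiffAgg uses an average-difference refinement.)"""
    return [np.mean(np.stack(layer), axis=0) for layer in zip(*client_weights)]

# Three clients, each holding two parameter tensors trained on-device;
# only model parameters, never raw user data, reach the server.
clients = [
    [np.full((2, 2), 1.0), np.full(3, 0.0)],
    [np.full((2, 2), 2.0), np.full(3, 3.0)],
    [np.full((2, 2), 3.0), np.full(3, 6.0)],
]

# The aggregated model is redistributed to the clients for the next round.
global_model = aggregate(clients)
print(global_model[0][0, 0], global_model[1][0])  # 2.0 3.0
```

Because only parameters travel between client and server, each user's data stays on-device, which is the privacy property the framework relies on.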
Li, Y, Long, G, Shen, T, Zhou, T, Yao, L, Huo, H & Jiang, J, 'Self-Attention Enhanced Selective Gate with Entity-Aware Embedding for Distantly Supervised Relation Extraction'.
Distantly supervised relation extraction intrinsically suffers from noisy labels due to the strong assumption of distant supervision. Most prior works adopt a selective attention mechanism over the sentences in a bag to denoise wrongly labeled data, which however can be incompetent when there is only one sentence in a bag. In this paper, we propose a brand-new light-weight neural framework to address the distantly supervised relation extraction problem and alleviate the defects of the previous selective attention framework. Specifically, in the proposed framework, 1) we use an entity-aware word embedding method to integrate both relative position information and head/tail entity embeddings, aiming to highlight the essence of entities for this task; 2) we develop a self-attention mechanism to capture the rich contextual dependencies as a complement to the local dependencies captured by a piecewise CNN; and 3) instead of using selective attention, we design a pooling-equipped gate, which is based on rich contextual representations, as an aggregator to generate the bag-level representation for final relation classification. Compared to selective attention, one major advantage of the proposed gating mechanism is that it performs stably and promisingly even if only one sentence appears in a bag, and thus keeps the consistency across all training examples. The experiments on the NYT dataset demonstrate that our approach achieves a new state-of-the-art performance in terms of both AUC and top-n precision metrics.
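A pooling-equipped gate of the kind described in point 3) can be sketched as follows (NumPy; `W` and `b` stand in for hypothetical learned gate parameters, and the real model computes much richer contextual features):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_bag_aggregation(sentence_reprs, W, b):
    """Aggregate a bag of sentence representations with a pooling-based gate:
    a sigmoid gate computed from the mean-pooled context scales each feature.
    Unlike selective attention, this behaves identically for one-sentence bags."""
    pooled = sentence_reprs.mean(axis=0)  # bag-level context vector
    gate = sigmoid(W @ pooled + b)        # feature-wise gate in (0, 1)
    return gate * pooled

rng = np.random.default_rng(1)
one_sentence_bag = rng.standard_normal((1, 8))  # a bag with a single sentence
W, b = rng.standard_normal((8, 8)), np.zeros(8)
out = gated_bag_aggregation(one_sentence_bag, W, b)
print(out.shape)  # (8,)
```

With only one sentence in the bag, mean pooling reduces to that sentence's representation, so the gate still produces a well-defined bag-level vector, which is the stability property claimed above.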
Ji, S, Pan, S, Long, G, Li, X, Jiang, J & Huang, Z 2019, 'Learning Private Neural Language Modeling with Attentive Aggregation', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Peng, X, Long, G, Pan, S, Jiang, J & Niu, Z 2019, 'Attentive Dual Embedding for Understanding Medical Concept in Electronic Health Record', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Electronic health records contain a wealth of information on a patient's healthcare over many visits, such as diagnoses, treatments, drugs administered, and so on. The untapped potential of these data in healthcare analytics is vast. However, given that much of medicine is a cause-and-effect science, new embedding methods are required to ensure the learned representations reflect the comprehensive interplays between medical concepts and their relationships over time. Unlike one-hot encoding, a distributed representation should preserve these complex interactions as high-quality inputs for machine learning-based healthcare analytics tasks. Therefore, we propose a novel attentive dual embedding method called MC2Vec. MC2Vec captures the proximity relationships between medical concepts through a two-step optimization framework that recursively refines the embedding for superior output. The framework comprises a Skip-gram model to generate the initial embedding and an attentive CBOW model to fine-tune the embedding with temporal information gleaned from sequences of patient visits. Experiments with two public datasets demonstrate that MC2Vec produces embeddings of higher quality than five state-of-the-art methods.
Peng, X, Long, G, Shen, T, Wang, S, Jiang, J & Blumenstein, M 2019, 'Temporal Self-Attention Network for Medical Concept Embedding', 19th IEEE International Conference on Data Mining (ICDM), Beijing, China.
In longitudinal electronic health records (EHRs), the event records of a patient are distributed over a long period of time, and the temporal relations between the events reflect sufficient domain knowledge to benefit prediction tasks such as the rate of inpatient mortality. Medical concept embedding is a feature extraction method that transforms a set of medical concepts with a specific time stamp into a vector, which is then fed into a supervised learning algorithm. The quality of the embedding significantly determines the learning performance over the medical data. In this paper, we propose a medical concept embedding method based on applying a self-attention mechanism to represent each medical concept. We propose a novel attention mechanism which captures the contextual information and temporal relationships between medical concepts. A light-weight neural net, "Temporal Self-Attention Network (TeSAN)", is then proposed to learn medical concept embedding based solely on the proposed attention mechanism. To test the effectiveness of our proposed methods, we have conducted clustering and prediction tasks on two public EHRs datasets, comparing TeSAN against five state-of-the-art embedding methods. The experimental results demonstrate that the proposed TeSAN model is superior to all the compared methods. To the best of our knowledge, this work is the first to exploit temporal self-attentive relations between medical events.
Di Wu, MB 2019, 'Feature-Dependent Graph Convolutional Autoencoders with Adversarial Training Methods', The 2019 International Joint Conference on Neural Networks (IJCNN 2019).
Ji, S, Long, G, Pan, S, Zhu, T, Jiang, J & Wang, S 2018, 'Detecting Suicidal Ideation with Data Protection in Online Communities', Database Systems for Advanced Applications (LNCS), International Conference on Database Systems for Advanced Applications, Springer, Chiang Mai, Thailand, pp. 225-229.
Recent advances in Artificial Intelligence empower proactive social services that use virtual intelligent agents to automatically detect people's suicidal ideation. Conventional machine learning methods require a large amount of individual data to be collected from users' Internet activities, smart phones and wearable healthcare devices, and amassed in a central location. This centralized setting raises significant privacy and data misuse concerns, especially where vulnerable people are concerned. To address this problem, we propose a novel data-protecting solution to learn a model. Instead of asking users to share all their personal data, our solution trains a local data-preserving model for each user, which shares only its own parameters with the server rather than the user's personal information. To optimize the model's learning capability, we have developed a novel updating algorithm, called average difference descent, to aggregate parameters from different client models. An experimental study using real-world online social community datasets has been included to mimic the scenario of private communities for suicide discussion. The results of the experiments demonstrate the effectiveness of our technology solution and pave the way for mental health service providers to apply this technology to real applications.
Chen, F, Pan, S, Jiang, J, Huo, H & Long, G 2019, 'DAGCN: Dual Attention Graph Convolutional Networks', The 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.
Liu, L, Zhou, T, Long, G, Yao, L, Jiang, J & Zhang, C 2019, 'Prototype Propagation Networks (PPN) for Weakly-supervised Few-shot Learning on Category Graph', The 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 2019, 'Tensorized Self-Attention: Efficiently Modeling Pairwise and Global Dependencies Together', Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, USA.
Wang, C, Pan, S, Hu, R, Long, G, Jiang, J & Zhang, C 2019, 'Attributed Graph Clustering: A Deep Attentional Embedding Approach', The 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China.
Jiang, X, Pan, S, Jiang, J & Long, G 2018, 'Cross-domain deep learning approach for multiple financial market prediction', 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-8.
Over recent decades, globalization has resulted in a steady increase in cross-border financial flows around the world. To build an abstract representation of a real-world financial market situation, we structure the fundamental influences among homogeneous and heterogeneous markets with three types of correlations: the inner-domain correlation between homogeneous markets in various countries, the cross-domain correlation between heterogeneous markets, and the time-series correlation between current and past markets. Such types of correlations in global finance challenge traditional machine learning approaches due to model complexity and nonlinearity. In this paper, we propose a novel cross-domain deep learning approach (Cd-DLA) to learn real-world complex correlations for multiple financial market prediction. Based on recurrent neural networks, which capture the time-series interactions in financial data, our model utilizes the attention mechanism to analyze the inner-domain and cross-domain correlations, and then aggregates all of them for financial forecasting. Experimental results on ten years of financial data from the currency and stock markets of three countries demonstrate the performance of our approach over other baselines.
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 2018, 'Fast Directional Self-Attention Mechanism', arXiv preprint arXiv:1805.00912.
Jiang, X, Pan, S, Long, G, Chang, J, Jiang, J & Zhang, C 2018, 'Cost-sensitive hybrid neural networks for heterogeneous and imbalanced data', 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-8.
Analyzing accumulated data has recently attracted huge attention for its ability to generate value by identifying useful information and providing an edge in global business competition. However, heterogeneous data and imbalanced class distribution present two major challenges to machine learning with real-world business data. Traditional machine learning algorithms can typically only be applied to standard data sets, which are normally homogeneous and balanced. These algorithms narrow complex data into a homogeneous, balanced data space, an inefficient process that requires a significant amount of pre-processing. In this paper, we focus on an efficient solution to the challenges of heterogeneous and imbalanced data sets that does not require pre-processing. Our approach comprises a novel, unified, end-to-end cost-sensitive hybrid neural network that learns real-world heterogeneous data via a parallel network architecture. A specifically-designed cost-sensitive matrix then automatically generates a robust model for learning minority classifications, and the parameters of both the cost-sensitive matrix and the hybrid neural network are alternately but jointly optimized during training. The results of comparative experiments on six real-world data sets reflecting actual business cases, including insurance fraud detection and mobile customer demographics, indicate that the proposed approach demonstrates superior performance over baseline procedures.
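The effect of cost-sensitive weighting can be illustrated with a minimal sketch (NumPy; a fixed per-class cost vector stands in for the paper's cost-sensitive matrix, which is learned jointly with the network):

```python
import numpy as np

def cost_sensitive_loss(probs, labels, class_costs):
    """Cross-entropy in which each example is weighted by the cost assigned to
    its true class, so mistakes on the minority class are penalized more."""
    eps = 1e-12
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(class_costs[labels] * -np.log(picked + eps)))

# Predicted class probabilities for three examples; class 1 is the minority.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4]])
labels = np.array([0, 1, 1])

uniform = cost_sensitive_loss(probs, labels, np.array([1.0, 1.0]))
weighted = cost_sensitive_loss(probs, labels, np.array([1.0, 5.0]))
print(uniform < weighted)  # True: minority-class errors now dominate the loss
```

Raising the cost of the minority class steers gradient descent toward reducing minority-class errors, which is what lets the model learn from imbalanced data without resampling-based pre-processing.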
Pan, S, Hu, R, Long, G, Jiang, J, Yao, L & Zhang, C 2018, 'Adversarially Regularized Graph Autoencoder for Graph Embedding', IJCAI 2018, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 2609-2615.
Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
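The reconstruction step of such graph autoencoders commonly uses an inner-product decoder, which can be sketched as follows (NumPy; the toy node embeddings are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode_adjacency(Z):
    """Inner-product decoder: the probability of an edge (i, j) is
    sigmoid(z_i . z_j), reconstructing graph structure from embeddings."""
    return sigmoid(Z @ Z.T)

# Two tight clusters of node embeddings: within-cluster edge probabilities
# come out high, cross-cluster probabilities low.
Z = np.array([[ 2.0,  0.0],
              [ 2.1,  0.1],
              [-2.0,  0.0],
              [-1.9, -0.1]])
A_hat = decode_adjacency(Z)
print(A_hat[0, 1] > 0.9, A_hat[0, 2] < 0.1)  # True True
```

The adversarial regularizer in ARGA/ARVGA acts on the encoder side, pushing the latent codes `Z` toward a prior distribution before this decoder reconstructs the adjacency matrix.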
Shen, T, Zhou, T, Long, G, Jiang, J & Zhang, C 2018, 'Bi-directional block self-attention for fast and memory-efficient sequence modeling', International Conference on Learning Representations (ICLR), Vancouver, Canada.
Shen, T, Zhou, T, Long, G, Jiang, J, Pan, S & Zhang, C 2018, 'DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding', Proceedings of the 32nd AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 5446-5455.
Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)," is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.
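A simplified sketch of directional self-attention is shown below (NumPy; this uses plain scaled dot-product scores with a triangular mask, whereas DiSAN's actual attention is multi-dimensional, i.e. feature-wise):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def directional_self_attention(X, forward=True):
    """Self-attention with a directional mask: each token attends only to
    itself and earlier tokens (forward) or itself and later tokens (backward),
    so temporal order is encoded without any RNN or CNN."""
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)
    mask = np.tril(np.ones((n, n))) if forward else np.triu(np.ones((n, n)))
    scores = np.where(mask.astype(bool), scores, -1e9)  # block disallowed pairs
    return softmax(scores, axis=-1) @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))  # 5 tokens, 4-dimensional features
out = directional_self_attention(X)
print(out.shape)  # (5, 4)
```

Because every row of the attention matrix is computed independently, the whole operation is a few matrix products and parallelizes trivially, which is the efficiency advantage over recurrent models noted above.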
Shen, T, Zhou, T, Long, G, Jiang, J, Wang, S & Zhang, C 2018, 'Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling', IJCAI 2018, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 4345-4352.
Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both the Stanford Natural Language Inference (SNLI) and the Sentences Involving Compositional Knowledge (SICK) datasets.
Hu, R, Pan, S, Jiang, J & Long, G 2017, 'Graph Ladder Networks for Network Classification', CIKM 2017: ACM International Conference on Information and Knowledge Management, ACM, pp. 2103-2106.
Wang, C, Pan, S, Long, G, Zhu, X & Jiang, J 2017, 'MGAE: Marginalized graph autoencoder for graph clustering', Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, ACM, pp. 889-898.
Bai, Y, Wang, H, Wu, J, Zhang, Y, Jiang, J & Long, G 2016, 'Evolutionary lazy learning for Naive Bayes classification', 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 3124-3129.
Hu, R, Pan, S, Long, G, Zhu, X, Jiang, J & Zhang, C 2016, 'Co-clustering enterprise social networks', 2016 International Joint Conference on Neural Networks (IJCNN), IEEE, pp. 107-114.
Long, G & Jiang, J 2013, 'Graph based feature augmentation for short and sparse text classification', International Conference on Advanced Data Mining and Applications, Springer, Berlin, Heidelberg, pp. 456-467.
Jiang, J, Lu, J, Zhang, G & Long, G 2013, 'Optimal cloud resource auto-scaling for web applications', 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, IEEE, pp. 58-65.
Jiang, J, Lu, J & Zhang, G 2011, 'An innovative self-adaptive configuration optimization system in cloud computing', 2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing (DASC), IEEE, pp. 621-627.