Training the next generation
One objective of the Laureate project is to develop Australia’s capability in artificial intelligence, particularly its machine learning research capacity, by training the next generation of researchers.
This section highlights completed and ongoing PhD research undertaken within the Laureate project that addresses challenging and emerging issues in the field. The research topics and brief summaries are outlined below.
Deep Neural Networks for Multi-Source Transfer Learning
Summary
Transfer learning is attracting considerable attention due to its ability to leverage knowledge previously acquired in a source domain to assist in completing a task in a similar target domain. Many existing transfer learning methods deal with single-source transfer learning, but rarely consider that information from a single source can be insufficient for the target task.
In addition, most transfer learning methods assume that the source and target domains share the same label space; in practice, however, source domains that share the target domain’s label space may not exist. Third, data privacy and security have become increasingly prominent concerns in real-world applications, which means traditional transfer learning that relies on data matching cannot be applied.
To address these problems, this thesis develops a series of methods for transfer learning with multiple source domains. To measure the contributions of the source domains, multi-source contribution learning and dynamic classifier alignment methods are developed. To define what to transfer, a sample and source distillation method is proposed. To address transfer learning without access to source data, a general auxiliary model and a fuzzy rule-based model are explored under closed-set, partial and open-set settings. Finally, universal domain adaptation is tackled by designing a model flexible enough to handle multiple source domains with homogeneous and heterogeneous label spaces.
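The idea of weighting source domains by their estimated contribution can be illustrated with a minimal sketch (not the thesis’s actual algorithms), assuming each pre-trained source model exposes hard predictions and class-probability outputs, and that a small labeled target sample is available:

```python
import numpy as np

def source_weights(source_preds, target_labels):
    """Weight each source model by its accuracy on a small labeled
    target sample; a softmax favours better-aligned sources."""
    target_labels = np.asarray(target_labels)
    accs = np.array([np.mean(np.asarray(p) == target_labels)
                     for p in source_preds])
    exp = np.exp(accs - accs.max())  # numerically stable softmax
    return exp / exp.sum()

def combine(source_probs, weights):
    """Weighted average of per-source class-probability matrices
    (shape: n_sources x n_samples x n_classes)."""
    return np.tensordot(weights, np.asarray(source_probs), axes=1)
```

Here `source_preds` and `source_probs` are hypothetical stand-ins for the outputs of independently trained source classifiers; the thesis’s contribution-learning and classifier-alignment methods are considerably more sophisticated.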
Handling Concept Drift Using the Correlation Between Multiple Data Streams
Summary
Concept drift is a major issue in handling streaming data in machine learning. To date, research on concept drift has considered data streams separately, ignoring the correlations between them. Motivated by this gap, this research proposes four methods that exploit the correlations between data streams. A concept drift adaptation method is first proposed to overcome the insufficient-training problem caused by scarce newly arrived data.
By introducing correlations between multiple data streams, a multi-stream concept drift handling framework is then proposed to deal with concept drift in multi-stream environments. Next, Evolutionary Regressor Chains are developed to track the correlations between multiple data streams. Lastly, a concept drift adaptation strategy for neural network classifiers is developed for circumstances in which the data streams have different feature spaces. Extensive experiments have been conducted to evaluate the developed methods.
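As a point of reference for what these methods build on, a basic single-stream drift check can be as simple as comparing a classifier’s recent error rate against its historical error rate. The sketch below is illustrative only and assumes a stream of 0/1 prediction errors:

```python
import numpy as np

def detect_drift(errors, window=50, threshold=3.0):
    """Flag drift when the mean error rate of the most recent window
    exceeds the historical mean by `threshold` standard errors
    (a z-test style check on the stream of 0/1 errors)."""
    errors = np.asarray(errors, dtype=float)
    if len(errors) < 2 * window:
        return False  # not enough history to compare against
    hist, recent = errors[:-window], errors[-window:]
    mu, sigma = hist.mean(), hist.std() + 1e-12  # avoid division by zero
    z = (recent.mean() - mu) / (sigma / np.sqrt(window))
    return bool(z > threshold)
```

A detector like this treats each stream in isolation, which is precisely the limitation the multi-stream frameworks above are designed to overcome.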
Adaptive Learning under Concept Drift for Multiple Data Streams
Summary
In real-world intelligent systems, it is common to encounter scenarios where multiple data streams are generated simultaneously. Each of these data streams is prone to changes in its underlying distribution, leading to concept drift. Despite being associated with the same task, these data streams often exhibit distinct distributions due to varying data sources. Hence, there is rising interest in developing efficient learning techniques for analyzing multiple data streams with concept drift in non-stationary environments.
This research focuses on two main tasks involving multiple data streams: (i) multi-stream classification, which aims to predict the unlabeled target stream by adaptively transferring knowledge from labeled source streams in the face of non-stationary processes with concept drift; and (ii) exploring the concept drift problem in more complex intelligent systems.
Deep Reinforcement Learning in Non-stationary Environments
Summary
Deep reinforcement learning has demonstrated superior performance in various domains, such as recommender systems, health operations and autonomous driving. Most traditional deep reinforcement learning methods can be characterized as the search for a policy that obtains the highest cumulative reward in an unknown but stationary environment with fixed state transitions and reward functions. However, this assumption does not always hold in practice; many environments are non-stationary, with abrupt and unpredictable change points.
For example, when a well-trained deep reinforcement learning policy is applied to an outdoor robot that may encounter different terrains and enter unlit caves, the previously optimal policy may make mistakes or even fail. In such environments, mistakes will be made repeatedly if the algorithm does not identify the change and actively adapt to it. To address this problem, this thesis proposes several methods focused on deep reinforcement learning in challenging environments with time-varying non-stationarity.
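A toy illustration of the problem (not one of the thesis’s methods) is an epsilon-greedy bandit agent that tracks a running average of its rewards and boosts exploration when it observes a sharp drop, treating the drop as a suspected change point:

```python
import random

class AdaptiveEpsilonAgent:
    """Epsilon-greedy agent that re-explores aggressively when the
    observed reward drops sharply below its running average."""

    def __init__(self, n_actions, eps=0.1, alpha=0.2, drop=0.5):
        self.q = [0.0] * n_actions  # incremental action-value estimates
        self.eps, self.alpha, self.drop = eps, alpha, drop
        self.avg_reward = None

    def act(self):
        if random.random() < self.eps:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        self.q[action] += self.alpha * (reward - self.q[action])
        if self.avg_reward is not None and reward < self.avg_reward - self.drop:
            self.eps = 1.0  # suspected change point: explore aggressively
        else:
            self.eps = max(0.1, self.eps * 0.99)  # decay back to baseline
        prev = self.avg_reward if self.avg_reward is not None else reward
        self.avg_reward = 0.9 * prev + 0.1 * reward
```

Without a mechanism of this kind, a converged policy keeps exploiting stale value estimates after the environment changes, which is exactly the failure mode described above.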
Transfer Learning with Imprecise Observations: Theory and Algorithms
Summary
In this research, we consider a new, realistic problem called transfer learning with imprecise observations (TLIMO), where the source or target domains contain only imprecise observations. To develop new theories and construct algorithms for addressing the TLIMO problem in various real-world scenarios, this thesis addresses four orthogonal problems:
- How to construct a theoretical foundation for imprecise data analysis and handle a simple problem called multi-class classification with imprecise observations (MCIMO);
- How to handle the TLIMO problem in the single-source-domain scenario;
- How to handle the multi-source transfer learning problem when the instances in the source or target domains are imprecise; and
- How to handle the universal domain adaptation (UniDA) problem when the instances in the source or target domains are imprecise.
Autonomous Learning for Multiple Data Streams under Concept Drift
Summary
With the increasing prevalence of streaming data in real-world applications, addressing concept drift has become a crucial challenge for maintaining model accuracy. While traditional machine learning models assume stationary data streams, real-world streams frequently exhibit concept drift: unpredictable changes in data distribution that undermine the effectiveness of conventional algorithms. Most existing studies focus on handling drift in single data streams, whereas real-world scenarios often involve multiple, interdependent streams, where isolated modeling neglects important cross-stream correlations.
This thesis proposes a series of effective frameworks for Autonomous Learning for Multiple Data Streams under Concept Drift (ALMCD). Our approach explicitly incorporates the dynamic relationships among multiple data streams, enabling the model to learn and adapt to both individual and joint distribution changes. By dynamically capturing inter-stream structural changes, our frameworks can automatically detect and adapt to various types of concept drift.
Experimental results demonstrate that our methods significantly improve real-time prediction accuracy and adaptability compared to traditional single-stream approaches. These findings provide a practical solution for robust data mining in complex, non-stationary environments such as transportation networks and weather prediction systems.
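The notion of tracking inter-stream structure can be illustrated with a toy check (a sketch under simplifying assumptions, not the ALMCD frameworks themselves) that compares the cross-stream Pearson correlation across two consecutive windows:

```python
import numpy as np

def correlation_shift(x, y, window=100):
    """Absolute change in Pearson correlation between two paired
    streams across the two most recent windows; a large value
    suggests the joint distribution, not just the marginals,
    has shifted."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if len(x) != len(y) or len(x) < 2 * window:
        raise ValueError("need two full windows of paired observations")
    r_old = np.corrcoef(x[-2 * window:-window], y[-2 * window:-window])[0, 1]
    r_new = np.corrcoef(x[-window:], y[-window:])[0, 1]
    return abs(r_new - r_old)
```

A single-stream drift detector would miss the case where each marginal distribution stays stable but the relationship between the streams reverses, which is the kind of joint change the statistic above responds to.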
Ongoing HDR Projects
- Structured Task Representations for Contextual Meta-Reinforcement Learning
- Graph Convolutional Multi-Agent Reinforcement Learning
- Enhancing Trustworthiness in Cross-Domain Recommender Systems
- Distribution Shift Detection: From Concept Drift to LLM Hallucinations