Dr Shaowu Liu is a postdoctoral research fellow in the School of Computer Science and Advanced Analytics Institute, University of Technology Sydney. His current research focuses on User Behavior Analytics, Interpretable Machine Learning, and Representation Learning of Knowledge Graphs. Besides research, he is also a data scientist with experience in FinTech, Renewable Energy, and Digital Health projects sponsored by companies and state governments.
Can supervise: YES
- User Behavior Analytics
- Interpretable Machine Learning
- Representation Learning of Knowledge Graphs
Dr Shaowu Liu teaches various computer science and data science subjects at UTS and previously at Deakin University:
- Analytics Capstone Project (2017, 2018, 2019) @ University of Technology Sydney
- Modern Data Science (2016) @ Deakin University
- Introduction to Computer Science (2016) @ Deakin University
- Enterprise Business Intelligence (2015) @ Deakin University
- Multimedia Delivery Systems (2014) @ Deakin University
- Multimedia Systems and Technology (2014, 2015) @ Deakin University
- Database and Information Retrieval (2014, 2015) @ Deakin University
- Data Structures and Algorithms (2013, 2014) @ Deakin University
Vo, NNY, He, X, Liu, S & Xu, G 2019, 'Deep learning for decision making and the optimization of socially responsible investments and portfolio', Decision Support Systems, vol. 124.
© 2019 Elsevier B.V. A socially responsible investment portfolio takes into consideration the environmental, social and governance aspects of companies. It has become an emerging topic for both financial investors and researchers recently. Traditional investment and portfolio theories, which are used for the optimization of financial investment portfolios, are inadequate for decision-making and the construction of an optimized socially responsible investment portfolio. In response to this problem, we introduced a Deep Responsible Investment Portfolio (DRIP) model that contains a Multivariate Bidirectional Long Short-Term Memory neural network, to predict stock returns for the construction of a socially responsible investment portfolio. The deep reinforcement learning technique was adapted to retrain neural networks and rebalance the portfolio periodically. Our empirical data revealed that the DRIP framework could achieve competitive financial performance and better social impact compared to traditional portfolio models, sustainable indexes and funds.
Yan, Z, Liu, J & Liu, S 2019, 'DPWeVote: differentially private weighted voting protocol for cloud-based decision-making', Enterprise Information Systems, vol. 13, no. 2, pp. 236-256.
© 2018 Informa UK Limited, trading as Taylor & Francis Group. With the advent of Industry 4.0, cloud computing techniques have been increasingly adopted by industry practitioners to achieve better workflows. One important application is cloud-based decision-making, in which multiple enterprise partners need to arrive at an agreed decision. Such a cooperative decision-making problem is sometimes formulated as a weighted voting game, in which enterprise partners express 'YES/NO' opinions. Nevertheless, existing cryptographic approaches to the Cloud-Based Weighted Voting Game have restricted collusion tolerance and rely heavily on trusted servers, which are not always available. In this work, we consider the more realistic scenario of semi-honest cloud servers and partners, and assume maximal collusion tolerance. To resolve the privacy issues in such scenarios, the DPWeVote protocol is proposed, which incorporates the Randomized Response technique and consists of the following three phases: the Randomized Weights Collection phase, the Randomized Opinions Collection phase, and the Voting Results Release phase. Experiments on synthetic data have demonstrated that the proposed DPWeVote protocol retains an acceptable utility for decision-making while preserving privacy in a semi-honest environment.
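The Randomized Response technique named in the abstract above can be illustrated with a minimal Python sketch. This is not the DPWeVote protocol itself: it shows only the classic binary randomized-response primitive for unweighted YES/NO opinions, and the function names and the privacy parameter `epsilon` are assumptions introduced here for illustration.

```python
import math
import random


def randomize_vote(vote: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true vote with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return vote if rng.random() < p_truth else not vote


def estimate_yes_fraction(reports: list, epsilon: float) -> float:
    """Debias the observed YES rate to estimate the true YES fraction."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = pi * p + (1 - pi) * (1 - p); solve for pi.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def simulate(n_voters: int, true_yes_fraction: float, epsilon: float, seed: int = 0) -> float:
    """Hypothetical experiment: randomize every vote, then estimate the YES share."""
    rng = random.Random(seed)
    n_yes = int(n_voters * true_yes_fraction)
    votes = [True] * n_yes + [False] * (n_voters - n_yes)
    reports = [randomize_vote(v, epsilon, rng) for v in votes]
    return estimate_yes_fraction(reports, epsilon)
```

No single report reveals a partner's opinion with certainty, yet the aggregate estimate from `simulate` approaches the true YES share as the number of voters grows; a smaller `epsilon` gives stronger privacy at the cost of a noisier estimate.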
© 2018 John Wiley & Sons, Ltd. Jaccard Similarity has been widely used to measure the distance between two sets (or preference profiles) owned by two different users. Yet, in the private data collection scenario, the untrusted curator should be able to estimate only an approximately accurate Jaccard similarity of the involved users, without being allowed to access their preference profiles. This paper aims to address the above requirements by considering the local differential privacy model. To achieve this, we initially focused on a particular hash technique, MinHash, which was originally invented to estimate the Jaccard similarity efficiently. We designed the PrivMin algorithm to perturb the MinHash signature by adopting the Exponential mechanism, and built the Locally Differentially Private Jaccard Similarity Estimation (LDP-JSE) protocol to allow the untrusted curator to approximately estimate Jaccard similarity. Theoretical and empirical results demonstrate that the proposed protocol can retain a highly acceptable utility of the estimated similarity while preserving privacy.
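As background for the abstract above, the following Python sketch shows how plain (non-private) MinHash estimates Jaccard similarity from signatures. It does not implement PrivMin, the Exponential-mechanism perturbation, or the LDP-JSE protocol; the salted SHA-256 hash family and all function names are assumptions chosen for illustration.

```python
import hashlib


def minhash_signature(items: set, num_hashes: int = 256) -> list:
    """One minimum per salted hash function; `items` must be non-empty."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.sha256(f"{seed}:{item}".encode()).digest()[:8], "big"
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of signature positions where the two minima agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a: set, b: set) -> float:
    """Reference value |A ∩ B| / |A ∪ B| for comparison."""
    return len(a & b) / len(a | b)
```

For each hash function, the probability that two sets share the same minimum equals their Jaccard similarity, so the match rate across signature positions is an unbiased estimate whose variance shrinks as `num_hashes` grows.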
© 2016 The Author(s). A preference relation-based Top-N recommendation approach is proposed to capture both second-order and higher-order interactions among users and items. Traditionally, Top-N recommendation was achieved by predicting the item ratings first, and then inferring the item rankings, based on the assumption that explicit feedback such as ratings is available, and the assumption that optimizing the ratings is equivalent to optimizing the item rankings. Nevertheless, neither assumption always holds in real-world applications. The proposed approach drops these assumptions by exploiting preference relations, a more practical form of user feedback. Furthermore, the proposed approach enjoys the representational power of Markov Random Fields, so side information such as item and user attributes can be easily incorporated. Compared to related work, the proposed approach has the unique property of modeling both second-order and higher-order interactions among users and items. To the best of our knowledge, this is the first time both types of interactions have been captured in preference relation-based methods. Experimental results on public datasets demonstrate that both types of interactions have been properly captured, and significantly improved Top-N recommendation performance has been achieved.
Moonsamy, V, Rong, J & Liu, S 2014, 'Mining permission patterns for contrasting clean and malicious android applications', Future Generation Computer Systems: The International Journal of eScience, vol. 36, pp. 122-132.
Liu, S, Law, R, Rong, J, Li, G & Hall, J 2013, 'Analyzing changes in hotel customers' expectations by trip mode', International Journal of Hospitality Management, vol. 34, pp. 359-371.
Huy, QV, Li, G, Sukhorukova, NS, Beliakov, G, Liu, S, Philippe, C, Amiel, H & Ugon, A 2012, 'K-Complex Detection Using a Hybrid-Synergic Machine Learning Method', IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 42, no. 6, pp. 1478-1490.
Vu, HQ, Liu, S, Yang, X, Li, Z & Ren, Y 2012, 'Identifying microphone from noisy recordings by using representative instance one class-classification approach', Journal of Networks, vol. 7, no. 6, pp. 908-917.
Rapid growth of technical developments has created huge challenges for microphone forensics - a subcategory of audio forensic science, because of the availability of numerous digital recording devices and massive amounts of recording data. Demand for fast and efficient methods to assure the integrity and authenticity of information is becoming more and more important in criminal investigation. Machine learning has emerged as an important technique to support the audio analysis processes of microphone forensic practitioners. However, its application to real-life situations using supervised learning still faces great challenges due to the expense of collecting data and updating the system. In this paper, we introduce a new machine learning approach called One-class Classification (OCC) to be applied to microphone forensics; we demonstrate its capability on a corpus of audio samples collected from several microphones. In addition, we propose a representative instance classification framework (RICF) that can effectively improve the performance of OCC algorithms for noisy recording signals. Experiment results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis. © 2012 Academy Publisher.
Liu, S & Li, G 2018, 'Personalized Hotel Recommendation based on Social Networks' in Gursoy, D (ed), Routledge Handbook of Hospitality Marketing, Routledge, UK.
Recommender systems have become an important tool for users to identify interesting items as well as for businesses to promote their products to the right users. With the rapid development of social networks, travelers have started to seek recommendations and advice from web services such as TripAdvisor and Yelp. Although the initial purpose of travelers is to share their opinions on social networks, this provides an opportunity for hospitality businesses to learn about their customers' preferences. Given these data on preferences, recent advances in data science research have made it possible to build automatic recommender systems that can generate hotel recommendations tailored to each traveler. This chapter introduces the basic concepts and tools for creating hotel recommender systems.
Beliakov, G & Liu, S 2014, 'Parallel Monotone Spline Interpolation and Approximation on GPUs' in Couturier, R (ed), Designing Scientific Applications on GPUs, CRC Press, USA, pp. 295-310.
Monotonicity preserving interpolation and approximation have received substantial attention in the last thirty years because of their numerous applications in computer-aided design, statistics, and machine learning [9, 10, 19]. Constrained splines are particularly popular because of their flexibility in modeling different geometrical shapes, sound theoretical properties, and availability of numerically stable algorithms [9, 10, 26]. In this work we examine parallelization and adaptation for GPUs of a few algorithms of monotone spline interpolation and data smoothing, which arose in the context of estimating probability distributions.
Estimating Cumulative Probability distribution Functions (CDF) from data is quite common in data analysis. In our particular case we faced this problem in the context of partitioning univariate data with the purpose of efficient sorting. It was necessary to partition large data sets into chunks of approximately equal size, so that these chunks could be sorted independently and subsequently concatenated. In order to do that, the empirical CDF of the data was used to find the quantiles, which served to partition the data. The CDF was estimated from the data based on a number of pairs $(x_i, y_i)$, $i = 1, \ldots, n$, where $y_i$ was the proportion of data no larger than $x_i$. As data could come from a variety of distributions, a distribution-free nonparametric fitting procedure was required to interpolate the above pairs. Needless to say, the whole process was aimed at the GPU, and hence the use of the CPU for invoking serial algorithms had to be minimized.
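The partition-then-sort idea described above can be sketched serially in Python. This is an illustration under stated assumptions, not the chapter's method: it runs on the CPU rather than a GPU, it substitutes piecewise-linear interpolation for the monotone splines the chapter studies, and every function name and default parameter is hypothetical.

```python
import bisect
import random


def cdf_knots(sample, n_knots):
    """Pairs (x_i, y_i): y_i is the proportion of sample values <= x_i."""
    s = sorted(sample)
    m = len(s)
    n_knots = min(n_knots, m)
    knots = []
    for j in range(1, n_knots + 1):
        x = s[j * m // n_knots - 1]
        if knots and x <= knots[-1][0]:
            continue  # skip repeated x so the knots stay strictly increasing
        knots.append((x, bisect.bisect_right(s, x) / m))
    return knots


def quantile(knots, q):
    """Invert the piecewise-linear CDF through the knots at level q."""
    x_prev, y_prev = knots[0]
    if q <= y_prev:
        return x_prev
    for x, y in knots[1:]:
        if q <= y:
            t = (q - y_prev) / (y - y_prev)
            return x_prev + t * (x - x_prev)
        x_prev, y_prev = x, y
    return x_prev


def partition_sort(data, n_chunks, n_knots=64, sample_size=1000, seed=0):
    """Split data at estimated quantiles, sort each chunk, concatenate."""
    rng = random.Random(seed)
    sample = rng.sample(data, min(sample_size, len(data)))
    knots = cdf_knots(sample, n_knots)
    cuts = [quantile(knots, k / n_chunks) for k in range(1, n_chunks)]
    chunks = [[] for _ in range(n_chunks)]
    for v in data:
        # Chunk index = number of cut points that are <= v.
        chunks[bisect.bisect_right(cuts, v)].append(v)
    result = []
    for chunk in chunks:  # each chunk could be sorted independently
        result.extend(sorted(chunk))
    return result, [len(chunk) for chunk in chunks]
```

Because every value in one chunk is smaller than every value in the next, concatenating the independently sorted chunks yields a fully sorted sequence, and the quality of the CDF estimate only affects how evenly the chunk sizes are balanced.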
Biddle, R, Joshi, A, Liu, S, Paris, C & Xu, G 2020, 'Leveraging Sentiment Distributions to Distinguish Figurative from Literal Health Reports on Twitter', The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020, pp. 1217-1227.
© 2020 ACM. Harnessing data from social media to monitor health events is a promising avenue for public health surveillance. A key step is the detection of reports of a disease (referred to as 'health mention classification') amongst tweets that mention disease words. Prior work shows that figurative usage of disease words may prove to be challenging for health mention classification. Since the experience of a disease is associated with a negative sentiment, we present a method that utilises sentiment information to improve health mention classification. Specifically, our classifier for health mention classification combines pre-trained contextual word representations with sentiment distributions of words in the tweet. For our experiments, we extend a benchmark dataset of tweets for health mention classification, adding over 14k manually annotated tweets across diseases. We additionally annotate each tweet with a label that indicates if the disease words are used in a figurative sense. Our classifier outperforms current SOTA approaches in detecting both health-related and figurative tweets that mention disease words. We also show that tweets containing disease words are mentioned figuratively more often than in a health-related context, proving to be challenging for classifiers targeting health-related tweets.
Wang, X, Li, Q, Zhang, W, Xu, G, Liu, S & Zhu, W 2020, 'Joint Relational Dependency Learning for Sequential Recommendation', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 168-180.
© Springer Nature Switzerland AG 2020. Sequential recommendation leverages the temporal information of users' transactions as transition dependencies to better infer user preference, and has become increasingly popular in academic research and practical applications. Short-term transition dependencies contain the information of partial item orders, while long-term transition dependencies infer long-range user preference; the two dependencies are mutually restrictive and complementary. Although some work investigates unifying both long-term and short-term dependencies for better performance, it still neglects the fact that short-term interactions are multi-fold: they are either individual-level interactions or union-level interactions. Existing sequential recommendation methods mainly focus on a user's individual (i.e., individual-level) interactions but ignore the important collective influence at the union level. Since union-level interactions reflect that human decisions are made based on multiple items the user has already interacted with, ignoring such interactions can result in an inability to capture the collective influence between items. To alleviate this issue, we propose Joint Relational Dependency learning (JRD-L) for sequential recommendation, which exploits both long-term and short-term preferences at the individual level and the union level. Specifically, JRD-L combines long-term user preferences with short-term interests by measuring short-term pair relations at the individual level and the union level. Moreover, JRD-L can alleviate the sparsity problem of union-level interactions by adding more descriptive details to each item, which are carried by individual-level relations. Extensive numerical experiments demonstrate that JRD-L outperforms state-of-the-art baselines for sequential recommendation.
Yin, J, Liu, S, Li, Q & Xu, G 2019, 'Prediction and Analysis of Rumour's Impact on Social Media', BESC 2019 - 6th International Conference on Behavioral, Economic and Socio-Cultural Computing, Proceedings, International Conference on Behavioral, Economic and Socio-Cultural Computing, IEEE, Beijing, China.
© 2019 IEEE. Rumour, as an important form of social communication, has run through the whole evolutionary history of mankind. People maliciously disseminate rumours in order to increase awareness, slander others or cause panic. To address this issue, many researchers resort to detecting rumours on social media. However, rumour detection is not sufficient to eliminate the negative impact, which also requires official institutions to provide refutations. In practice, the number of rumours on social media is too large, and there is no need to refute rumours that attract little or no concern. Therefore, we need to evaluate the impact of rumours in advance. In this paper, we devise a rumour influence prediction model, RISM (Rumour Impact on Social Media), based on a popular rumour intensity formula to predict the impact of a newborn rumour. Extensive numerical experiments are carried out on real rumour data collected from Toutiao.com, which demonstrate the effectiveness of our proposed RISM model.
Zhao, M, Shu, Y, Liu, S & Xu, G 2019, 'Electricity Price Forecast using Meteorology data: A study in Australian Energy Market', BESC 2019 - 6th International Conference on Behavioral, Economic and Socio-Cultural Computing, Proceedings, International Conference on Behavioral, Economic and Socio-Cultural Computing, IEEE, Beijing, China.
© 2019 IEEE. Electricity price, a fundamental cost for every family, is an essential element of the electricity market. Adjustments to the electricity price can reflect changes in the relationship between electricity supply and demand. For electricity supply companies, an appropriately defined electricity price can ultimately determine the level of profit. On the other hand, an accurate prediction can help to seize opportunities in the electricity market. In this paper, we aim to predict the electricity price more accurately by leveraging data mining techniques. Our experiment on 12 months of electricity prices as well as climate data in New South Wales has achieved a promising prediction result.
Zhou, Z, Liu, S, Xu, G & Zhang, W 2019, 'On Completing Sparse Knowledge Base with Transitive Relation Embedding', Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI Press, Honolulu, Hawaii USA, pp. 3125-3132.
Multi-relation embedding is a popular approach to knowledge base completion that learns embedding representations of entities and relations to compute the plausibility of missing triplets. The effectiveness of the embedding approach depends on the sparsity of the KB and degrades for infrequent entities that appear only a few times. This paper addresses this issue by proposing a new model that exploits entity-independent transitive relation patterns, namely Transitive Relation Embedding (TRE). The TRE model alleviates the sparsity problem when predicting on infrequent entities while enjoying the generalisation power of embedding. Experiments on three public datasets against seven baselines showed the merits of TRE in terms of knowledge base completion accuracy as well as computational complexity.
Biddle, R, Liu, S & Xu, G 2018, 'Semi-Supervised Soft K-Means Clustering of Life Insurance Questionnaire Responses', Proceedings - 2018 5th International Conference on Behavioral, Economic, and Socio-Cultural Computing, BESC 2018, International Conference on Behavioral, Economic, and Socio-Cultural Computing, IEEE, Kaohsiung, Taiwan, pp. 30-31.
© 2018 IEEE. The life insurance questionnaire is a large document containing responses in a mixture of structured and unstructured data. The unstructured data poses issues for the user, in the form of extra input effort, and the insurance company, in the form of interpretation and analysis. In this work, we aim to address these problems by proposing a semi-supervised framework for clustering responses into categories using vector space embedding of responses and soft k-means clustering. Our experiments show that our method achieves adequate results. The resulting category clusters from our method can be used for analysis and to replace free text input questions with structured questions in the questionnaire.
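The soft k-means step mentioned in the abstract above can be sketched in Python as follows. This is generic, unsupervised soft k-means on small numeric vectors: the paper's semi-supervised framework and its response embeddings are not described in enough detail here to reproduce, and the function name and the stiffness parameter `beta` are assumptions for illustration.

```python
import math


def soft_kmeans(points, centers, beta=2.0, n_iter=50):
    """Alternate soft assignments and weighted mean updates.

    Returns the final centers and the responsibilities computed in the
    last assignment step (one row per point, one column per cluster).
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    resp = []
    for _ in range(n_iter):
        # Assignment step: responsibility proportional to exp(-beta * dist^2).
        resp = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            nearest = min(d2)  # subtract the minimum for numerical stability
            weights = [math.exp(-beta * (d - nearest)) for d in d2]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # Update step: each center becomes the responsibility-weighted mean.
        for k in range(len(centers)):
            mass = sum(r[k] for r in resp)
            if mass > 0.0:
                centers[k] = [
                    sum(r[k] * p[j] for r, p in zip(resp, points)) / mass
                    for j in range(dim)
                ]
    return centers, resp
```

Unlike hard k-means, each response contributes to every cluster in proportion to its responsibility, so ambiguous responses are shared between categories rather than forced into one; a larger `beta` makes the assignments approach hard clustering.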
Biddle, R, Liu, S, Tilocca, P & Xu, G 2018, 'Automated Underwriting in Life Insurance: Predictions and Optimisation', ADC 2018: Databases Theory and Applications (LNCS), Australasian Database Conference, Springer, Gold Coast, QLD, Australia, pp. 135-146.
Underwriting is an important stage in the life insurance process and is concerned with accepting individuals into an insurance fund and on what terms. It is a tedious and labour-intensive process for both the applicant and the underwriting team. An applicant must fill out a large survey containing thousands of questions about their life. The underwriting team must then process this application and assess the risks posed by the applicant and offer them insurance products as a result. Our work implements and evaluates classical data mining techniques to help automate some aspects of the process to ease the burden on the underwriting team as well as optimise the survey to improve the applicant experience. Logistic Regression, XGBoost and Recursive Feature Elimination are proposed as techniques for the prediction of underwriting outcomes. We conduct experiments on a dataset provided by a leading Australian life insurer and show that our early-stage results are promising and serve as a foundation for further work in this space.
Vo, NNY, Liu, S, He, X & Xu, G 2018, 'Multimodal Mixture Density Boosting Network for Personality Mining', Advances in Knowledge Discovery and Data Mining (LNCS), Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Melbourne, Australia, pp. 644-655.
Knowing people's personalities is useful in various real-world applications, such as personnel selection. Traditionally, we have to rely on qualitative methodologies, e.g. surveys or psychology tests, to determine a person's traits. However, recent advances in machine learning have made it possible to automate this process by inferring personalities from textual data. Despite their success, text-based methods ignore facial expressions and the way people speak, which can also carry important information about human characteristics. In this work, a personality mining framework is proposed to exploit all the information from videos, including visual, auditory, and textual perspectives. Using a state-of-the-art cascade network built on advanced gradient boosting algorithms, our proposed methodology achieves lower prediction errors than most current machine learning algorithms. Our multimodal mixture density boosting network performs especially well with small sample size datasets, which is useful for learning problems in psychology fields where big data is often not available.
Vo, NNY, Xu, G, Liu, S, Brownlow, EJ, Culbert, B & Chu, C 2018, 'Client Churn Prediction with Call Log Analysis', Database Systems for Advanced Applications, International Conference on Database Systems for Advanced Applications, Springer, Gold Coast, Australia, pp. 752-763.
Yin, J, Zhou, Z, Liu, S, Wu, Z & Xu, G 2018, 'Social Spammer Detection: A Multi-Relational Embedding Approach', Pacific-Asia Conference on Knowledge Discovery and Data, Springer Link, Melbourne, VIC, Australia.
Zhou, Z, Liu, S, Xu, G, Xie, X, Yin, J, Li, Y & Zhang, W 2018, 'Knowledge-based Recommendation with Hierarchical Collaborative Embedding', PAKDD 2018: Advances in Knowledge Discovery and Data Mining, Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Melbourne, Australia, pp. 222-234.
Data sparsity is a common issue in recommendation systems, particularly collaborative filtering. In real recommendation scenarios, user preferences are often quantitatively sparse because of the nature of the application. To address the issue, we proposed a knowledge graph-based semantic information enhancement mechanism to enrich the user preferences. Specifically, the proposed Hierarchical Collaborative Embedding (HCE) model leverages both network structure and textual information embedded in knowledge bases to supplement traditional collaborative filtering. The HCE model jointly learns the latent representations from user preferences, linkages between items and the knowledge base, as well as the semantic representations from the knowledge base. Experiment results on a GitHub dataset demonstrated that semantic information from the knowledge base has been properly captured, resulting in improved recommendation performance.
Liu, S, Pang, N, Xu, G & Liu, H 2017, 'Collaborative Filtering via Different Preference Structures', International Conference on Knowledge Science, Engineering and Management, Springer, Melbourne, Australia, pp. 309-321.
Recently, social network websites have started to provide third-party sign-in options via the OAuth 2.0 protocol. For example, users can log in to the Netflix website using their Facebook accounts. By using this service, accounts of the same user are linked together, and so is their information. This provides an opportunity to create more complete user profiles, leading to improved recommender systems. However, user opinions distributed over different platforms come in different preference structures, such as ratings, rankings, pairwise comparisons, voting, etc. As existing collaborative filtering techniques assume the homogeneity of preference structure, learning from different preference structures simultaneously remains a challenging task. In this paper, we propose a fuzzy preference relation-based approach to enable collaborative filtering via different preference structures. Experiment results on public datasets demonstrate that our approach can effectively learn from different preference structures, and shows strong resistance to the noise and biases introduced by cross-structure preference learning.
Liu, S, Xu, G, Zhu, X & Zhou, Z 2017, 'Towards Simplified Insurance Application via Sparse Questionnaire Optimization', 2017 International Conference on Behavioral, Economic, Socio-cultural Computing (BESC), International Conference on Behavioral, Economic, and Socio-Cultural Computing, IEEE, Poland.
Life insurance application requires in-person meetings with underwriters, tedious paperwork, and an average waiting period of six weeks before an offer can be made. This outdated process has become a barrier to broader consumer adoption, resulting in a large coverage gap. In this work, we aim to close this gap by leveraging data mining techniques to optimize the insurance questionnaire form. Our experiment on 10 years of insurance application data has identified that only ~2% of all questions have shown high relevancy to determining the risks of applicants, resulting in a significantly simplified questionnaire.
Pang, N, Zhu, D, Li, G & Liu, S 2017, 'WarnFi: Non-Invasive WiFi-based Abnormal Activity Sensing Using Non-parametric Model', IEEE Military Communications Conference, IEEE, Baltimore, MD, USA.
Zhou, Z, Xu, G, Zhu, X & Liu, S 2017, 'Latent Factor Analysis for Low-dimensional Implicit Preference Prediction', International Conference on Behavioral, Economic, Socio-cultural Computing, IEEE, Poland, pp. 1-2.
User preference prediction aims to predict a user's future preferences on a large number of items according to his/her preference history. To achieve this goal, many models have been proposed, but mainly for explicit preference data, such as 5-star ratings. Nevertheless, real-world data are often in implicit format, such as purchase actions, and the number of items is not always large. In this paper, we demonstrate the use of latent factor models for solving the task of predicting user preferences on implicit and low-dimensional datasets.
Zhu, D, Pang, N, Li, G & Liu, S 2017, 'NotiFi: Non-Invasive Abnormal Activity Detection Using Fine-grained WiFi Signals', Proceedings of the International Joint Conference on Neural Networks (IJCNN), International Joint Conference on Neural Networks, IEEE, Anchorage, Alaska, USA, pp. 1766-1773.
We build a ubiquitous abnormal activity detection system, namely NotiFi, for accurately detecting abnormal activities on commercial off-the-shelf (COTS) IEEE 802.11 devices. In contrast to traditional wearable-sensor-based and computer-vision-based systems, which require additional sensors or sufficient lighting in line-of-sight (LoS) scenarios, we proceed directly with abnormal activity characterization and activity modeling at the WiFi signal level based on Channel State Information (CSI). The intuition of NotiFi is that whenever the human body occludes the wireless signal transmitted from the access point to the receiver, the phase and amplitude information of the CSI will change sensitively. By creating multiple hierarchical Dirichlet processes, NotiFi automatically learns the number of human body activity categories for abnormal detection. Experimental results in three typical indoor environments indicate that NotiFi can achieve satisfactory performance in accuracy, robustness and stability.
Zhu, D, Pang, N, Li, G & Liu, S 2016, 'WiseFi: Activity Localization and Recognition on Commodity Off-the-shelf WiFi Devices', IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems, IEEE, Sydney, Australia.
Liu, S, Li, G, Tran, T & Jiang, Y 2015, 'Preference Relation-based Markov Random Fields for Recommender Systems', ACML 2015 : Proceedings of 7th Asian Conference on Machine Learning, Asian Conference on Machine Learning, The Proceedings of Machine Learning Research, Hong Kong, pp. 1-16.
A preference relation-based Top-N recommendation approach, PrefMRF, is proposed to capture both the second-order and the higher-order interactions among users and items. Traditionally, Top-N recommendation was achieved by predicting the item ratings first, and then inferring the item rankings, based on the assumption that explicit feedback such as ratings is available, and the assumption that optimizing the ratings is equivalent to optimizing the item rankings. Nevertheless, neither assumption always holds in real-world applications. The proposed PrefMRF approach drops these assumptions by explicitly exploiting preference relations, a more practical form of user feedback. Compared to related work, the proposed PrefMRF approach has the unique property of modeling both the second-order and the higher-order interactions among users and items. To the best of our knowledge, this is the first time both types of interactions have been captured in preference relation-based methods. Experiment results on public datasets demonstrate that both types of interactions have been properly captured, and significantly improved Top-N recommendation performance has been achieved.
Liu, S, Tran, T, Li, G & Jiang, Y 2014, 'Ordinal Random Fields for Recommender Systems', JMLR: Workshop and Conference Proceedings, Asian Conference on Machine Learning, The Proceedings of Machine Learning Research, Nha Trang City, Vietnam.
Recommender Systems heavily rely on numerical preferences, whereas the importance of ordinal preferences has only been recognised in recent works of Ordinal Matrix Factorisation (OMF). Although the OMF can effectively exploit ordinal properties, it captures only the higher-order interactions among users and items, without considering the localised interactions properly. This paper employs Markov Random Fields (MRF) to investigate the localised interactions, and proposes a unified model called Ordinal Random Fields (ORF) to take advantage of both the representational power of the MRF and the ease of modelling ordinal preferences by the OMF. Experimental results on public datasets demonstrate that the proposed ORF model can capture both types of interactions, resulting in improved
Moonsamy, V, Rong, J, Liu, S, Li, G & Batten, L 2013, 'Contrasting Permission Patterns between Clean and Malicious Android Applications', Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, International Conference on Security and Privacy for Communication Networks, Springer, Sydney, Australia, pp. 69-85.
The Android platform uses a permission system model to allow users and developers to regulate access to private information and system resources required by applications. Permissions have proved to be useful for inferring behaviors and characteristics of an application. In this paper, a novel method to extract contrasting permission patterns for clean and malicious applications is proposed. Contrary to existing work, both required and used permissions were considered when discovering the patterns. We evaluated our methodology on a clean and a malware dataset, each comprising 1227 applications. Our empirical results suggest that our permission patterns can capture key differences between clean and malicious applications, which can assist in characterizing these two types of applications.
Vu, H, Liu, S, Li, Z & Li, G 2011, 'Microphone Identification using One Class-Classification Approach', The 2nd Workshop on Applications and Techniques in Information Security, International Conference on Applications and Techniques in Information Security (ATIS), Melbourne, Australia, pp. 30-37.
Rapid growth of technical developments has created huge challenges for microphone forensics - a sub-category of audio forensic science, because of the availability of numerous digital recording devices and massive amounts of recording data. Demand for fast and efficient methods to assure the integrity and authenticity of information is becoming more and more important in criminal investigation. Machine learning has emerged as an important technique to support the audio analysis processes of microphone forensic practitioners. However, its application to real-life situations using supervised learning still faces great challenges due to the expense of collecting data and updating the system. In this paper, we introduce a new machine learning approach called One-class Classification (OCC) to be applied to microphone forensics; we demonstrate its capability on a corpus of audio samples collected from several microphones. Research results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis.