Machine Bias, Open Set Recognition, Data Mining, Tensor Factorization, and Deep Learning.
Do, Q, Verma, S, Chen, F & Liu, W 2019, 'Multiple Knowledge Transfer for Cross-Domain Recommendation', PRICAI 2019: Trends in Artificial Intelligence, Pacific Rim International Conference on Artificial Intelligence, Springer, Cuvu, Yanuca Island, Fiji, pp. 529-542.
© 2019, Springer Nature Switzerland AG. Collaborative filtering based recommendation systems rely on underlying similarities among users and items across multiple datasets and hence require a sufficiently large amount of ratings data to achieve accurate and reliable results. However, newly established businesses do not have sufficient ratings data, so this requirement is rarely met. In this research, we propose Multiple Latent Clusters (MultLC) transfer to exploit the correlations among multiple datasets that do not necessarily have an identical dimension of information. In particular, we transfer different aspects of knowledge across different data sources; while transferring each aspect from a source to the target, we soft-transfer only the common latent clusters and preserve the unique (domain-specific) latent clusters of the target. By soft-transfer, we mean that we minimize the difference among the shared clusters without making them identical. Comprehensive experiments on real-world datasets demonstrate the effectiveness of our proposed MultLC over other widely utilized cross-domain recommendation algorithms. The performance improvements demonstrate the benefits of transferring knowledge from multiple sources while preserving the unique information of the target domain for cross-domain recommendations.
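The soft-transfer idea in this abstract can be illustrated with a small penalty term. This is only a minimal sketch under stated assumptions, not the MultLC objective itself: the function name `soft_transfer_penalty`, the `weight` parameter, and the convention that the first `n_shared` columns of each factor matrix hold the shared latent clusters are all hypothetical.

```python
def soft_transfer_penalty(source, target, n_shared, weight=1.0):
    """Sum of squared differences over the shared latent clusters only.

    `source` and `target` are lists of rows (hypothetical cluster-level
    factor matrices). Only the first `n_shared` columns are compared, so
    any further target columns (domain-specific clusters) are never
    penalized. A small penalty pulls shared clusters together without
    forcing them to be identical.
    """
    total = 0.0
    for s_row, t_row in zip(source, target):
        for k in range(n_shared):
            total += (s_row[k] - t_row[k]) ** 2
    return weight * total


# Source has two clusters; target has the same two plus one unique cluster.
source = [[1.0, 0.0], [0.0, 1.0]]
target = [[0.9, 0.1, 5.0], [0.0, 1.0, -3.0]]

penalty = soft_transfer_penalty(source, target, n_shared=2)

# Changing only the unique (third) target column leaves the penalty unchanged.
target_changed = [[0.9, 0.1, 42.0], [0.0, 1.0, 7.0]]
penalty_changed = soft_transfer_penalty(source, target_changed, n_shared=2)
```

In a full model, a term like this would be added to the target's own reconstruction loss, so the shared clusters are encouraged (but not required) to agree with the source.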
Verma, S, Wang, C, Zhu, L & Liu, W 2019, 'A Compliance Checking Framework for DNN Models', Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence Organization, China, pp. 6470-6471.
Growing awareness of the ethical use of machine learning (ML) models has created a surge in the development of fair models. Existing work in this regard assumes the presence of sensitive attributes in the data and hence can build classifiers whose decisions remain agnostic to such attributes. However, in real-world settings, the end user of the ML model is unaware of the training data; besides, building custom models is not always feasible. Moreover, a pre-trained model with high accuracy on a certain dataset cannot be assumed to be fair. Unknown biases in the training data are the true culprit for unfair models (i.e., disparate performance for groups in the dataset). In this preliminary research, we propose a different lens for building fair models by providing the user with tools to discover blind spots and biases in a pre-trained model and to augment it with corrective measures.
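The notion of "disparate performance for groups in the dataset" can be made concrete with a per-group accuracy check on a pre-trained model's predictions. The sketch below is illustrative only and is not the framework proposed in the paper; the function names, the toy labels, and the group identifiers "a" and "b" are assumptions.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by group membership."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    return max(per_group.values()) - min(per_group.values())


# Toy predictions from a hypothetical pre-trained classifier.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here the overall accuracy hides the fact that every error falls on one group, which is the kind of blind spot a user without access to the training data would want surfaced.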
Verma, S, Wang, C, Zhu, L & Liu, W 2019, 'DeepCU: Integrating both Common and Unique Latent Information for Multimodal Sentiment Analysis', Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence Organization, China, pp. 3627-3634.
Verma, S, Wang, C, Zhu, L & Liu, W 2019, 'Towards Effective Data Augmentations via Unbiased GAN Utilization', PRICAI 2019: Trends in Artificial Intelligence (LNAI), Pacific Rim International Conference on Artificial Intelligence, Springer, Cuvu, Yanuca Island, Fiji, pp. 555-567.
© 2019, Springer Nature Switzerland AG. The parameters of any machine learning (ML) model are obtained from the dataset on which the model is trained. However, existing research reveals that many datasets appear to have strong built-in biases. These biases are inherently learned by the ML model's learning mechanism, which adversely affects its generalization performance. In this research, we propose a new supervised data augmentation mechanism, which we call Data Augmentation Pursuit (DAP). DAP generates labelled synthetic data instances for augmenting the raw datasets. To demonstrate the effectiveness of DAP in reducing model bias, we perform comprehensive experiments on a real-world image dataset. CNN models trained on the augmented dataset obtained using DAP achieve significantly better classification performance and exhibit a reduction in the bias learned by their learning mechanism.
Zhang, X, Zhang, X, Verma, S, Liu, Y, Blumenstein, M & Li, J 2019, 'Detection of Anomalous Traffic Patterns and Insight Analysis from Bus Trajectory Data', PRICAI 2019: Trends in Artificial Intelligence, The 16th Pacific Rim International Conference on Artificial Intelligence, Cuvu, Fiji.
Verma, S, Liu, W, Wang, C & Zhu, L 2018, 'Hybrid networks: Improving deep learning networks via integrating two views of images', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), International Conference on Neural Information Processing, Springer Link, Siem Reap, Cambodia, pp. 46-58.
© 2018, Springer Nature Switzerland AG. The principal component analysis network (PCANet) is an unsupervised parsimonious deep network that uses principal components as filters in its layers. It creates an amalgamated view of the data by transforming it into column vectors, which destroys its spatial structure while obtaining the principal components. In this research, we first propose a tensor-factorization based method referred to as Tensor Factorization Networks (TFNet). The TFNet retains the spatial structure of the data by preserving its individual modes. This presentation provides a minutiae view of the data while extracting matrix factors. However, the above methods are restricted to extracting a single representation and thus incur information loss. To alleviate this information loss, we propose the Hybrid Network (HybridNet), which simultaneously learns filters from both views of the data. Comprehensive results on multiple benchmark datasets validate the superiority of integrating both views of the data in our proposed HybridNet.
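The contrast between the two views can be shown on toy data. The sketch below only illustrates how patches are arranged before any filters are learned; it does not implement PCANet, TFNet, or HybridNet, and the helper names and the 2 x 3 patches are assumptions. Vectorizing turns each patch into a single column of values, while mode-wise unfolding keeps the row mode and the column mode as separate matrices.

```python
def vectorize(patch):
    """Flatten an h x w patch into one length h*w vector (row-major)."""
    return [value for row in patch for value in row]


def unfold_rows(patches):
    """Row-mode unfolding: h rows, each joining that row across all patches."""
    height = len(patches[0])
    return [[value for patch in patches for value in patch[i]]
            for i in range(height)]


def unfold_cols(patches):
    """Column-mode unfolding: w rows, each joining that column across all patches."""
    width = len(patches[0][0])
    return [[row[j] for patch in patches for row in patch]
            for j in range(width)]


# Two toy 2 x 3 image patches.
patches = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

vectorized = [vectorize(p) for p in patches]  # amalgamated view: 2 vectors of length 6
row_mode = unfold_rows(patches)               # 2 x 6 matrix, one row per patch row
col_mode = unfold_cols(patches)               # 3 x 4 matrix, one row per patch column
```

In the vectorized view, which entries were vertical neighbours is no longer visible from the layout, whereas the two unfoldings each keep one spatial mode explicit; factors could then be extracted per mode rather than from the flattened vectors.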
Verma, S, Liu, W, Wang, C & Zhu, L 2017, 'Extracting highly effective features for supervised learning via simultaneous tensor factorization', Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI, San Francisco, USA, pp. 4995-4996.
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Real-world data is usually generated over multiple time periods associated with multiple labels, which can be represented as multiple labeled tensor sequences. These sequences are linked together, sharing some common features while exhibiting their own unique features. Conventional tensor factorization techniques are limited to extracting either common or unique features, but not both simultaneously. However, both types of features are important in many machine learning systems, as they inherently affect the systems' performance. In this paper, we propose a novel supervised tensor factorization technique that simultaneously extracts ordered common and unique features. Classification results using features extracted by our method on the CIFAR-10 database achieve significantly better performance than other factorization methods, illustrating the effectiveness of the proposed technique.