I am a Postdoctoral Research Associate at the Australian Artificial Intelligence Institute, University of Technology Sydney (UTS). I completed my Ph.D. in Computer Science at UTS, Australia, in March 2020. Before my doctorate, I received a B.E. degree in applied mathematics from Northwestern Polytechnical University, Xi’an, China, in 2014. I serve as a Program Committee member for NeurIPS, ICML, ICLR, AISTATS, ACML, AAAI and IJCAI.
Can supervise: YES
Robust Rank Aggregation
Robust Bayesian Inference
Deep Generative Model
Li, J, Pan, Y, Sui, Y & Tsang, IW 2020, 'Secure Metric Learning via Differential Pairwise Privacy', IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3640-3652.
Pan, Y, Tsang, IW, Singh, AK, Lin, C-T & Sugiyama, M 2020, 'Stochastic Multichannel Ranking with Brain Dynamics Preferences', Neural Computation, vol. 32, no. 8, pp. 1499-1530.
Han, B, Yao, Q, Pan, Y, Tsang, IW, Xiao, X, Yang, Q & Sugiyama, M 2019, 'Millionaire: a hint-guided approach for crowdsourcing', Machine Learning, vol. 108, pp. 831-858.
© 2018, The Author(s). Modern machine learning is migrating to the era of complex models, which requires a plethora of well-annotated data. While crowdsourcing is a promising tool to achieve this goal, existing crowdsourcing approaches rarely acquire a sufficient amount of high-quality labels. In this paper, motivated by the "Guess-with-Hints" answer strategy from the Millionaire game show, we introduce the hint-guided approach into crowdsourcing to deal with this challenge. Our approach encourages workers to get help from hints when they are unsure of questions. Specifically, we propose a hybrid-stage setting, consisting of the main stage and the hint stage. When workers face an uncertain question on the main stage, they are allowed to enter the hint stage and look up hints before answering. A unique payment mechanism that meets two important design principles for crowdsourcing is developed. In addition, the proposed mechanism encourages high-quality workers to use hints less, which helps identify them and assign them larger possible payments. Experiments performed on Amazon Mechanical Turk show that our approach ensures a sufficient number of high-quality labels with low expenditure and detects high-quality workers.
© 2017, The Author(s). The aggregation of k-ary preferences is an emerging ranking problem, which plays an important role in several aspects of our daily life, such as ordinal peer grading and online product recommendation. At the same time, crowdsourcing has become a popular way to provide a plethora of k-ary preferences for this ranking problem, due to convenient platforms and low costs. However, k-ary preferences from crowdsourced workers are often noisy, which inevitably degrades the performance of traditional aggregation models. To address this challenge, in this paper, we present a RObust PlAckett–Luce (ROPAL) model. Specifically, to ensure robustness, ROPAL integrates the Plackett–Luce model with a denoising vector. Based on the Kendall-tau distance, this vector corrects k-ary crowdsourced preferences with a certain probability. In addition, we propose an online Bayesian inference to make ROPAL scalable to large-scale preferences. We conduct comprehensive experiments on simulated and real-world datasets. Empirical results on "massive synthetic" and "real-world" datasets show that ROPAL with online Bayesian inference achieves substantial improvements in robustness and noisy worker detection over current approaches.
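The two building blocks named in this abstract, the Plackett–Luce model and the Kendall-tau distance, can be illustrated with a minimal Python sketch. This is not the ROPAL model: the denoising vector and the online Bayesian inference are omitted, and the function names, the dictionary of positive item scores, and the toy items are assumptions made only for illustration.

```python
import math
from itertools import combinations


def plackett_luce_log_prob(ranking, scores):
    """Log-probability of `ranking` (items listed best to worst) under a
    standard Plackett-Luce model with positive item scores.

    `scores` is a hypothetical dict mapping each item to its score.
    """
    weights = [scores[item] for item in ranking]
    log_prob = 0.0
    for pos in range(len(weights)):
        # The item at `pos` is chosen among the items not yet ranked,
        # with probability proportional to its score.
        log_prob += math.log(weights[pos]) - math.log(sum(weights[pos:]))
    return log_prob


def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings order differently.

    Both rankings must contain the same items.
    """
    pos_b = {item: idx for idx, item in enumerate(rank_b)}
    # combinations() yields (x, y) with x ranked before y in rank_a;
    # the pair is discordant when rank_b places y before x.
    return sum(1 for x, y in combinations(rank_a, 2) if pos_b[x] > pos_b[y])


# Toy usage with made-up scores for three items.
toy_scores = {"a": 3.0, "b": 2.0, "c": 1.0}
print(math.exp(plackett_luce_log_prob(["a", "b", "c"], toy_scores)))
print(kendall_tau_distance(["a", "b", "c"], ["c", "b", "a"]))
```

In this sketch, a noisy 3-ary preference such as c > b > a sits at the maximum Kendall-tau distance from the ordering a > b > c implied by the scores, which is the kind of disagreement a robust aggregation model would need to account for.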
© 2018, The Author(s). The aggregation of k-ary preferences is a novel ranking problem that plays an important role in several aspects of daily life, such as ordinal peer grading, online image-rating, meta-search and online product recommendation. Meanwhile, crowdsourcing is increasingly emerging as a way to provide a plethora of k-ary preferences for these types of ranking problems, due to the convenience of the platforms and the lower costs. However, preferences from crowd workers are often noisy, which inevitably degrades the reliability of conventional aggregation models. In addition, traditional inference methods usually incur massive computational costs, which limits the scalability of aggregation models. To address both of these challenges, we propose a reliable CrowdsOUrced Plackett–LucE (COUPLE) model combined with an efficient Bayesian learning technique. To ensure reliability, we introduce an uncertainty vector for each crowd worker in COUPLE, which recovers the ground truth of the noisy preferences with a certain probability. Furthermore, we propose an Online Generalized Bayesian Moment Matching (OnlineGBMM) algorithm, which ensures that COUPLE is scalable to large-scale datasets. Comprehensive experiments on four large-scale synthetic datasets and three real-world datasets show that COUPLE with OnlineGBMM achieves substantial improvements in reliability and noisy worker detection over other well-known approaches.
Shi, Y, Xu, D, Pan, Y, Tsang, IW & Pan, S 2019, 'Label Embedding with Partial Heterogeneous Contexts', 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Honolulu, HI, pp. 4926-4933.