Dr Wei Bian is a Lecturer with the Centre for Artificial Intelligence (CAI) and the Faculty of Engineering and Information Technology at the University of Technology Sydney (UTS). From 2016 to 2018 he also held a Chief Algorithm Engineer position at DiDi Chuxing Technology Co., the world's largest company for ride-sharing service and transport transformation.
Dr Bian's main research interests focus on computer vision and machine learning. He has worked on learning succinct representations for vision problems including image/video classification, facial, posture and activity recognition, and on machine learning problems from large-scale high-dimensional data analysis, probabilistic graphical models, to multi-label/multi-view data classification.
Dr Bian has published more than 40 papers in top-tier journals and conferences in his field, including IEEE TPAMI, IEEE TIP, IEEE TNN, NIPS, AISTATS, CVPR, IJCAI, AAAI, ICDM, and KDD. He has served as a program committee member for IJCAI, AAAI, AISTATS and ICME.
Can supervise: YES
Qiao, M, Yu, J, Bian, W, Li, Q & Tao, D 2019, 'Adapting Stochastic Block Models to Power-Law Degree Distributions', IEEE Transactions on Cybernetics, vol. 49, no. 2, pp. 626-637.
Stochastic block models (SBMs) have been playing an important role in modeling clusters or community structures of network data. However, they are incapable of handling several complex features ubiquitously exhibited in real-world networks, one of which is the power-law degree characteristic. To this end, we propose a new variant of SBM, termed power-law degree SBM (PLD-SBM), by introducing degree decay variables to explicitly encode the varying degree distribution over all nodes. With an exponential prior, it is proved that PLD-SBM approximately preserves the scale-free feature in real networks. In addition, the inference of the variational E-step shows that PLD-SBM corrects the bias inherent in SBM through the introduced degree decay factors. Furthermore, experiments conducted on both synthetic networks and two real-world datasets including Adolescent Health Data and the political blogs network verify the effectiveness of the proposed model in terms of cluster prediction accuracies.
Li, Q, Xie, B, You, J, Bian, W & Tao, D 2016, 'Correlated Logistic Model With Elastic Net Regularization for Multilabel Image Classification', IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3801-3813.
Qiao, M, Xu, RYD, Bian, W & Tao, D 2016, 'Fast Sampling for Time-Varying Determinantal Point Processes', ACM Transactions on Knowledge Discovery from Data, vol. 11, no. 1.
Zeng, X, Bian, W, Liu, W, Shen, J & Tao, D 2015, 'Dictionary Pair Learning on Grassmann Manifolds for Image Denoising', IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 4556-4569.
Qiao, M, Bian, W, Xu, RYD & Tao, D 2015, 'Diversified Hidden Markov Models for Sequential Labeling', IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 11, pp. 2947-2960.
Bian, W & Tao, D 2014, 'Asymptotic Generalization Bound of Fisher's Linear Discriminant Analysis', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 12, pp. 2325-2337.
Bian, W, Zhou, T, Martinez, AM, Baciu, G & Tao, D 2014, 'Minimizing Nearest Neighbor Classification Error for Nonparametric Dimension Reduction', IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 8, pp. 1588-1594.
In this brief, we show that minimizing nearest neighbor classification error (MNNE) is a favorable criterion for supervised linear dimension reduction (SLDR). We prove that MNNE is better than maximizing mutual information in the sense of being a proxy of the Bayes optimal criterion. Based on kernel density estimation, we derive a nonparametric algorithm for MNNE. Experiments on benchmark data sets show the superiority of MNNE over existing nonparametric SLDR methods.
Posture segmentation plays an essential role in human motion analysis. The state-of-the-art method extracts sufficiently high-dimensional features from 3D depth images for each 3D point and learns an efficient body part classifier. However, high-dimensional features are memory-consuming and difficult to handle on large-scale training datasets. In this paper, we propose an efficient two-stage dimension reduction scheme, termed biview learning, to encode two independent views, namely depth-difference features (DDF) and relative position features (RPF). Biview learning explores the complementary property of DDF and RPF, and uses two stages to learn a compact yet comprehensive low-dimensional feature space for posture segmentation. In the first stage, discriminative locality alignment (DLA) is applied to the high-dimensional DDF to learn a discriminative low-dimensional representation. In the second stage, canonical correlation analysis (CCA) is used to explore the complementary property of RPF and the dimensionality-reduced DDF. Finally, we train a support vector machine (SVM) over the output of CCA. We carefully validate the effectiveness of DLA and CCA utilized in the two-stage scheme on our 3D human point cloud dataset. Experimental results show that the proposed biview learning scheme significantly outperforms the state-of-the-art method for human posture segmentation.
Cheng, J, Bian, W & Tao, D 2013, 'Locally regularized sliced inverse regression based 3D hand gesture recognition on a dance robot', Information Sciences, vol. 221, pp. 274-283.
Gesture recognition plays an important role in human machine interactions (HMIs) for multimedia entertainment. In this paper, we present a dimension reduction based approach for dynamic real-time hand gesture recognition. The hand gestures are recorded as acceleration signals by using a handheld device with a 3-axis accelerometer sensor installed, and represented by discrete cosine transform (DCT) coefficients. To recognize different hand gestures, we develop a new dimension reduction method, locally regularized sliced inverse regression (LR-SIR), to find an effective low dimensional subspace in which different hand gestures are well separable, following which recognition can be performed by using simple and efficient classifiers, e.g., nearest mean, k-nearest-neighbor rule and support vector machine. LR-SIR is built upon the well-known sliced inverse regression (SIR), but overcomes its limitation that it ignores the local geometry of the data distribution. Besides, LR-SIR can be effectively and efficiently solved by eigen-decomposition. Finally, we apply the LR-SIR based gesture recognition to control our recently developed dance robot for multimedia entertainment. Thorough empirical studies on digits-gesture recognition suggest the effectiveness of the new gesture recognition scheme for HMI.
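The DCT representation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an unnormalized DCT-II and a simple "keep the first few coefficients" rule; the trace values, the function names and the number of coefficients kept are hypothetical and are not taken from the paper.

```python
import math


def dct_ii(signal):
    """Unnormalized DCT-II of a 1D sequence (assumed variant, for illustration)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, num_coeffs):
    """Keep the first num_coeffs coefficients as a fixed-length descriptor."""
    return dct_ii(signal)[:num_coeffs]


# Hypothetical single-axis acceleration trace (made-up values).
trace = [0.0, 0.4, 0.9, 1.0, 0.6, 0.1, -0.3, -0.5]
features = dct_features(trace, num_coeffs=4)
```

Truncating to the leading coefficients gives gestures of different durations a descriptor of the same length, which is one plausible reason to use DCT coefficients before dimension reduction.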
Tang, J, Bian, W, Yu, N & Zhang, Y 2013, 'Intelligent processing techniques for semantic-based image and video retrieval', Neurocomputing, vol. 119, no. 1, pp. 1-2.
Rapid advances in technology for capturing, processing, distributing, storing, and presenting visual data have resulted in a proliferation of image and video data in human lives. The multimedia research community has widely recognized the importance of searching images or videos from a large-scale corpus or the Internet. Intelligent processing techniques are the most useful tools to achieve this objective, and recent years have witnessed very significant contributions of intelligent algorithms in multimedia search. Intelligent information processing techniques, such as machine learning, knowledge discovery and data mining, computer vision and cognitive computation, natural language processing, and intelligent human-machine interfaces, have great influence on extracting the semantic information from image and video data. These techniques point the way to making semantic-based image/video retrieval a reality. At the least, they provide a reasonable direction for approaching semantic retrieval. The goals of this special issue are three-fold: (1) introduce novel research in intelligent processing techniques for semantic-based image and video retrieval; (2) survey the progress of this area in the past years; and (3) discuss new applications based on semantic-based image and video retrieval models.
Wang, X, Bian, W & Tao, D 2013, 'Grassmannian regularized structured multi-view embedding for image classification', IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2646-2660.
Images are usually represented by features from multiple views, e.g., color and texture. In image classification, the goal is to fuse all the multi-view features in a reasonable manner and achieve satisfactory classification performance. However, the fea
The capability of inferring colours from the texture (grayscale contents) of an image is useful in many application areas, when the imaging device/environment is limited. Traditional manual or limited automatic colour assignment involves intensive human
Bian, W & Tao, D 2012, 'Constrained Empirical Risk Minimization Framework For Distance Metric Learning', IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 8, pp. 1194-1205.
Distance metric learning (DML) has received increasing attention in recent years. In this paper, we propose a constrained empirical risk minimization framework for DML. This framework enriches the state-of-the-art studies on both theoretic and algorithmi
Bian, W, Tao, D & Rui, Y 2012, 'Cross-Domain Human Action Recognition', IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics, vol. 42, no. 2, pp. 298-307.
Conventional human action recognition algorithms cannot work well when the amount of training videos is insufficient. We solve this problem by proposing a transfer topic model (TTM), which utilizes information extracted from videos in the auxiliary domai
Cheng, J, Xie, C, Bian, W & Tao, D 2012, 'Feature fusion for 3D hand gesture recognition by learning a shared hidden space', Pattern Recognition Letters, vol. 33, no. 4, pp. 476-484.
Hand gesture recognition has been intensively applied in various human-computer interaction (HCI) systems. Different hand gesture recognition methods were developed based on particular features, e.g., gesture trajectories and acceleration signals. However, it has been noticed that the limitations of either feature can lead to flaws in an HCI system. In this paper, to overcome the limitations but combine the merits of both features, we propose a novel feature fusion approach for 3D hand gesture recognition. In our approach, gesture trajectories are represented by the intersection numbers with randomly generated line segments on their 2D principal planes, and acceleration signals are represented by the coefficients of the discrete cosine transform (DCT). Then, a hidden space shared by the two features is learned by using penalized maximum likelihood estimation (MLE). An iterative algorithm, composed of two steps per iteration, is derived for this penalized MLE, in which the first step is to solve a standard least squares problem and the second step is to solve a Sylvester equation. We tested our hand gesture recognition approach on different hand gesture sets. Results confirm the effectiveness of the feature fusion method.
Query difficulty estimation predicts the performance of the search result of the given query. It is a powerful tool for multimedia retrieval and receives increasing attention. It can guide the pseudo relevance feedback to rerank the image search results
Yu, JX, Bian, W, Song, M, Cheng, JL & Tao, D 2012, 'Graph Based Transductive Learning For Cartoon Correspondence Construction', Neurocomputing, vol. 79, pp. 105-114.
Correspondence construction of characters in key frames is the prerequisite for cartoon animations' automatic inbetweening and coloring. Since each frame of an animation consists of multiple layers, characters are complicated in terms of shape and struct
Zhang, C, Bian, W, Tao, D & Weisi, L 2012, 'Discretized-Vapnik-Chervonenkis Dimension For Analyzing Complexity Of Real Function Classes', IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 9, pp. 1461-1472.
In this paper, we introduce the discretized-Vapnik-Chervonenkis (VC) dimension for studying the complexity of a real function class, and then analyze properties of real function classes and neural networks. We first prove that a countable traversal set i
In this paper, we propose an approach termed segment-based features (SBFs) to classify time series. The approach is inspired by the success of the component- or part-based methods of object recognition in computer vision, in which a visual object is described as a number of characteristic parts and the relations among the parts. Utilizing this idea in the problem of time series classification, a time series is represented as a set of segments and the corresponding temporal relations. First, a number of interest segments are extracted by interest point detection with automatic scale selection. Then, a number of feature prototypes are collected by random sampling from the segment set, where each feature prototype may include single segment or multiple ordered segments. Subsequently, each time series is transformed to a standard feature vector, i.e. SBF, where each entry in the SBF is calculated as the maximum response (maximum similarity) of the corresponding feature prototype to the segment set of the time series.
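The final step of the abstract above, where each SBF entry is the maximum response of a prototype over a series' segment set, can be sketched as follows. The similarity function (exponential of negative Euclidean distance), the equal-length single-segment prototypes and all values are assumptions for illustration; prototypes made of multiple ordered segments are not covered.

```python
import math


def similarity(a, b):
    """Assumed similarity: exp(-Euclidean distance) between equal-length segments."""
    return math.exp(-math.dist(a, b))


def sbf_vector(segments, prototypes):
    """Each entry is the maximum response of one prototype over the segment set."""
    return [max(similarity(p, s) for s in segments) for p in prototypes]


# Hypothetical segments extracted from one time series, and two prototypes.
segments = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, -1.0, 0.0]]
prototypes = [[0.0, 1.0, 0.0], [2.0, 2.0, 2.0]]
vec = sbf_vector(segments, prototypes)
```

Because the vector length equals the number of prototypes, series with different numbers of segments map to feature vectors of the same size, which is what lets a standard classifier be applied afterwards.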
Bian, W & Tao, D 2011, 'Max-Min Distance Analysis By Using Sequential SDP Relaxation For Dimension Reduction', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 1037-1050.
We propose a new criterion for discriminative dimension reduction, max-min distance analysis (MMDA). Given a data set with C classes, represented by homoscedastic Gaussians, MMDA maximizes the minimum pairwise distance of these C classes in the selected low-dimensional subspace. Thus, unlike Fisher's linear discriminant analysis (FLDA) and other popular discriminative dimension reduction criteria, MMDA duly considers the separation of all class pairs. To deal with the general case of data distribution, we also extend MMDA to kernel MMDA (KMMDA). Dimension reduction via MMDA/KMMDA leads to a nonsmooth max-min optimization problem with orthonormal constraints. We develop a sequential convex relaxation algorithm to solve it approximately. To evaluate the effectiveness of the proposed criterion and the corresponding algorithm, we conduct classification and data visualization experiments on both synthetic data and real data sets. Experimental results demonstrate the effectiveness of MMDA/KMMDA associated with the proposed optimization algorithm.
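As a rough illustration of the max-min criterion described above, the sketch below only evaluates the objective for a given projection; it does not implement the sequential convex relaxation. It assumes the distance between classes is the Euclidean distance between projected class means (as if the data were already whitened), and the class means and candidate directions are made up.

```python
import math
from itertools import combinations


def project(w_columns, x):
    """Project x onto each column of W (W is given as a list of columns)."""
    return [sum(wi * xi for wi, xi in zip(col, x)) for col in w_columns]


def min_pairwise_distance(w_columns, class_means):
    """Objective value: smallest distance between any two projected class means."""
    projected = [project(w_columns, m) for m in class_means]
    return min(math.dist(a, b) for a, b in combinations(projected, 2))


# Hypothetical class means in 2D and two candidate 1D projections.
means = [[0.0, 0.0], [10.0, 1.0], [20.0, 0.0]]
along_x = min_pairwise_distance([[1.0, 0.0]], means)
along_y = min_pairwise_distance([[0.0, 1.0]], means)
```

In this toy case the projection onto the first axis keeps every class pair apart, while the projection onto the second axis collapses two classes onto the same point, so a max-min criterion would prefer the first.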
Cheng, JL, Qiao, M, Bian, W & Tao, D 2011, '3D Human Posture Segmentation By Spectral Clustering With Surface Normal Constraint', Signal Processing, vol. 91, no. 9, pp. 2204-2212.
In this paper, we propose a new algorithm for partitioning human posture represented by 3D point clouds sampled from the surface of human body. The algorithm is formed as a constrained extension of the recently developed segmentation method, spectral clu
Bian, W & Tao, D 2010, 'Biased Discriminant Euclidean Embedding for Content-Based Image Retrieval', IEEE Transactions on Image Processing, vol. 19, no. 2, pp. 545-554.
With many potential multimedia applications, content-based image retrieval (CBIR) has recently gained more attention for image management and web search. A wide variety of relevance feedback (RF) algorithms have been developed in recent years to improve
Qiao, M, Yu, J, Bian, W, Li, Q & Tao, D 2017, 'Improving Stochastic block models by incorporating power-law degree characteristic', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence Organization, Melbourne, Australia, pp. 2620-2626.
Stochastic block models (SBMs) provide a statistical way modeling network data, especially in representing clusters or community structures. However, most block models do not consider complex characteristics of networks such as scale-free feature, making them incapable of handling degree variation of vertices, which is ubiquitous in real networks. To address this issue, we introduce degree decay variables into SBM, termed power-law degree SBM (PLD-SBM), to model the varying probability of connections between node pairs. The scale-free feature is approximated by a power-law degree characteristic. Such a property allows PLD-SBM to correct the distortion of degree distribution in SBM, and thus improves the performance of cluster prediction. Experiments on both simulated networks and two real-world networks including the Adolescent Health Data and the political blogs network demonstrate the validity of the motivation of PLD-SBM, and its practical superiority.
Tao, D, Yu, X & Bian, W 2016, 'Scalable completion of nonnegative matrices with the separable structure', 30th AAAI Conference on Artificial Intelligence, AAAI 2016, AAAI Conference on Artificial Intelligence, AAAI Press, Phoenix, Arizona, United States of America, pp. 2279-2285.
Xiong, W, Du, B, Zhang, L, Hu, R, Bian, W, Shen, J & Tao, D 2015, 'R2FP: Rich and robust feature pooling for mining visual data', Proceedings - IEEE International Conference on Data Mining, ICDM, IEEE International Conference on Data Mining, IEEE, Atlantic City, NJ, pp. 469-478.
© 2015 IEEE. The human visual system proves smart in extracting both global and local features. Can we design a similar way for unsupervised feature learning? In this paper, we propose a novel pooling method within an unsupervised feature learning framework, named Rich and Robust Feature Pooling (R2FP), to better explore rich and robust representation from sparse feature maps of the input data. Both local and global pooling strategies are further considered to instantiate such a method and intensively studied. The former selects the most conductive features in the sub-region and summarizes the joint distribution of the selected features, while the latter is utilized to extract multiple resolutions of features and fuse the features with a feature balancing kernel for rich representation. Extensive experiments on several image recognition tasks demonstrate the superiority of the proposed techniques.
Zhang, Q, Zhang, L, Du, B, Zheng, W, Bian, W & Tao, D 2015, 'MMFE: Multitask multiview feature embedding', Proceedings - IEEE International Conference on Data Mining, ICDM, IEEE International Conference on Data Mining, IEEE, Atlantic City, NJ, pp. 1105-1110.
© 2015 IEEE. In the data mining and pattern recognition area, the learned objects are often represented by multiple features from a variety of views. How can we learn an efficient and effective feature embedding for the subsequent learning tasks? In this paper, we address this issue by providing a novel multi-task multiview feature embedding (MMFE) framework. The MMFE algorithm is based on the idea of low-rank approximation, which suggests that the observed multiview feature matrix is approximately represented by the low-dimensional feature embedding multiplied by a projection matrix. In order to fully consider the particular role of each view in the multiview feature embedding, we simultaneously introduce the multitask learning scheme and ensemble manifold regularization into the MMFE algorithm to seek the optimal projection. Since the objective function of MMFE is multi-variable and non-convex, we further provide an iterative optimization procedure to find the available solution. Two real world experiments show that the proposed method outperforms single-task-based as well as state-of-the-art multiview feature embedding methods for the classification problem.
Li, Q, Qiao, M, Bian, W & Tao, D 2016, 'Conditional graphical Lasso for multi-label image classification', Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, Nevada, United States, pp. 2977-2986.
Multi-label image classification aims to predict multiple labels for a single image which contains diverse content. By utilizing label correlations, various techniques have been developed to improve classification performance. However, existing methods either neglect image features when exploiting label correlations or lack the ability to learn image-dependent conditional label structures. In this paper, we develop conditional graphical Lasso (CGL) to handle these challenges. CGL provides a unified Bayesian framework for structure and parameter learning conditioned on image features. We formulate the multi-label prediction as a CGL inference problem, which is solved by a mean field variational approach. Meanwhile, CGL learning is efficient due to a tailored proximal gradient procedure by applying the maximum a posteriori (MAP) methodology. CGL performs competitively for multi-label image classification on benchmark datasets MULAN scene, PASCAL VOC 2007 and PASCAL VOC 2012, compared with the state-of-the-art multi-label classification algorithms.
Li, Q, Bian, W, Xu, Y, You, J & Tao, D 2016, 'Random Mixed Field Model for Mixed-Attribute Data Restoration', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, AAAI, Phoenix, Arizona, USA, pp. 1244-1250.
Noisy and incomplete data restoration is a critical preprocessing step in developing effective learning algorithms, which aims to reduce the effect of noise and missing values in data. By utilizing attribute correlations and/or instance similarities, various techniques have been developed for data denoising and imputation tasks. However, existing data restoration methods are either specifically designed for a particular task, or incapable of dealing with mixed-attribute data. In this paper, we develop a new probabilistic model to provide a general and principled method for restoring mixed-attribute data. The main contributions of this study are twofold: a) a unified generative model, utilizing a generic random mixed field (RMF) prior, is designed to exploit mixed-attribute correlations; and b) a structured mean-field variational approach is proposed to solve the challenging inference problem of simultaneous denoising and imputation. We evaluate our method by classification experiments on both synthetic data and real benchmark datasets. Experiments demonstrate that our approach can effectively improve the classification accuracy of noisy and incomplete data compared with other data restoration methods.
Qiao, M, Bian, W, Da Xu, RY & Tao, D 2016, 'Diversified Hidden Markov Models for Sequential Labeling', 2016 32nd IEEE International Conference on Data Engineering (ICDE), 32nd IEEE International Conference on Data Engineering (ICDE), IEEE, Helsinki, Finland, pp. 1512-+.
Zhou, T, Bian, W & Tao, D 2013, 'Divide-and-Conquer Anchoring for Near-Separable Nonnegative Matrix Factorization and Completion in High Dimensions', IEEE 13th International Conference on Data Mining, IEEE International Conference on Data Mining, IEEE, Dallas, TX, USA, pp. 917-926.
Nonnegative matrix factorization (NMF) becomes tractable in polynomial time with a unique solution under the separability assumption, which postulates that all the data points are contained in the conical hull of a few anchor data points. Recently developed linear programming and greedy pursuit methods can pick out the anchors from noisy data, resulting in a near-separable NMF. But their efficiency could be seriously weakened in high dimensions. In this paper, we show that the anchors can be precisely located from the low-dimensional geometry of the data points even when their high-dimensional features suffer from serious incompleteness. Our framework, entitled divide-and-conquer anchoring (DCA), divides the high-dimensional anchoring problem into a few cheaper sub-problems seeking anchors of data projections in low-dimensional random spaces, which can be solved in parallel by any near-separable NMF, and combines all the detected low-dimensional anchors via a fast hypothesis testing to identify the original anchors. We further develop two non-iterative anchoring algorithms in 1D and 2D spaces for data in convex hull and conical hull, respectively. These two rapid algorithms in the ultra low dimensions suffice to generate a robust and efficient near-separable NMF for high-dimensional or incomplete data via DCA. Compared to existing methods, two vital advantages of DCA are its scalability for big data, and its capability of handling incomplete and high-dimensional noisy data. A rigorous analysis proves that DCA is able to find the correct anchors of a rank-k matrix by solving O(k log k) sub-problems. Finally, we show DCA outperforms state-of-the-art methods on various datasets and tasks.
Bian, W, Xie, B & Tao, D 2012, 'CorrLog: Correlated Logistic Models for Joint Prediction of Multiple Labels', Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, International Conference on Artificial Intelligence and Statistics, the MIT Press, La Palma, Canary Islands, pp. 109-117.
In this paper, we present a simple but effective method for multi-label classification (MLC), termed Correlated Logistic Models (Corrlog), which extends multiple Independent Logistic Regressions (ILRs) by modeling the pairwise correlation between labels. Algorithmically, we propose an efficient method for learning parameters of Corrlog, which is based on regularized maximum pseudolikelihood estimation and has a linear computational complexity with respect to the number of labels. Theoretically, we show that Corrlog enjoys a satisfying generalization bound which is independent of the number of labels. The effectiveness of Corrlog on modeling label correlations is illustrated by a toy example, and further experiments on real data show that Corrlog achieves competitive performance compared with popular MLC algorithms.
Bian, W & Tao, D 2011, 'Learning a Distance Metric by Empirical Loss Minimization', Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, AAAI Press/International Joint Conferences on Artificial Intelligence, Barcelona, Catalonia, Spain, pp. 1186-1191.
In this paper, we study the problem of learning a metric and propose a loss function based metric learning framework, in which the metric is estimated by minimizing an empirical risk over a training set. With mild conditions on the instance distribution and the used loss function, we prove that the empirical risk converges to its expected counterpart at rate of root-n. In addition, with the assumption that the best metric that minimizes the expected risk is bounded, we prove that the learned metric is consistent. Two example algorithms are presented by using the proposed loss function based metric learning framework, each of which uses a log loss function and a smoothed hinge loss function, respectively. Experimental results suggest the effectiveness of the proposed algorithms.
Tao, D, Li, Z, Li, J, Katsaggelos, A, Bian, W, Chen, Y, Fan, J, Hu, Y, Izquierdo, E, Ji, S, Jiang, X, Kwok, J, Li, Q, Liu, J, Loog, M, Lu, H, Lu, YL, Maybank, SJ, Pau, D, Ro, YM, Shan, C, Shao, L, Smeraldi, F, Song, Y, Wang, F, Xu, Y, Yang, L, Ye, J, Yu, J, Zhang, D, Zhang, J, Zhao, X, Huang, K, Ying, Y & Zhou, C 2011, 'Preface', Proceedings - IEEE International Conference on Data Mining, ICDM.
Xie, B, Bian, W, Tao, D & Chordia, P 2011, 'Music tagging with regularized logistic regression', Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, International Society for Music Information Retrieval Conference, University of Miami, Miami, Florida, USA, pp. 711-716.
In this paper, we present a set of simple and efficient regularized logistic regression algorithms to predict tags of music. We first vector-quantize the delta MFCC features using k-means and construct a "bag-of-words" representation for each song. We then learn the parameters of these logistic regression algorithms from the "bag-of-words" vectors and ground truth labels in the training set. At test time, the prediction confidence by the linear classifiers can be used to rank the songs for music annotation and retrieval tasks. Thanks to the convex property of the objective functions, we adopt an efficient and scalable generalized gradient method to learn the parameters, with global optimum guaranteed. And we show that these efficient algorithms achieve state-of-the-art performance in annotation and retrieval tasks evaluated on CAL-500. © 2011 International Society for Music Information Retrieval.
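The "bag-of-words" construction described above can be sketched as follows, assuming a codebook has already been learned (for instance by k-means). The codebook, the 2D frame vectors and the normalization to relative frequencies are hypothetical choices for illustration, not details from the paper.

```python
import math


def nearest_codeword(frame, codebook):
    """Index of the codeword closest to the frame in Euclidean distance."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def bag_of_words(frames, codebook):
    """Histogram of codeword assignments, normalized to relative frequencies."""
    hist = [0] * len(codebook)
    for frame in frames:
        hist[nearest_codeword(frame, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical codebook and per-frame feature vectors for one song.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 1.0], [1.1, 0.8], [0.0, 0.2]]
song_vector = bag_of_words(frames, codebook)
```

Each song then becomes one fixed-length vector regardless of its duration, which is the form a per-tag logistic regression classifier expects as input.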
Zhang, L, Bian, W, Song, M, Tao, D & Liu, X 2011, 'Integrating Local Features into Discriminative Graphlets for Scene Classification', Lecture Notes in Computer Science. Neural Information Processing. 18th International Conference, ICONIP 2011, International Conference on Neural Information Processing, Springer-Verlag Berlin / Heidelberg, Shanghai, China, pp. 657-666.View/Download from: Publisher's site
Scene classification plays an important role in multimedia information retrieval. Since local features are robust to image transformation, they have been used extensively for scene classification. However, it is difficult to encode the spatial relations of local features in the classification process. To solve this problem, Geometric Local Features Integration(GLFI) is proposed. By segmenting a scene image into a set of regions, a so-called Region Adjacency Graph(RAG) is constructed to model their spatial relations. To measure the similarity of two RAGs, we select a few discriminative templates and then use them to extract the corresponding discriminative graphlets(connected subgraphs of an RAG). These discriminative graphlets are further integrated by a boosting strategy for scene classification. Experiments on five datasets validate the effectiveness of our GLFI.
Zhang, L, Song, M, Bian, W, Tao, D, Liu, X, Bu, J & Chen, C 2011, 'Feature Relationships Hypergraph for Multimodal Recognition', Lecture Notes in Computer Science. Neural Information Processing. 18th International Conference, ICONIP 2011, International Conference on Neural Information Processing, Springer-Verlag Berlin / Heidelberg, Shanghai, China, pp. 589-598.View/Download from: Publisher's site
Utilizing multimodal features to describe multimedia data is a natural way for accurate pattern recognition. However, how to deal with the complex relationships caused by the tremendous multimodal features and the curse of dimensionality are still two crucial challenges. To solve the two problems, a new multimodal features integration method is proposed. Firstly, a so-called Feature Relationships Hypergraph (FRH) is proposed to model the high-order correlations among the multimodal features. Then, based on FRH, the multimodal features are clustered into a set of low-dimensional partitions. And two types of matrices, the interpartition matrix and intra-partition matrix, are computed to quantify the inter- and intra- partition relationships. Finally, a multi-class boosting strategy is developed to obtain a strong classifier by combining the weak classifiers learned from the intra- partition matrices. The experimental results on different datasets validate the effectiveness of our approach
Li, J, Bian, W, Tao, D & Zhang, C 2011, 'Learning Colours from Textures by Sparse Manifold Embedding', Lecture Notes in Artificial Intelligence. AI 2011: Advances in Artificial Intelligence. 24th Australasian Joint Conference, Australasian Joint Conference on Artificial Intelligence, Springer-Verlag Berlin / Heidelberg, Perth, Australia, pp. 600-608.
The capability of inferring colours from the texture (grayscale content) of an image is useful in many application areas where the imaging device or environment is limited. Traditional colour assignment involves intensive human effort. Automatic methods have been proposed to establish relations between image textures and the corresponding colours. Existing research mainly focuses on linear relations. In this paper, we employ sparse constraints in the model of the texture-colour relationship. The technique is developed on a locally linear model, which assumes that the image data are distributed on a manifold. Given the texture of an image patch, the learned model transfers colours to the texture patch by combining known colours of similar texture patches. The sparse constraint limits the contributing factors in the model and helps improve the stability of the colour transfer. Experiments show that our method gives results superior to those of previous work.
Bian, W, Li, J & Tao, D 2010, 'Feature Extraction For FMRI-based Human Brain Activity Recognition', Machine Learning In Medical Imaging, International Workshop on Machine Learning in Medical Imaging, Springer-Verlag Berlin, Beijing, China, pp. 148-156.
Mitchell et al. demonstrated that support vector machines (SVM) are effective for classifying the cognitive state of a human subject based on fMRI images observed over a single time interval. However, the direct use of classifiers on active voxels veils
Bian, W & Tao, D 2009, 'Dirichlet Mixture Allocation For Multiclass Document Collections Modeling', 2009 9th IEEE International Conference On Data Mining, IEEE International Conference on Data Mining, IEEE, Miami Beach, FL, pp. 711-715.
The topic model Latent Dirichlet Allocation (LDA) is an effective tool for statistical analysis of large collections of documents. In LDA, each document is modeled as a mixture of topics and the topic proportions are generated from the unimodal Dirichlet d
Bian, W & Tao, D 2009, 'Manifold Regularization for SIR with Rate Root-n Convergence', Proceedings of the 2009 Conference ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 22, Annual Conference on Neural Information Processing Systems, Curran Associates, Inc, Vancouver, British Columbia, Canada, pp. 1-9.
In this paper, we study the manifold regularization for the Sliced Inverse Regression (SIR). The manifold regularization improves the standard SIR in two aspects: 1) it encodes the local geometry for SIR and 2) it enables SIR to deal with transductive and semi-supervised learning problems. We prove that the proposed graph Laplacian based regularization is convergent at rate root-n. The projection directions of the regularized SIR are optimized by using a conjugate gradient method on the Grassmann manifold. Experimental results support our theory.
Bian, W & Tao, D 2009, 'Manifold regularization for SIR with rate root-n convergence', Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference, pp. 117-125.
Bian, W, Cheng, JL & Tao, D 2009, 'Biased Isomap Projections For Interactive Reranking', ICME: 2009 IEEE International Conference On Multimedia And Expo, Vols 1-3, IEEE International Conference on Multimedia and Expo, IEEE, New York, NY, pp. 1632-1635.
Image search has recently gained more and more attention for various applications. To capture users' intentions and to bridge the gap between the low-level visual features and the high-level semantics, a dozen interactive reranking (IR) or relevance f
Bian, W & Tao, D 2008, 'Harmonic mean for subspace selection', Proceedings - International Conference on Pattern Recognition.
Under the homoscedastic Gaussian assumption, it has been shown that Fisher's linear discriminant analysis (FLDA) suffers from the class separation problem when the dimensionality of the subspace selected by FLDA is strictly less than the number of classes minus 1, i.e., the projection to a subspace tends to merge close class pairs. A recent result shows that maximizing the geometric mean of Kullback-Leibler (KL) divergences of class pairs can significantly reduce this problem. In this paper, to further reduce the class separation problem, the harmonic mean is applied in place of the geometric mean for subspace selection. The new method is termed maximization of the harmonic mean of all pairs of symmetric KL divergences (MHMD). As MHMD is invariant to rotational transformations, an efficient optimization procedure can be conducted on the Grassmann manifold. Thorough empirical studies demonstrate the effectiveness of the harmonic mean in dealing with the class separation problem.
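To make the contrast between the geometric and harmonic means concrete, here is a minimal Python sketch on a toy problem. It assumes three Gaussian classes with identity covariance, so that the symmetric KL divergence between two classes after projection onto a unit direction (taken here as KL(p||q) + KL(q||p)) is simply the squared distance between the projected means. The class means, the two candidate directions, and all function names are hypothetical choices for illustration; the optimization on the Grassmann manifold used by MHMD is not reproduced.

```python
import math
from itertools import combinations


def projected_divergences(means, direction):
    """Pairwise symmetric KL divergences after projecting onto `direction`.

    Assumes homoscedastic Gaussians with identity covariance, in which case
    KL(p||q) + KL(q||p) equals the squared gap between projected class means.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    unit = [x / norm for x in direction]
    projected = [sum(m_k * u_k for m_k, u_k in zip(mean, unit)) for mean in means]
    return [(a - b) ** 2 for a, b in combinations(projected, 2)]


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical class means in 2-D; classes 0 and 1 nearly coincide along the x-axis.
means = [(0.0, 0.0), (0.5, 2.0), (10.0, 4.0)]

div_x = projected_divergences(means, (1.0, 0.0))  # one class pair almost merged
div_y = projected_divergences(means, (0.0, 1.0))  # all class pairs moderately separated

for name, div in (("x-axis", div_x), ("y-axis", div_y)):
    print(name, div, geometric_mean(div), harmonic_mean(div))
```

In this toy setting the geometric mean scores the x-axis higher even though it nearly merges classes 0 and 1, whereas the harmonic mean, which is dominated by the smallest divergence, scores the y-axis higher.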