Kun received the B.S. and Ph.D. degrees from the School of Information Science and Engineering, Lanzhou University, in 2005 and 2010, respectively. He was a visiting student with Dalhousie University from 2009 to 2010. He is currently with Lanzhou University and is a visiting scholar with the University of Technology Sydney.
His general research interests are in machine learning, neural networks, and optimisation.
Most existing graph-based clustering methods need a predefined graph, and their clustering performance depends heavily on the quality of that graph. To improve multiview clustering performance, a graph learning-based method is proposed to improve the quality of the graph. Initial graphs are learned from the data points of each view and are further optimized with a rank constraint on the Laplacian matrix. These optimized graphs are then integrated into a global graph through a well-designed optimization procedure, which imposes the same rank constraint on the Laplacian matrix of the global graph. Because of the rank constraint, the cluster indicators are obtained directly from the global graph without performing any graph cut technique or k-means clustering. Experiments on several benchmark datasets verify the effectiveness and superiority of the proposed graph learning-based multiview clustering algorithm compared with state-of-the-art methods.
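The property behind the rank constraint is that the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components of the graph: if the Laplacian of an n-node affinity graph has rank n − k, the graph splits into exactly k components, and those components are the clusters. A minimal sketch of reading cluster indicators off such a learned graph (the function name and toy affinity matrix are illustrative, not from the paper):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def clusters_from_graph(S, k, tol=1e-8):
    """Read cluster indicators directly off an affinity graph S.

    When rank(L) == n - k (the rank constraint in the abstract holds),
    the k clusters are exactly the k connected components of S, so no
    graph cut or k-means step is needed.
    """
    S = (S + S.T) / 2.0                        # symmetrize the affinity matrix
    L = np.diag(S.sum(axis=1)) - S             # unnormalized graph Laplacian
    eigvals = np.linalg.eigvalsh(L)
    # multiplicity of the zero eigenvalue = number of connected components
    assert np.sum(eigvals < tol) == k
    _, labels = connected_components(csr_matrix(S > 0))
    return labels

# toy graph with two obvious components: {0,1,2} and {3,4}
S = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
print(clusters_from_graph(S, 2))   # → [0 0 0 1 1]
```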
Zeng, Z., Li, Z., Cheng, D., Zhang, H., Zhan, K. & Yang, Y. 2018, 'Two-Stream Multi-Rate Recurrent Neural Network for Video-Based Pedestrian Re-Identification', IEEE Transactions on Industrial Informatics.
Video-based pedestrian re-identification is a fundamental task in video surveillance and other real-world applications, and has attracted much research attention recently. Its goal is to match pedestrians across multiple non-overlapping network cameras. In this paper, we propose a novel two-stream multi-rate recurrent neural network for video-based pedestrian re-identification, which has two inherent benefits: (1) it captures both static spatial and temporal information; (2) it deals with variance in motion speed. Given video sequences of pedestrians, we start by extracting spatial and motion features using two different deep neural networks. We then combine them using a regularized fusion network, which aims to explore feature correlations. Going a step further, we feed the two features into a multi-rate recurrent network to exploit temporal correlations and, more importantly, to take into account that pedestrians, sometimes even the same pedestrian, move at different speeds across different camera views. Extensive experiments have been conducted on two real-world video-based pedestrian re-identification benchmarks: the iLIDS-VID and PRID 2011 datasets. The experimental results confirm the superiority of the proposed method. Our code will be released upon acceptance.
Zhan, K., Chang, X., Guan, J., Chen, L., Ma, Z. & Yang, Y. 2018, 'Adaptive Structure Discovery for Multimedia Analysis Using Multiple Features', IEEE Transactions on Cybernetics.
Multifeature learning has been a fundamental research problem in multimedia analysis. Most existing multifeature learning methods exploit a graph, which must be computed beforehand, as input to uncover the data distribution. These methods face two major problems. First, graph construction requires calculating similarity between nearby data pairs with a fixed function, e.g., the RBF kernel, but the intrinsic correlation among different data pairs varies constantly. Feature learning based on such predefined graphs may therefore degrade, especially when there is dramatic correlation variation between nearby data pairs. Second, in most existing algorithms, each single-feature graph is computed independently and the graphs are then combined for learning, which ignores the correlation between multiple features. In this paper, a new unsupervised multifeature learning method is proposed to make the best use of the correlation among different features by jointly optimizing data correlation from multiple features in an adaptive way. Instead of computing the affinity weight of data pairs with a fixed function, the weights of the affinity graph are learned through a well-designed optimization problem. Additionally, the affinity graphs of data pairs from different features are optimized at a global level to better leverage the correlation among different channels. In this way, the adaptive approach correlates all the features for a better learning process. Experimental results on real-world datasets demonstrate that our approach outperforms state-of-the-art algorithms in leveraging multiple features for multimedia analysis.
A linking synaptic computation network is proposed. The linking synapse, inspired by gamma-band oscillations in visual cortical neurons, is introduced into the neural network, and the network is applied to image representation. The linking synaptic mechanism allows temporal and spatial information to be integrated. An image is input to the network and the enhanced result is obtained from the final linking synaptic state. The result boosts details while preserving the information in the input image. The effectiveness of the method is borne out by five quantitative metrics as well as qualitative comparisons with other methods.
Zhan, K., Shi, J., Wang, H., Xie, Y. & Li, Q. 2017, 'Computational Mechanisms of Pulse-Coupled Neural Networks: A Comprehensive Review', Archives of Computational Methods in Engineering, vol. 24, no. 3, pp. 573-588.
Zhan, K., Shi, J., Wang, J. & Tian, F. 2017, 'Graph-regularized concept factorization for multi-view document clustering', Journal of Visual Communication and Image Representation, vol. 48, pp. 411-418.
We propose a novel multi-view document clustering method based on graph-regularized concept factorization (MVCF). MVCF makes full use of multi-view features for a more comprehensive understanding of the data and learns the weight for each view adaptively. It also preserves the local geometrical structure of the manifolds for multi-view clustering. We derive an efficient optimization algorithm to solve the objective function of MVCF and prove its convergence using the auxiliary function method. Experiments on three benchmark datasets demonstrate the effectiveness of MVCF in comparison to several state-of-the-art approaches in terms of accuracy, normalized mutual information and purity.
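The single-view building block here, graph-regularized concept factorization, factors the data as X ≈ XWVᵀ while a Laplacian term Tr(VᵀLV) keeps nearby documents' representations close; it is commonly solved with multiplicative updates. A minimal single-view sketch of that recipe (MVCF's adaptive per-view weighting is omitted; the function name, parameter values, and the complete-graph affinity in the example are illustrative assumptions, not the paper's):

```python
import numpy as np

def graph_regularized_cf(X, A, k, lam=0.1, iters=200, seed=0):
    """Single-view graph-regularized concept factorization sketch.

    Minimizes ||X - X W V^T||_F^2 + lam * Tr(V^T L V), where L = D - A is
    the graph Laplacian of affinity A, via standard multiplicative updates
    on the kernel matrix K = X^T X (X is assumed nonnegative).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    K = X.T @ X                                   # kernel matrix over documents
    D = np.diag(A.sum(axis=1))                    # degree matrix of the graph
    W = rng.random((n, k))
    V = rng.random((n, k))
    eps = 1e-10                                   # guard against division by zero
    for _ in range(iters):
        W *= (K @ V) / (K @ W @ V.T @ V + eps)
        V *= (K @ W + lam * A @ V) / (V @ W.T @ K @ W + lam * D @ V + eps)
    return W, V                                   # rows of V are document representations
```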
Zhan, K., Wang, H., Xie, Y., Zhang, C. & Min, Y. 2017, 'Albedo recovery for hyperspectral image classification', Journal of Electronic Imaging, vol. 26, no. 4.
Zhan, K., Wei, D., Shi, J. & Yu, J. 2017, 'Cross-utilizing hyperchaotic and DNA sequences for image encryption', Journal of Electronic Imaging, vol. 26, no. 1.
Hashing is a popular solution for approximate nearest neighbor search because of its low storage cost and fast retrieval speed, and many machine learning algorithms have been adapted to learn effective hash functions. Since hash codes within the same cluster are similar to each other while hash codes in different clusters are dissimilar, we propose an unsupervised discriminative hashing learning method (UDH) to improve discrimination among hash codes in different clusters. UDH shares a similar objective function with the spectral hashing algorithm, and uses a modified graph Laplacian matrix to exploit local discriminant information. In addition, UDH is designed to enable efficient out-of-sample extension. Experiments on real-world image datasets demonstrate the effectiveness of our novel approach for image retrieval.
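The spectral hashing objective that UDH starts from reduces, after relaxation, to thresholding the smallest nontrivial eigenvectors of a graph Laplacian, so that points connected in the graph receive similar binary codes. A minimal sketch of that baseline with a plain RBF-affinity Laplacian (UDH's modified, discriminant-aware Laplacian and its out-of-sample extension are not reproduced here; the function name and sigma are illustrative):

```python
import numpy as np

def spectral_hash_codes(X, n_bits, sigma=1.0):
    """Spectral-hashing-style binary codes for the rows of X.

    Builds an RBF affinity graph, takes the n_bits smallest nontrivial
    eigenvectors of its Laplacian, and binarizes them by sign, so that
    strongly connected points tend to share hash codes.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    A = np.exp(-d2 / (2 * sigma ** 2))                    # RBF affinity matrix
    L = np.diag(A.sum(axis=1)) - A                        # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)                           # eigenvectors, ascending eigenvalues
    Y = vecs[:, 1:n_bits + 1]        # skip the trivial constant eigenvector
    return (Y > 0).astype(np.uint8)  # one row of n_bits per data point
```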
Inspired by gamma-band oscillations and other neurobiological discoveries, neural network research has shifted its emphasis toward temporal coding, which uses the explicit times at which spikes occur as an essential dimension of neural representations. We present a feature-linking model (FLM) that uses the timing of spikes to encode information. The first spiking time of the FLM is applied to image enhancement, and its processing mechanisms are consistent with the human visual system. The enhancement algorithm boosts details while preserving the information of the input image. Experiments demonstrate the effectiveness of the proposed method.
Zhan, K., Wang, H., Huang, H. & Xie, Y. 2016, 'Large margin distribution machine for hyperspectral image classification', Journal of Electronic Imaging, vol. 25, no. 6.
Zhan, K., Zhang, H. & Ma, Y. 2009, 'New spiking cortical model for invariant texture retrieval and image processing', IEEE Transactions on Neural Networks, vol. 20, no. 12, pp. 1980-1986.
Based on studies of existing locally connected neural network models, in this brief we present a new spiking cortical neural network model and find that the time matrix of the model can be interpreted as a human subjective sense of stimulus intensity. The series of output pulse images of the proposed model represents the segment, edge, and texture features of the original image; several efficient measures computed from this series form a sequence that serves as the feature of the original image. We characterize texture images by this sequence for invariant texture retrieval. The experimental results show that the retrieval scheme is effective in extracting rotation- and scale-invariant features. The new model also obtains good results when used in other image processing applications.
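A common formulation of such a spiking cortical model iterates a leaky membrane potential, a dynamic firing threshold, and a binary pulse output; the sequence of pulse images is what measures such as the ones above are computed from. A sketch under that generic formulation (the linking kernel and the parameters f, g, h are illustrative choices, not the paper's values):

```python
import numpy as np
from scipy.signal import convolve2d

def scm_pulse_sequence(S, steps=20, f=0.8, g=0.7, h=20.0):
    """Generic spiking cortical model iteration on a stimulus image S.

    U[n] = f*U[n-1] + S * (W (*) Y[n-1]) + S   # leaky integration + local linking
    E[n] = g*E[n-1] + h*Y[n-1]                 # threshold decays, jumps after a pulse
    Y[n] = 1 where U[n] > E[n]                 # binary pulse image
    """
    W = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])            # local linking weights (illustrative)
    U = np.zeros_like(S, dtype=float)          # membrane potential
    E = np.zeros_like(S, dtype=float)          # dynamic threshold
    Y = np.zeros_like(S, dtype=float)          # pulse output
    pulses = []
    for _ in range(steps):
        U = f * U + S * convolve2d(Y, W, mode='same') + S
        E = g * E + h * Y                      # uses Y[n-1] from the previous step
        Y = (U > E).astype(float)
        pulses.append(Y.copy())
    return pulses  # a texture feature could be, e.g., the sum of each pulse image
```

On the first step U = S and E = 0, so every stimulated neuron fires, while a zero-input neuron never fires; later pulses depend on intensity and neighborhood linking, which is what makes the sequence a texture signature.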