Wei received the Ph.D. degree in computer science from the University of Technology Sydney (UTS), Australia, in 2018; the M.Sc. degree in computer science from Peking University (PKU), China, in 2014; and the B.Eng. degree in computer science from Central South University (CSU), China, in 2011.
Can supervise: YES
Luo, C, Cai, S, Su, K & Wu, W 2015, 'Clause states based configuration checking in local search for satisfiability', IEEE Transactions on Cybernetics, vol. 45, no. 5, pp. 1014-1027.
Two-mode stochastic local search (SLS) and focused random walk (FRW) are the two most influential paradigms of SLS algorithms for the propositional satisfiability (SAT) problem. Recently, an interesting idea called configuration checking (CC) was proposed to handle the cycling problem in SLS. The CC idea has been successfully used to improve SLS algorithms for SAT, resulting in state-of-the-art solvers. Previous CC strategies for SAT are based on neighboring variables, and prove successful in two-mode SLS algorithms. However, this kind of neighboring variables based CC strategy is not suitable for improving FRW algorithms. In this paper, we propose a new CC strategy which is based on clause states. We apply this clause states based CC (CSCC) strategy to both two-mode SLS and FRW paradigms. Our experiments show that the CSCC strategy is effective on both paradigms. Furthermore, our developed FRW algorithms based on CSCC achieve state-of-the-art performance on a broad range of random SAT benchmarks.
Wu, W, Li, B, Chen, L & Zhang, C 2017, 'Consistent Weighted Sampling Made More Practical', Proceedings of the 26th International Conference on World Wide Web, International World Wide Web Conference, ACM DL, Perth, Australia.
Min-Hash, which is widely used for efficiently estimating similarities of bag-of-words represented data, plays an increasingly important role in the era of big data. It has been extended to deal with real-value weighted sets, and Improved Consistent Weighted Sampling (ICWS) is considered the state of the art for this problem. In this paper, we propose a Practical CWS (PCWS) algorithm. We first transform the original form of ICWS into an equivalent expression, based on which we find some interesting properties that inspire us to make the ICWS algorithm simpler and more efficient in both space and time complexity. PCWS is not only mathematically equivalent to ICWS, preserving the same theoretical properties, but also reduces the memory footprint by 20% and saves substantial computational cost compared to ICWS. The experimental results on a number of real-world text data sets demonstrate that PCWS obtains the same (or even better) classification and retrieval performance as ICWS while reducing empirical runtime by 1/5 to 1/3.
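For readers unfamiliar with consistent weighted sampling, the following is a minimal Python sketch of the ICWS sampling step as it is commonly presented in the literature. It is not the PCWS algorithm from the paper above and is not taken from the authors' code. The function names (`icws_sketch`, `estimate_similarity`, `generalized_jaccard`) and the per-element seeding scheme are assumptions made for illustration.

```python
import math
import random


def icws_sketch(weights, num_hashes, seed=0):
    """Return one (element, t) sample per hash function.

    weights: dict mapping element -> positive real weight.
    Illustrative sketch only; names and seeding scheme are assumptions.
    """
    positive = {k: s for k, s in weights.items() if s > 0}
    if not positive:
        raise ValueError("need at least one positive weight")
    sketch = []
    for i in range(num_hashes):
        best = None
        for k, s in positive.items():
            # The random draws depend only on (seed, i, k), never on the
            # weight, so every weighted set sees the same draws per element.
            rng = random.Random(f"{seed}-{i}-{k}")
            r = rng.gammavariate(2.0, 1.0)
            c = rng.gammavariate(2.0, 1.0)
            beta = rng.random()
            t = math.floor(math.log(s) / r + beta)
            y = math.exp(r * (t - beta))      # y <= s by construction
            a = c / (y * math.exp(r))         # the element with minimal a wins
            if best is None or a < best[0]:
                best = (a, k, t)
        sketch.append((best[1], best[2]))
    return sketch


def estimate_similarity(sketch_a, sketch_b):
    """Fraction of hash functions whose (element, t) samples collide."""
    hits = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return hits / len(sketch_a)


def generalized_jaccard(w1, w2):
    """Exact generalized Jaccard: sum of minima over sum of maxima."""
    keys = set(w1) | set(w2)
    num = sum(min(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    den = sum(max(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    return num / den
```

Under these assumptions, the collision rate returned by `estimate_similarity` approximates the generalized Jaccard similarity of the two weighted sets, with accuracy improving as `num_hashes` grows.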
Wu, W, Li, B, Chen, L & Zhang, C 2016, 'Canonical Consistent Weighted Sampling for Real-Value Weighted Min-Hash', Proceedings of the 2016 IEEE 16th International Conference on Data Mining, IEEE International Conference on Data Mining, IEEE, Barcelona, Spain, pp. 1287-1292.
Min-Hash, as a member of the Locality Sensitive Hashing (LSH) family for sketching sets, plays an important role in the big data era. It is widely used for efficiently estimating similarities of bag-of-words represented data and has been extended to deal with multi-sets and real-value weighted sets. Improved Consistent Weighted Sampling (ICWS) has been recognized as the state of the art for real-value weighted Min-Hash. However, the algorithmic implementation of ICWS is flawed because it violates the uniformity of the Min-Hash scheme. In this paper, we propose a Canonical Consistent Weighted Sampling (CCWS) algorithm, which not only retains the same theoretical complexity as ICWS but also strictly complies with the definition of Min-Hash. The experimental results demonstrate that the proposed CCWS algorithm runs faster than the state-of-the-art methods while achieving similar classification performance on a number of real-world text data sets.
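As background for the abstract above, here is a minimal sketch of plain (unweighted) Min-Hash for estimating the Jaccard similarity of two token sets. It illustrates the basic scheme only, not CCWS or ICWS. The function names and the use of salted BLAKE2b digests as the family of hash functions are assumptions chosen for illustration.

```python
import hashlib


def minhash_signature(tokens, num_hashes=256):
    """Return a Min-Hash signature: one minimum digest per hash function.

    tokens: non-empty iterable of strings (a set, in the bag-of-words sense).
    Each hash function is simulated by BLAKE2b with a distinct salt;
    this choice is an assumption, not part of any cited paper.
    """
    tokens = set(tokens)
    if not tokens:
        raise ValueError("tokens must be non-empty")
    signature = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "little")  # a different salt per hash function
        signature.append(min(
            hashlib.blake2b(tok.encode("utf-8"), digest_size=8, salt=salt).digest()
            for tok in tokens
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

For each hash function, the two sets share the same minimum with probability equal to their Jaccard similarity, so the fraction of agreeing positions is an unbiased estimate of it.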
Wu, W, Li, B, Chen, L & Zhang, C 2016, 'Cross-view feature hashing for image retrieval', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Auckland, New Zealand, pp. 203-214.
Traditional cross-view information retrieval mainly rests on correlating two sets of features in different views. However, features in different views usually have different physical interpretations. It may be inappropriate to map multiple views of data onto a shared feature space and directly compare them. In this paper, we propose a simple yet effective Cross-View Feature Hashing (CVFH) algorithm via a 'partition and match' approach. The feature space for each view is bi-partitioned multiple times using B hash functions and the resulting binary codes for all the views can thus be represented in a compatible B-bit Hamming space. To ensure that the hashed feature space is effective for supporting generic machine learning and information retrieval functionalities, the hash functions are learned to satisfy two criteria: (1) the neighbors in the original feature spaces should also be close in the Hamming space; and (2) the binary codes for multiple views of the same sample should be similar in the shared Hamming space. We apply CVFH to cross-view image retrieval. The experimental results show that CVFH can outperform the Canonical Component Analysis (CCA) based cross-view method.