Xianzhi Wang is a lecturer at the School of Software in the Faculty of Engineering and Information Technology. He received his PhD and Master's degrees from Harbin Institute of Technology, Harbin, China, and his Bachelor's degree from Xi'an Jiaotong University, Xi'an, China. His research interests include the Internet of Things, data fusion, machine learning, and recommender systems.
Previously, he worked as a research fellow in the Living Analytics Research Centre at Singapore Management University, and as a research associate in the School of Computer Science and Engineering at the University of New South Wales and in the School of Computer Science at the University of Adelaide. He visited the Department of Computer Science and Engineering at Arizona State University for one year and interned at IBM Research - China for five months.
- ARC Discovery Early Career Researcher Award (DECRA)
- IBM PhD Fellowship Award
- Best Paper Award (1st CCF National Conference on Service Computing)
- CSC Scholarship
- National PhD Scholarship (Ministry of Education of China)
Can supervise: YES
- Internet of Things & Mobile Sensing
- Machine Learning & Artificial Intelligence
- Information Fusion & Privacy Preservation
- Recommender Systems
- Computer Science Studio 1 (Autumn 2019)
Huang, C, Yao, L, Wang, X, Benatallah, B & Sheng, QZ 2019, 'Software Expert Discovery via Knowledge Domain Embeddings in a Collaborative Network', Pattern Recognition Letters.
Chen, J, Su, S & Wang, X 2019, 'Towards Privacy-Preserving Location Sharing over Mobile Online Social Networks', IEICE Transactions on Information and Systems, vol. 102, pp. 133-146.
Altulyan, M, Yao, L, Kanhere, SS, Wang, X & Huang, C 2019, 'A unified framework for data integrity protection in people-centric smart cities', Multimedia Tools and Applications.
With the rapid increase in urbanisation, the concept of smart cities has attracted considerable attention. By leveraging emerging technologies such as the Internet of Things (IoT), artificial intelligence and cloud computing, smart cities have the potential to improve various indicators of residents' quality of life. However, threats to data integrity may affect the delivery of such benefits, especially in the IoT environment, where most devices are inherently dynamic and have limited resources. Prior work has focused on ensuring data integrity in a piecemeal manner, covering only some parts of the smart city ecosystem. In this paper, we address data integrity from an end-to-end perspective, i.e., from the data source to the data consumer. We propose a holistic framework for ensuring data integrity in smart cities that covers the entire data lifecycle. Our framework is founded on three fundamental concepts, namely secret sharing, fog computing and blockchain. We provide a detailed description of the framework's components and use smart healthcare as a use case.
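The secret-sharing pillar of the framework is easiest to see in code. Below is a minimal, illustrative Shamir-style sketch in Python (not the paper's implementation; the prime, threshold and share count are arbitrary choices for demonstration): data split into n shares can be reconstructed from any k of them, so no single node ever holds the whole value.

```python
# Minimal sketch of the secret-sharing building block (Shamir's scheme over a
# prime field); all parameters here are illustrative, not the paper's choices.
import random

PRIME = 2 ** 127 - 1  # a Mersenne prime large enough for small secrets

def make_shares(secret, threshold, n_shares):
    """Split `secret` into n_shares; any `threshold` of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n_shares + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = make_shares(secret=42, threshold=3, n_shares=5)
assert reconstruct(shares[:3]) == 42  # any 3 of the 5 shares suffice
```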
Yao, L, Sheng, QZ, Wang, X, Wang, S, Li, X & Wang, S 2018, 'Collaborative text categorization via exploiting sparse coefficients', World Wide Web, vol. 21, no. 2, pp. 373-394.
Text categorization is widely characterized as a multi-label classification problem. Robust modeling of the semantic similarity between a query text and training texts is essential to constructing an effective and accurate classifier. In this paper, we systematically investigate the Web page/text classification problem by integrating sparse representation with random measurements. In particular, we first adopt a very sparse, data-independent random measurement matrix to map the original high-dimensional text feature space to a lower-dimensional space without loss of key information. We then propose a generic sparse representation method that obtains the sparse solution by decoding the semantic correlations between the query text and the entire set of training samples. Based on this method, we also design and examine a series of rules that exploit the sparse coefficients to propagate multiple labels for given query texts. We have conducted extensive experiments on real-world datasets, and the results show the effectiveness of the proposed approach.
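As a rough illustration of the pipeline described above, the sketch below (synthetic data; the projection size, alpha and library choices are assumptions, not the paper's settings) projects texts through a very sparse random measurement matrix and then decodes a query as a sparse combination of training samples, whose labels could then be propagated.

```python
# Hedged sketch: sparse random projection for dimensionality reduction,
# then a Lasso decode of the query against the training set.
import numpy as np
from sklearn.random_projection import SparseRandomProjection
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_train = rng.random((200, 5000))        # 200 training texts, 5000 raw features
query = rng.random(5000)

proj = SparseRandomProjection(n_components=300, random_state=0)
X_low = proj.fit_transform(X_train)      # 200 x 300
q_low = proj.transform(query.reshape(1, -1)).ravel()

# Decode the query against the training set: coefficients are mostly zero,
# and nonzero entries point to semantically similar training texts whose
# labels can be propagated to the query.
lasso = Lasso(alpha=0.05).fit(X_low.T, q_low)   # columns = training samples
top_neighbors = np.argsort(-np.abs(lasso.coef_))[:5]
```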
Yao, L, Sheng, QZ, Li, X, Gu, T, Tan, M, Wang, X, Wang, S & Ruan, W 2018, 'Compressive Representation for Device-Free Activity Recognition with Passive RFID Signal Strength', IEEE Transactions on Mobile Computing, vol. 17, no. 2, pp. 293-306.
Understanding and recognizing human activities is a fundamental research topic for a wide range of important applications such as fall detection and remote health monitoring and intervention. Despite active research on human activity recognition over the past years, existing approaches based on computer vision or wearable sensor technologies present significant issues such as privacy (e.g., using video cameras to monitor the elderly at home) and practicality (e.g., an older person with dementia cannot be expected to remember to wear devices). In this paper, we present a low-cost, unobtrusive, and robust system that supports the independent living of older people. The system interprets what a person is doing by deciphering signal fluctuations using radio-frequency identification (RFID) technology and machine learning algorithms. To deal with noisy, streaming, and unstable RFID signals, we develop a compressive sensing, dictionary-based approach that learns a set of compact and informative dictionaries of activities using unsupervised subspace decomposition. In particular, we devise a number of approaches to explore the properties of the sparse coefficients of the learned dictionaries so as to fully utilize the embodied discriminative information for the activity recognition task. Our approach achieves efficient and robust activity recognition via a compact and robust representation of activities. Extensive experiments conducted in a real-life residential environment demonstrate that the proposed system offers good overall performance and shows promising practical potential to underpin applications for the independent living of the elderly.
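A minimal sketch of the dictionary-based idea follows, under assumed synthetic data (the window size, dictionary size and per-class-dictionary design are illustrative, not the paper's exact method): learn one dictionary per activity and label a new signal window by the dictionary that reconstructs it with the smallest residual.

```python
# Illustrative sketch: per-activity dictionaries over RFID signal-strength
# windows; a new window is labeled by minimum reconstruction error.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(1)
windows = {act: rng.random((100, 64)) for act in ("sitting", "walking")}

dicts = {act: MiniBatchDictionaryLearning(n_components=16, random_state=0)
              .fit(X) for act, X in windows.items()}

def classify(window):
    errs = {}
    for act, d in dicts.items():
        code = d.transform(window.reshape(1, -1))   # sparse coefficients
        recon = code @ d.components_
        errs[act] = np.linalg.norm(window - recon)
    return min(errs, key=errs.get)

print(classify(rng.random(64)))
```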
Wang, X, Huang, C, Yao, L, Benatallah, B & Dong, M 2018, 'A Survey on Expert Recommendation in Community Question Answering', Journal of Computer Science and Technology, vol. 33, no. 4, pp. 625-653.
Community question answering (CQA) represents the type of Web application where people exchange knowledge by asking and answering questions. One significant challenge of most real-world CQA systems is the lack of effective matching between questions and potential good answerers, which adversely affects efficient knowledge acquisition and circulation. On the one hand, a requester might experience many low-quality answers without receiving a quality response in a brief time; on the other hand, an answerer might face numerous new questions without being able to quickly identify those of interest. In this situation, expert recommendation emerges as a promising technique to address these issues. Instead of passively waiting for users to browse and find their questions of interest, an expert recommendation method actively and promptly directs users' attention to the appropriate questions. The past few years have witnessed considerable efforts to address the expert recommendation problem from different perspectives. These methods all have issues that need to be resolved before the advantages of expert recommendation can be fully embraced. In this survey, we first present an overview of research efforts and state-of-the-art techniques for expert recommendation in CQA. We then summarize and compare the existing methods with regard to their advantages and shortcomings, followed by a discussion of open issues and future research directions.
Xu, X, Motta, G, Tu, Z, Xu, H, Wang, Z & Wang, X 2018, 'A new paradigm of software service engineering in big data and big service era', Computing, vol. 100, no. 4, pp. 353-368.
In the big data era, servitization has become one of the major development trends of the IT world. More and more software resources are developed and exist as services on the Internet. These services from multiple domains and networks converge into a hugely complicated service network or ecosystem, which can be called Big Service. How to reuse the abundant open service resources to rapidly develop new applications or comprehensive service solutions that meet massive individualized customer requirements is a key issue in the big data and big service ecosystem. Based on an analysis of the big service ecosystem, this paper presents a new paradigm of software service engineering, the Requirement-Engineering Two-Phase Service Engineering Paradigm (RE2SEP), which includes service-oriented requirement engineering, domain-oriented service engineering, and a development approach for software services. By means of RE2SEP, adaptive service solutions can be efficiently designed and implemented to match the requirement propositions of massive individualized customers in the Big Service ecosystem. A case study of an RE2SEP application, a project on citizen mobility services in a smart city environment, is also given. The RE2SEP paradigm will change the traditional life-cycle-oriented way of software engineering and lead to a new approach to software service engineering.
Yao, L, Sheng, QZ, Benatallah, B, Dustdar, S, Wang, X, Shemshadi, A & Kanhere, SS 2018, 'WITS: an IoT-endowed computational framework for activity recognition in personalized smart homes', Computing, vol. 100, no. 4, pp. 369-385.
Over the past few years, activity recognition techniques have attracted unprecedented attention. Along with the recent prevalence of pervasive e-Health in applications such as smart homes, automatic activity recognition is increasingly being implemented in rehabilitation systems, chronic disease management, and monitoring of the elderly for their personal well-being. In this paper, we present WITS, an end-to-end web-based in-home monitoring system for convenient and efficient care delivery. The system unifies data- and knowledge-driven techniques to enable real-time multi-level activity monitoring in a personalized smart home. The core components consist of a novel shared-structure dictionary learning approach combined with rule-based reasoning for continuous daily activity tracking and abnormal activity detection. WITS also exploits an Internet of Things middleware for the scalable and seamless management of, and learning from, the information produced by ambient sensors. We further develop a user-friendly interface, which runs on both iOS and Android as well as in Chrome, for customizing WITS monitoring services without programming effort. This paper presents the architectural design of WITS, the core algorithms, and our solutions to the technical challenges in the system implementation.
Yao, L, Wang, X, Sheng, QZ, Benatallah, B & Huang, C 2018, 'Mashup Recommendation by Regularizing Matrix Factorization with API Co-Invocations', IEEE Transactions on Services Computing.
Mashup has been a dominant approach for building data-centric applications, especially mobile applications, in recent years. Since mashups are predominantly based on public data sources and existing APIs, developing mashup applications requires no sophisticated programming knowledge. The recent prevalence of open APIs and open data sources in the Big Data era has provided new opportunities for mashup development, but at the same time has increased the difficulty of selecting the right services for a given mashup task. API recommendation for mashups differs from traditional service recommendation tasks in lacking specific QoS information and formal semantic specifications of the APIs, which limits the adoption of many existing methods. Although there are a significant number of service recommendation approaches, most of them focus on improving recommendation accuracy, and few pay attention to the diversity of the recommendation results. Another challenge comes from the existence of both explicit and implicit correlations among different APIs, which are generally neglected by existing recommendation methods. In this paper, we address these deficiencies by exploring API recommendation for mashups in the reusable composition context, with the goal of helping developers identify the most appropriate APIs for a composition task.
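The co-invocation regularization idea can be sketched as follows (a hedged toy implementation with random placeholder data, not the authors' code): a standard matrix factorization objective is augmented with a graph-Laplacian term that pulls frequently co-invoked APIs toward similar latent factors.

```python
# Toy matrix factorization with a co-invocation graph regularizer.
import numpy as np

rng = np.random.default_rng(0)
R = (rng.random((50, 30)) < 0.1).astype(float)   # mashup x API usage matrix
C = (rng.random((30, 30)) < 0.05).astype(float)  # API co-invocation counts
C = np.maximum(C, C.T)
L = np.diag(C.sum(axis=1)) - C                   # graph Laplacian

k, lr, lam, gamma = 8, 0.01, 0.1, 0.1
U = rng.normal(scale=0.1, size=(50, k))          # mashup latent factors
V = rng.normal(scale=0.1, size=(30, k))          # API latent factors

for _ in range(200):
    E = R - U @ V.T                              # reconstruction error
    U += lr * (E @ V - lam * U)
    # graph regularizer: pull co-invoked APIs' factors together
    V += lr * (E.T @ U - lam * V - gamma * L @ V)
```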
Yao, L, Sheng, QZ, Wang, X, Zhang, WE & Qin, Y 2018, 'Collaborative location recommendation by integrating multi-dimensional contextual information', ACM Transactions on Internet Technology, vol. 18, no. 3.
Point-of-Interest (POI) recommendation is a new type of recommendation task that has come along with the prevalence of location-based social networks and services in recent years. Compared with traditional recommendation tasks, POI recommendation focuses more on making personalized and context-aware recommendations to improve user experience. Traditionally, the most commonly used contextual information includes geographical and social context. However, the increasing availability of check-in data makes it possible to design more effective location recommendation applications by modeling and integrating comprehensive types of contextual information, especially temporal information. In this article, we propose a collaborative filtering method based on Tensor Factorization, a generalization of the Matrix Factorization approach, to model multi-dimensional contextual information. Tensor Factorization naturally extends Matrix Factorization by increasing the dimensionality of concerns, within which the three-dimensional model is the most popularly used. Our method exploits a high-order tensor, instead of the traditional two-dimensional user-location matrix, to fuse heterogeneous contextual information about users' check-ins. The factorization of this tensor leads to a more compact model of the data that is naturally suitable for integrating contextual information to make POI recommendations. Based on the model, we further improve the recommendation accuracy by utilizing the internal relations within users and locations to regularize the latent factors. Experimental results on a large real-world dataset demonstrate the effectiveness of our approach.
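For intuition, a CP-style three-dimensional factorization over (user, location, time) check-ins might look like the following sketch (synthetic data; the rank, learning rate and regularization are assumed values, not the paper's configuration):

```python
# Toy CP-style factorization: each check-in count is approximated by the
# inner product of user, location and time-slot latent vectors.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_locs, n_slots, k = 40, 60, 24, 8
checkins = [(rng.integers(n_users), rng.integers(n_locs),
             rng.integers(n_slots), 1.0) for _ in range(500)]

U = rng.normal(scale=0.1, size=(n_users, k))
L = rng.normal(scale=0.1, size=(n_locs, k))
T = rng.normal(scale=0.1, size=(n_slots, k))

lr, lam = 0.05, 0.01
for _ in range(30):
    for u, l, t, x in checkins:
        pred = np.sum(U[u] * L[l] * T[t])        # CP reconstruction of one entry
        e = x - pred
        U[u] += lr * (e * L[l] * T[t] - lam * U[u])
        L[l] += lr * (e * U[u] * T[t] - lam * L[l])
        T[t] += lr * (e * U[u] * L[l] - lam * T[t])

score = lambda u, l, t: float(np.sum(U[u] * L[l] * T[t]))  # rank candidate POIs
```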
In the era of Big Data, truth discovery has emerged as a fundamental research topic, which estimates data veracity by determining the reliability of multiple, often conflicting data sources. Although considerable research efforts have been conducted on this topic, most current approaches assume only one true value for each object. In reality, objects with multiple true values widely exist and the existing approaches that cope with multi-valued objects still lack accuracy. In this paper, we propose a full-fledged graph-based model, SmartVote, which models two types of source relations with additional quantification to precisely estimate source reliability for effective multi-valued truth discovery. Two graphs are constructed and further used to derive different aspects of source reliability (i.e., positive precision and negative precision) via random walk computations. Our model incorporates four important implications, including two types of source relations, object popularity, loose mutual exclusion, and long-tail phenomenon on source coverage, to pursue better accuracy in truth discovery. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
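The random-walk computation of source reliability can be illustrated with a PageRank-style power iteration over an inter-source agreement graph (a hedged sketch with a random agreement matrix; SmartVote builds two such graphs and derives positive and negative precision from them):

```python
# Illustrative power iteration: reliability scores from source agreements.
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 6))                 # agreement weights between 6 sources
np.fill_diagonal(A, 0)
P = A / A.sum(axis=1, keepdims=True)   # row-normalized transition matrix

d, r = 0.85, np.full(6, 1 / 6)         # damping factor and uniform start
for _ in range(100):
    r = (1 - d) / 6 + d * P.T @ r      # PageRank-style update
reliability = r / r.sum()              # higher score = more trusted source
```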
Ning, X, Yao, L, Wang, X, Benatallah, B, Dong, M & Zhang, S 2018, 'Rating prediction via generative convolutional neural networks based regression', Pattern Recognition Letters.
Ratings are an essential criterion for evaluating the quality of movies and a critical indicator of whether a customer will watch a movie. An important related research challenge is therefore to predict the rating of a movie before it is released in cinemas, or even before it is produced. Many existing approaches fail to address this challenge because they predict movie ratings based on post-production factors such as review comments from social media. Consequently, they are generally inapplicable until a movie has been released for a certain period of time, when a sufficient number of review comments have become available. In this paper, we propose a regression model based on generative convolutional neural networks for movie rating prediction. Instead of the post-production factors widely used by previous work, this model learns from movies' intrinsic pillars such as genre, budget, cast, director and plot information, which are obtainable before a movie is produced. In particular, the model explores the correlations between the rating of a movie and its intrinsic attributes to predict its rating. The results can serve as a reference for investors and movie studios in determining an optimal portfolio for movie production, and as guidance for interested users in choosing movies to watch. Extensive experiments on a real dataset benchmark our model against a set of baselines and state-of-the-art approaches, and the results demonstrate the effectiveness of our approach. The proposed model is also general enough to be extended to other prediction tasks.
Online reviews play an important role in influencing buyers' daily purchase decisions. However, fake and meaningless reviews, which do not reflect users' genuine purchase experience and opinions, exist widely on the Web and pose great challenges for users trying to make the right choices. It is therefore desirable to build a fair model that evaluates the quality of products by distinguishing spamming reviews. We present an end-to-end trainable unified model that leverages the appealing properties of both the Autoencoder and the random forest. A stochastic decision tree model is implemented to guide the global parameter learning process. Extensive experiments were conducted on a large Amazon review dataset. The proposed model consistently outperforms a series of compared methods.
Chen, J, Tian, Z, Cui, X, Yin, L & Wang, X 2018, 'Trust architecture and reputation evaluation for internet of things', Journal of Ambient Intelligence and Humanized Computing, pp. 1-9.
Wang, X, Xu, X, Sheng, QZ, Wang, Z & Yao, L 2018, 'Novel Artificial Bee Colony Algorithms for QoS-Aware Service Selection', IEEE Transactions on Services Computing, pp. 1-1.
Xu, X, Liu, Z, Wang, Z, Sheng, QZ, Yu, J & Wang, X 2017, 'S-ABC: A paradigm of service domain-oriented artificial bee colony algorithms for service selection and composition', Future Generation Computer Systems, vol. 68, pp. 304-319.
With the rapid development of Cloud Computing, Big Data, Social Networks, and the Internet of Things, typical service optimization problems (SOPs) in the service computing field, such as service selection, service composition and service resource scheduling, have become more and more complicated due to the constant enrichment and dynamic aggregation of large numbers of services, as well as the unceasing variation of user requirements. Meanwhile, with the long-term development and evolution of business in many application domains, service domain features (such as priors, correlations and similarities) are usually formed, and these have a strong influence on solving SOPs. Unfortunately, existing research efforts on SOPs primarily concentrate on designing general algorithms for specific problems without considering service domain features, which often leads to undesirable results. Therefore, how to design a paradigm of service domain-oriented optimization algorithms that incorporates service domain features is a challenge for providing optimization strategies and algorithms that solve SOPs effectively. By considering the influence of service domain features on solving SOPs, this paper proposes a set of service domain-oriented artificial bee colony algorithms (S-ABC) based on the optimization mechanism of the Artificial Bee Colony (ABC) method. Furthermore, by configuring the items and parameters of the S-ABC paradigm in detail, optimization algorithms for particular SOPs (e.g., service selection and composition) can be derived. In this paper, the superiority of the proposed S-ABC is verified by solving a concurrent service selection and service composition problem. By exploiting artificial bee colony algorithms for optimization problems in service domains, this work makes novel contributions to solving SOPs and extends the theory of swarm intelligence optimization.
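For readers unfamiliar with the underlying ABC mechanism that S-ABC specializes, here is a generic sketch on a toy continuous objective (all sizes, bounds and the objective are illustrative; S-ABC replaces steps such as neighbor generation with service-domain-aware operators):

```python
# Generic artificial bee colony (ABC) sketch: employed bees refine food
# sources, onlookers reinforce good ones, scouts replace exhausted ones.
import numpy as np

rng = np.random.default_rng(0)
def cost(x):                             # toy objective: sphere function
    return np.sum(x ** 2)

n_food, dim, limit, lo, hi = 10, 5, 20, -5.0, 5.0
foods = rng.uniform(lo, hi, (n_food, dim))
trials = np.zeros(n_food)

def try_neighbor(i):
    k = rng.integers(n_food)
    j = rng.integers(dim)
    cand = foods[i].copy()
    cand[j] += rng.uniform(-1, 1) * (foods[i, j] - foods[k, j])
    cand = np.clip(cand, lo, hi)
    if cost(cand) < cost(foods[i]):
        foods[i], trials[i] = cand, 0
    else:
        trials[i] += 1

for _ in range(200):
    for i in range(n_food):              # employed bee phase
        try_neighbor(i)
    fit = 1.0 / (1.0 + np.array([cost(f) for f in foods]))
    probs = fit / fit.sum()
    for _ in range(n_food):              # onlooker phase (fitness-proportional)
        try_neighbor(rng.choice(n_food, p=probs))
    worn = np.argmax(trials)             # scout replaces an exhausted source
    if trials[worn] > limit:
        foods[worn] = rng.uniform(lo, hi, dim)
        trials[worn] = 0

best = foods[np.argmin([cost(f) for f in foods])]
```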
Purpose: This paper aims to propose a system for generating actionable knowledge from Big Data and use this system to construct a comprehensive knowledge base (KB), called GrandBase. Design/methodology/approach: In particular, this study extracts new predicates from four types of data sources, namely, Web texts, Document Object Model (DOM) trees, existing KBs and query streams, to augment the ontology of the existing KB (i.e., Freebase). In addition, a graph-based approach to conduct better truth discovery for multi-valued predicates is also proposed. Findings: Empirical studies demonstrate the effectiveness of the approaches presented in this study and the potential of GrandBase. Future research directions regarding GrandBase construction and extension are also discussed. Originality/value: To revolutionize our modern society by using the wisdom of Big Data, considerable KBs have been constructed to feed massive knowledge-driven applications with Resource Description Framework triples. The important challenges for KB construction include extracting information from large-scale, possibly conflicting and differently structured data sources (i.e., the knowledge extraction problem) and reconciling the conflicts that reside in the sources (i.e., the truth discovery problem). Tremendous research efforts have been contributed to both problems. However, the existing KBs are far from being comprehensive and accurate: first, existing knowledge extraction systems retrieve data from limited types of Web sources; second, existing truth discovery approaches commonly assume each predicate has only one true value. In this paper, the focus is on the problem of generating actionable knowledge from Big Data. A system is proposed, which consists of two phases, namely, knowledge extraction and truth discovery, to construct a broader KB, called GrandBase.
Wang, X, Wang, Z & Xu, X 2012, 'Effective service composition in large scale service market: An empirical evidence enhanced approach', International Journal of Web Services Research, vol. 9, no. 1, pp. 74-94.
The web has undergone a tremendous shift from an information repository to a provisioning capacity for services. As an effective means of constructing coarse-grained solutions by dynamically aggregating a set of services to satisfy complex requirements, traditional service composition suffers from a dramatic decrease in the efficiency of determining the optimal solution when large numbers of services are available in the Internet-based service market. Most current approaches look for the optimal composition solution by real-time computation, and the composition efficiency depends greatly on the adopted algorithms. To eliminate this deficiency, this paper proposes a semi-empirical composition approach that extracts empirical evidence from historical experiences to guide solution-space reduction for real-time service selection. Service communities and historical requirements are organized into clusters based on similarity measurement, and the probabilistic correspondences between the two types of clusters are identified by statistical analysis. For each new request, its hosting requirement cluster is identified and the corresponding service clusters are determined by Bayesian inference. Concrete services are then selected from the reduced solution space to constitute the final composition. Timing strategies for re-clustering, together with the handling of special cases in clustering, ensure the continual adaptation of the approach to a changing environment. Instead of relying solely on pure real-time computation, the approach is distinguished from traditional methods by combining the two perspectives.
Chu, DH, Wang, XZ, Wang, ZJ & Xu, XF 2011, 'Personalized requirement oriented virtual service resource aggregation method', Jisuanji Xuebao/Chinese Journal of Computers, vol. 34, no. 12, pp. 2370-2380.
Personalization and composition are two important features of modern service ecosystems. This paper proposes a personalized-requirement-oriented virtual service resource aggregation method. Service resources are formally described based on multi-dimensional classification. Personalized requirements from applications are classified, reduced and finally expressed in a formal and reusable manner. On this basis, a dynamic-pruning-based resource aggregation method is presented. The method attends to the features of both customer requirements and the organization of service resources. Multiple resources can be dynamically aggregated into coarse-grained virtual resources that satisfy the requirements rapidly. Experiments show the good effectiveness and efficiency of the proposed method.
Wang, X, Wang, Z, Xu, X & Liu, Y 2011, 'A service composition method for tradeoff between satisfactions of multiple requirements', Jisuanji Yanjiu yu Fazhan/Computer Research and Development, vol. 48, no. 4, pp. 627-637.
Service composition is effective for rapidly constructing value-added services for service-oriented applications. Existing selection models for composite services rely heavily on the assumption that each customer requirement is raised alone, while in reality, service requirements can be numerous in practical applications. Within a small time slice, multiple requirements can be seen as concurrent, and the service sets involved in the sub-solutions corresponding to each individual requirement can intersect, resulting in competition for, or sharing of, certain services between requirements. Therefore, current single-requirement-oriented methods cannot deal with the situation where multiple service requirements arrive concurrently and compete for services. This paper presents a service composition model and algorithm oriented to multiple service requirements. In light of the fact that services can be either exclusive or sharable and that decisive priority relations exist between assessment factors, an assessment method based on conflict-avoidance scheduling and graded weighted priorities is put forward. On that basis, tradeoff strategies are proposed for a genetic algorithm, and a service composition method is put forward to trade off the satisfaction of multiple service requirements. Experimental results show that this method ensures the proportionality of all sub-solutions, and that sub-optimal solutions can be gained efficiently by improving its coding manner. Compared with other possible strategies, it shows superior applicability across different circumstances in terms of the quantity and quality of available services.
Wang, XZ, Xu, XF & Wang, ZJ 2010, 'A profit optimization oriented service selection method for dynamic service composition', Jisuanji Xuebao/Chinese Journal of Computers, vol. 33, no. 11, pp. 2104-2115.
Service composition is an effective means of building value-added services in a service-oriented computing environment. Current research focuses on the fulfillment of customer value while neglecting the value procurable by the service broker, which composes individual services and provides the composite services. On the one hand, over-optimized service quality brings no additional profit to the service provider and no remarkable improvement in customer satisfaction, and is thus unnecessary for the value of either side in an SLA environment; on the other hand, due to the uncertainty of both services and the environment in which they are delivered, the real quality of service-oriented applications is also uncertain, so real services may fail to meet the quality requirements of the negotiated service level, or even fail entirely. This paper studies profit and service strategies for SLAs and proposes a novel service selection model for profit optimization. Based on periodical estimation of service cost and instant feedback, service requirements are greedily scheduled, and optimized service selection for dynamic service composition is realized with a simulated annealing algorithm. Experimental results show that this approach not only increases the profit of composite services but also procures optimized results more efficiently than traditional approaches under different requirement distributions.
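The simulated-annealing selection step can be sketched as below (a toy cost matrix stands in for the paper's profit model; the temperature schedule and cooling rate are assumed values): one concrete service is chosen per abstract task, and occasional uphill moves let the search escape local optima.

```python
# Toy simulated annealing for service selection: minimize total cost of
# choosing one candidate service per abstract task.
import math, random

random.seed(0)
n_tasks, n_candidates = 8, 5
cost = [[random.uniform(1, 10) for _ in range(n_candidates)]
        for _ in range(n_tasks)]

def total(sel):
    return sum(cost[t][s] for t, s in enumerate(sel))

sel = [random.randrange(n_candidates) for _ in range(n_tasks)]
temp = 10.0
while temp > 1e-3:
    t = random.randrange(n_tasks)
    cand = sel.copy()
    cand[t] = random.randrange(n_candidates)   # perturb one task's choice
    delta = total(cand) - total(sel)
    if delta < 0 or random.random() < math.exp(-delta / temp):
        sel = cand                             # accept better or, sometimes, worse
    temp *= 0.995                              # geometric cooling
```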
Yao, L, Sheng, QZ, Ngu, AHH, Li, X, Benatallah, B & Wang, X 2017, 'Building Entity Graphs for the Web of Things Management' in Managing the Web of Things: Linking the Real World to the Web, pp. 275-303.
Traditional service composition approaches face the significant challenge of how to deal with massive individualized requirements. Such challenges include how to reach a tradeoff between one generalized solution and multiple customized ones, and how to balance the costs and benefits of a composition solution. A service network is a feasible way to cope with these challenges: it interconnects distributed services to form a dynamic network that operates as a persistent infrastructure and satisfies the massive individualized requirements of many customers. When a requirement arrives, the service network is dynamically customized and transformed into a specific composite solution. In this way, mass requirements are fulfilled cost-effectively. This paper presents the conceptual architecture and the mechanisms that facilitate mass customization, and proposes a competency assessment framework to evaluate its mass customization and cost-effectiveness capacities.
Bai, L, Yao, L, Kanhere, SS, Wang, X & Yang, Z 2019, 'Automatic Device Classification from Network Traffic Streams of Internet of Things', Proceedings - Conference on Local Computer Networks, LCN, pp. 597-605.
With the widespread adoption of Internet of Things (IoT), billions of everyday objects are being connected to the Internet. Effective management of these devices to support reliable, secure and high quality applications becomes challenging due to the scale. As one of the key cornerstones of IoT device management, automatic cross-device classification aims to identify the semantic type of a device by analyzing its network traffic. It has the potential to underpin a broad range of novel features such as enhanced security (by imposing the appropriate rules for constraining the communications of certain types of devices) or context-awareness (by the utilization and interoperability of IoT devices and their high-level semantics) of IoT applications. We propose an automatic IoT device classification method to identify new and unseen devices. The method uses the rich information carried by the traffic flows of IoT networks to characterize the attributes of various devices. We first specify a set of discriminating features from raw network traffic flows, and then propose a LSTM-CNN cascade model to automatically identify the semantic type of a device. Our experimental results using a real-world IoT dataset demonstrate that our proposed method is capable of delivering satisfactory performance. We also present interesting insights and discuss the potential extensions and applications.
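A hedged PyTorch sketch of an LSTM-CNN cascade over per-flow feature sequences is shown below; the feature dimension, layer sizes and class count are placeholders rather than the paper's exact architecture.

```python
# Sketch: an LSTM encodes the temporal traffic-flow features, a 1D CNN
# summarizes the resulting sequence, and a linear head emits class logits.
import torch
import torch.nn as nn

class LSTMCNNCascade(nn.Module):
    def __init__(self, n_features=10, n_classes=12):
        super().__init__()
        self.lstm = nn.LSTM(n_features, 64, batch_first=True)
        self.conv = nn.Sequential(
            nn.Conv1d(64, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),             # pool over the time axis
        )
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):                        # x: (batch, time, features)
        h, _ = self.lstm(x)                      # (batch, time, 64)
        h = self.conv(h.transpose(1, 2))         # (batch, 32, 1)
        return self.fc(h.squeeze(-1))            # device-type logits

logits = LSTMCNNCascade()(torch.randn(4, 50, 10))  # 4 flows, 50 timesteps
```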
Wang, H, Chen, J, Wang, X, Liu, X & Na, Z 2018, 'Privacy protection for location sharing services in social networks', Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, pp. 97-102.
Recently, there has been increasing interest in location sharing services in social networks. Behind the convenience brought by location sharing comes an indispensable security risk to privacy. Though many efforts have been made to protect users' privacy in location sharing, they are not suitable for social networks. Most importantly, little research so far can support both user relationship privacy and identity privacy. Thus, we propose a new privacy protection protocol for location sharing in social networks. Different from previous work, the proposed protocol can provide perfect privacy for location sharing services. Simulation results validate the feasibility and efficiency of the proposed protocol.
Chen, J, Lin, Z, Liu, X, Deng, Z & Wang, X 2018, 'Reputation-based framework for internet of things', Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, pp. 592-597.
The Internet of Things (IoT) is going to create a world where physical objects are integrated into traditional networks in order to provide intelligent services for human beings. Trust plays an important role in the communications and interactions of objects in the IoT. Two vital tasks of trust management are trust model design and reputation evaluation. However, the current literature cannot simply and directly be applied to the IoT, due to smart nodes' hardware constraints and very limited computing and energy resources. Therefore, a general and flexible model is needed to meet the special requirements of the IoT. In this paper, we first design LTrust, a layered trust model for the IoT. Then, a Reputation Evaluation Scheme for the Node (RES-N) is presented. The proposed trust model and reputation evaluation scheme provide a general framework for the study of trust management for the IoT. The efficiency of RES-N is validated by simulation results.
Dong, M, Yao, L, Wang, X, Benatallah, B, Sheng, QZ & Huang, H 2018, 'DUAL: A Deep Unified Attention Model with Latent Relation Representations for Fake News Detection', Web Information Systems Engineering – WISE 2018, Springer International Publishing, pp. 199-209.
The prevalence of online social media has enabled news to spread wider and faster than through traditional publication channels. The ease of creating and spreading news, however, has also facilitated the massive generation and dissemination of fake news. It therefore becomes especially important to detect fake news so as to minimize its adverse impact, such as misleading people. Despite active efforts to address this issue, most existing works focus on mining news content or context information from individuals but neglect clues from multiple resources. In this paper, we consider clues from both news content and side information, and propose a hybrid attention model to leverage them. In particular, we use attention-based bi-directional Gated Recurrent Units (GRU) to extract features from news content and a deep model to extract hidden representations of the side information. We combine the two hidden vectors resulting from the above extractions into an attention matrix and learn an attention distribution over the vectors. Finally, the distribution is used to facilitate better fake news detection. Our experimental results on two real-world benchmark datasets show that our approach outperforms multiple baselines in the accuracy of detecting fake news.
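The hybrid attention idea can be sketched as follows (assumed shapes and sizes, not the authors' released model): an attention-weighted bi-directional GRU encodes the content, a small network encodes the side information, and a learned attention distribution fuses the two vectors before classification.

```python
# Sketch of the dual-attention fusion for fake news detection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    def __init__(self, vocab=5000, emb=64, side_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.gru = nn.GRU(emb, 32, bidirectional=True, batch_first=True)
        self.word_attn = nn.Linear(64, 1)
        self.side = nn.Sequential(nn.Linear(side_dim, 64), nn.ReLU())
        self.fuse_attn = nn.Linear(64, 1)
        self.out = nn.Linear(64, 2)                  # fake / real logits

    def forward(self, tokens, side):
        h, _ = self.gru(self.embed(tokens))          # (batch, time, 64)
        a = F.softmax(self.word_attn(h), dim=1)      # attention over words
        content = (a * h).sum(dim=1)                 # (batch, 64)
        views = torch.stack([content, self.side(side)], dim=1)  # (batch, 2, 64)
        w = F.softmax(self.fuse_attn(views), dim=1)  # attention over the views
        return self.out((w * views).sum(dim=1))

logits = DualAttention()(torch.randint(0, 5000, (4, 30)), torch.randn(4, 16))
```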
Ning, X, Yao, L, Wang, X, Benatallah, B, Zhang, S & Zhang, X 2018, 'Data-Augmented Regression with Generative Convolutional Network', Web Information Systems Engineering – WISE 2018, Springer International Publishing, pp. 301-311.
Generative adversarial networks (GAN)-based approaches have been extensively investigated, whereas GAN-inspired regression (i.e., numeric prediction) has rarely been studied in the image and video processing domains. The lack of sufficient labeled data in many real-world cases poses great challenges to regression methods, which generally require sufficient labeled samples for training. In this regard, we propose a unified framework that combines a robust autoencoder and a generative convolutional neural network (GCNN)-based regression model to address the regression problem. Our model is able to generate high-quality artificial samples, augmenting a small set of training samples for better training effects. Extensive experiments are conducted on two real-world datasets, and the results show that our proposed model consistently outperforms a set of advanced techniques under various evaluation metrics.
Huang, C, Yao, L, Wang, X, Benatallah, B, Zhang, S & Dong, M 2018, 'Expert recommendation via tensor factorization with regularizing hierarchical topical relationships', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 373-387.
Knowledge acquisition and exchange are generally crucial yet costly for both businesses and individuals, especially when the knowledge concerns various areas. Question answering communities offer an opportunity for sharing knowledge at low cost, where community users, many of whom are domain experts, can potentially provide high-quality solutions to a given problem. In this paper, we propose a framework for finding experts across multiple collaborative networks. We employ recent techniques of tree-guided learning (via tensor decomposition) and matrix factorization to explore user expertise from past voted posts. Tensor decomposition enables leveraging the latent expertise of users, while the posts and related tags help identify the related areas. The final result is an expertise score for every user on every knowledge area. We experiment on Stack Exchange Networks, a set of question answering websites on different topics with a huge group of users and posts. Experiments show that our proposed approach produces steady, high-quality outputs.
Chen, K, Yao, L, Wang, X, Zhang, D, Gu, T, Yu, Z & Yang, Z 2018, 'Interpretable Parallel Recurrent Neural Networks with Convolutional Attentions for Multi-Modality Activity Modeling', Proceedings of the International Joint Conference on Neural Networks.
Multimodal features play a key role in wearable-sensor-based human activity recognition (HAR). Selecting the most salient features adaptively is a promising way to maximize the effectiveness of multimodal sensor data. In this regard, we propose a 'collect fully and select wisely' principle as well as an interpretable parallel recurrent model with convolutional attentions to improve recognition performance. We first collect modality features and the relations between each pair of features to generate activity frames, and then introduce an attention mechanism to precisely select the most prominent regions from the activity frames. The selected frames not only maximize the utilization of valid features but also effectively reduce the number of features to be computed. We further analyze the accuracy and interpretability of the proposed model through extensive experiments. The results show that our model achieves competitive performance on two benchmark datasets and works well in real-life scenarios.
Ning, X, Yao, L, Wang, X, Benatallah, B, Salim, F & Haghighi, PD 2018, 'Predicting Citywide Passenger Demand via Reinforcement Learning from Spatio-Temporal Dynamics', Proceedings of the 15th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, ACM, pp. 19-28.
Zhang, X, Yao, L, Zhang, D, Wang, X, Sheng, QZ & Gu, T 2017, 'Multi-person brain activity recognition via comprehensive EEG signal analysis', ACM International Conference Proceeding Series, pp. 28-37.
Electroencephalography (EEG) based brain activity recognition is a fundamental field of study for a number of significant applications such as intention prediction, appliance control, and neurological disease diagnosis in the smart home and smart healthcare domains. Existing techniques mostly focus on binary brain activity recognition for a single person, which limits their deployment in wider and more complex practical scenarios. Therefore, multi-person and multi-class brain activity recognition has gained popularity recently. Another challenge faced by brain activity recognition is low recognition accuracy due to massive noise and the low signal-to-noise ratio of EEG signals. Moreover, feature engineering in EEG processing is time-consuming and relies heavily on expert experience. In this paper, we attempt to solve these challenges by proposing an approach with better EEG interpretation ability, via raw EEG signal analysis, for multi-person and multi-class brain activity recognition. Specifically, we analyze inter-class and inter-person EEG signal characteristics and use them to capture the discrepancies in inter-class EEG data. We then adopt an Autoencoder layer to automatically refine the raw EEG signals by eliminating various artifacts. We evaluate our approach on both a public and a local EEG dataset and conduct extensive experiments to explore the effect of several factors (such as normalization methods, training data size, and Autoencoder hidden neuron size) on the recognition results. The experimental results show that our approach achieves high accuracy compared to competitive state-of-the-art methods, indicating its potential for promoting future research on multi-person EEG recognition.
Zhang, X, Yao, L, Huang, C, Sheng, QZ & Wang, X 2017, 'Intent Recognition in Smart Living Through Deep Recurrent Neural Networks', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 748-758.
Electroencephalography (EEG) signal based intent recognition has recently attracted much attention in both academia and industry, as it helps elderly or motor-disabled people control smart devices to communicate with the outer world. However, the utilization of EEG signals is challenged by low accuracy and by arduous, time-consuming feature extraction. This paper proposes a 7-layer deep learning model that classifies raw EEG signals with the aim of recognizing subjects' intents, avoiding the time consumed by pre-processing and feature extraction. The hyper-parameters are selected by an orthogonal array experiment method for efficiency. Our model is applied to an open EEG dataset provided by PhysioNet and achieves an accuracy of 0.9553 on intent recognition. The applicability of our proposed model is further demonstrated by two use cases of smart living (assisted living with robotics and home automation).
Ning, X, Yao, L, Wang, X & Benatallah, B 2017, 'Calling for response: Automatically distinguishing situation-aware tweets during crises', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 195-208.
Recent years have witnessed the prevalent use of social media during crises. Twitter, for example, has become a valuable information source that helps the authorities offer better responses to crisis and emergency situations. However, the sheer volume of tweets cannot be used directly; in this context, distinguishing the most important and informative tweets is crucial to enhancing emergency situation awareness. In this paper, we design a convolutional neural network based model to automatically detect crisis-related tweets. We explore Twitter-specific linguistic, sentiment and emotion analysis along with statistical topic modeling to identify a set of quality features. We then incorporate them into a convolutional neural network model to identify crisis-related tweets. Experiments on a real-world Twitter dataset demonstrate the effectiveness of our proposed model.
Fang, XS, Sheng, QZ, Wang, X, Barhamgi, M, Yao, L & Ngu, AHH 2017, 'SourceVote: Fusing multi-valued data via inter-source agreements', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 164-172.
Data fusion is a fundamental research problem of identifying true values of data items of interest from conflicting multi-sourced data. Although considerable research efforts have been conducted on this topic, existing approaches generally assume every data item has exactly one true value, which fails to reflect the real world where data items with multiple true values widely exist. In this paper, we propose a novel approach, SourceVote, to estimate value veracity for multi-valued data items. SourceVote models the endorsement relations among sources by quantifying their two-sided inter-source agreements. In particular, two graphs are constructed to model inter-source relations. Then two aspects of source reliability are derived from these graphs and are used for estimating value veracity and initializing existing data fusion methods. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
Huang, C, Yao, L, Wang, X, Benatallah, B & Sheng, QZ 2017, 'Expert as a Service: Software Expert Recommendation via Knowledge Domain Embeddings in Stack Overflow', Proceedings - 2017 IEEE 24th International Conference on Web Services, ICWS 2017, pp. 317-324.
Question answering (Q&A) communities have gained momentum recently as an effective means of knowledge sharing among the crowd, where many users are experts in the real world who can make quality contributions in certain domains or technologies. Although the massive user-generated Q&A data present a valuable source of human knowledge, a related challenging issue is how to find those expert users effectively. In this paper, we propose a framework for finding such experts in a collaborative network. Building on recent work on distributed word representations, we are able to summarize text chunks from a semantic perspective and infer knowledge domains by clustering pre-trained word vectors. In particular, we exploit a graph-based clustering method for knowledge domain extraction and discern the shared latent factors using matrix factorization techniques. The proposed clustering method requires no post-processing of clustering indicators, and the matrix factorization method is combined with the semantic similarity of historical answers to rank users' expertise given a query. We use Stack Overflow, a website with a large group of users and a large number of posts on topics related to computer programming, to evaluate the proposed approach, and conduct extensive experiments that show the effectiveness of our approach.
Salama, U, Yao, L, Wang, X, Paik, HY & Beheshti, A 2017, 'Multi-Level Privacy-Preserving Access Control as a Service for Personal Healthcare Monitoring', Proceedings - 2017 IEEE 24th International Conference on Web Services, ICWS 2017, pp. 878-881.
The Internet of Things (IoT) is changing many sectors of our lives. In the healthcare domain, the IoT manifests as mobile medical applications over various sensors that update healthcare professionals on patients' health information. However, IoT-based healthcare systems also face major challenges in protecting patients' privacy via an effective access control system. This paper presents an ambient home solution framework for privacy-preserving monitoring of patients' health status. We focus on two major points: 1) how to use the data collected from ambient and biometric sensors to perform the high-level task of activity recognition, and 2) how to secure the collected healthcare data via effective access control. We achieve multi-level access control by using Public Key Infrastructure (PKI) for authentication and Attribute-Based Access Control (ABAC) for authorisation. Our access control system regulates access to healthcare data by classifying healthcare professionals and data. Our system provides guidelines to define data classes and healthcare professional groups, and specifies security policies to control access to the data classes. The system is flexible and can incorporate more policy rules, professionals, and data classes.
Fang, XS, Sheng, QZ, Wang, X & Ngu, AHH 2017, 'Value Veracity Estimation for Multi-Truth Objects via a Graph-Based Approach', Proceedings of the 26th International Conference on World Wide Web Companion, International World Wide Web Conferences Steering Committee, pp. 777-778.
Fang, XS, Sheng, QZ & Wang, X 2016, 'An ensemble approach for better truth discovery', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 298-311.
Truth discovery is a hot research topic in the Big Data era, with the goal of identifying true values from the conflicting data provided by multiple sources on the same data items. Many methods have been proposed to tackle this issue. However, none of the existing methods is a clear winner that consistently outperforms the others, due to the varied characteristics of different methods. In addition, in some cases, an improved method may not even beat its original version, as a result of the bias introduced by limited ground truths or the different features of the applied datasets. To realize an approach that achieves better and more robust overall performance, we propose to fully leverage the advantages of existing methods by extracting the truth from the prediction results of these existing truth discovery methods. In particular, we first distinguish between the single-truth and multi-truth discovery problems and formally define the ensemble truth discovery problem. Then, we analyze the feasibility of the ensemble approach and derive two models, i.e., a serial model and a parallel model, to implement the approach and to further tackle the two types of truth discovery problems. Extensive experiments over three large real-world datasets and various synthetic datasets demonstrate the effectiveness of our approach.
Wang, X, Sheng, QZ, Yao, L, Li, X, Fang, XS, Xu, X & Benatallah, B 2016, 'Empowering truth discovery with multi-truth prediction', International Conference on Information and Knowledge Management, Proceedings, pp. 881-890.
Truth discovery is the problem of detecting true values from the conflicting data provided by multiple sources on the same data items. Since sources' reliability is unknown a priori, a truth discovery method usually estimates sources' reliability along with the truth discovery process. A major limitation of existing truth discovery methods is that they commonly assume exactly one true value for each data item and therefore cannot deal with the more general case in which a data item may have multiple true values (or multi-truth). Since the number of true values may vary from data item to data item, truth discovery methods must be able to detect varying numbers of truth values from multi-source data. In this paper, we propose a multi-truth discovery approach that addresses these challenges by providing a generic framework for enhancing existing truth discovery methods. In particular, we regard the number of true values as an important clue for facilitating multi-truth discovery. We present the procedure and components of our approach and propose three models, namely the byproduct model, the joint model, and the synthesis model, to implement it. We further propose two extensions to enhance our approach, by leveraging the implications of similar numerical values and of values' co-occurrence in sources' claims to improve truth discovery accuracy. Experimental studies on real-world datasets demonstrate the effectiveness of our approach.
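The essence of multi-truth discovery, keeping every value whose confidence clears a cut-off rather than a single argmax, can be illustrated with a toy iterative voting loop (the data, threshold and reliability update below are simplifications for illustration, not the paper's three models):

```python
# Toy multi-truth discovery: iterate between reliability-weighted voting
# and source-reliability re-estimation; all values above a cut are kept.
claims = {                                  # object -> source -> set of values
    "book_authors": {"s1": {"alice", "bob"}, "s2": {"alice"},
                     "s3": {"alice", "bob"}, "s4": {"carol"}},
}
reliability = {s: 0.5 for s in ("s1", "s2", "s3", "s4")}

for _ in range(10):
    truths = {}
    for obj, by_src in claims.items():
        votes = {}
        for s, vals in by_src.items():      # reliability-weighted voting
            for v in vals:
                votes[v] = votes.get(v, 0.0) + reliability[s]
        total = sum(votes.values())
        conf = {v: w / total for v, w in votes.items()}
        truths[obj] = {v for v, c in conf.items() if c >= 0.25}  # multi-truth cut
    for s in reliability:                   # a source is as good as its claims
        hits = sum(len(vals & truths[obj]) / max(len(vals), 1)
                   for obj, by_src in claims.items()
                   for src, vals in by_src.items() if src == s)
        reliability[s] = 0.1 + 0.8 * hits / len(claims)

print(truths, reliability)                  # {'alice', 'bob'} survive as truths
```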
Wang, X, Sheng, QZ, Yao, L, Li, X, Fang, XS, Xu, X & Benatallah, B 2016, 'Truth discovery via exploiting implications from multi-source data', International Conference on Information and Knowledge Management, Proceedings, pp. 861-870.
Data veracity is a grand challenge for various tasks on the Web. Since web data sources are inherently unreliable and may provide conflicting information about the same real-world entities, truth discovery is emerging as a countermeasure that resolves the conflicts by discovering the truth, which conforms to reality, from multi-source data. A major challenge related to truth discovery is that different data items may have varying numbers of true values (or multi-truth), which counters the assumption of existing truth discovery methods that each data item has exactly one true value. In this paper, we address this challenge by exploiting and leveraging the implications from multi-source data. In particular, we exploit three types of implications, namely implicit negative claims, the distribution of positive/negative claims, and the co-occurrence of values in sources' claims, to facilitate multi-truth discovery. We propose a probabilistic approach with improvement measures that incorporate the three implications in all stages of the truth discovery process. Incorporating the negative claims enables multi-truth discovery; considering the distribution of positive/negative claims relieves truth discovery from the impact of sources' behavioral features in specific datasets; and considering values' co-occurrence relationships compensates for the information lost by evaluating each value in the same claims individually. Experimental results on three real-world datasets demonstrate the effectiveness of our approach.
Madi, BMA, Sheng, QZ, Yao, L, Qin, Y & Wang, X 2016, 'PLMwsp: Probabilistic latent model for web service QoS prediction', Proceedings - 2016 IEEE International Conference on Web Services, ICWS 2016, pp. 623-630.
With the unprecedented and dramatic development of Web services in recent years, designing novel approaches for efficient Web service prediction has become of paramount importance. Quality of Service (QoS) plays a critical role in Web service recommendation; however, determining the QoS values of Web services is still a challenging task. For example, some QoS properties (e.g., response time, throughput) may hold different values for different users. In this paper, we describe how to develop a novel approach, PLMwsp, based on a probabilistic latent model, to effectively predict the QoS values of Web services. A Web service prediction system has been developed, and experiments have been conducted to show the efficacy of our approach.
Yao, L, Benatallah, B, Wang, X, Tran, NK & Lu, Q 2016, 'Context as a service: Realizing internet of things-aware processes for the independent living of the elderly', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 763-779.View/Download from: Publisher's site
© Springer International Publishing Switzerland 2016. The Internet of Things (IoT) embodies the evolution from systems that link digital documents to systems that relate digital information to real-world physical items. It provides the infrastructure to transparently and seamlessly glue heterogeneous resources and services together by accessing sensors and actuators over the Internet. By connecting the physical world with the digital world, IoT creates numerous novel opportunities for applications such as smart homes, smart cities, and industrial automation. On the other hand, IoT poses challenges to business process development that, unfortunately, have rarely been studied in the literature. In this paper, we present WITSCare, a research prototype of a Web-based Internet of Things smart home system, with the aim of helping older people live independently in their own homes for longer and more safely. WITSCare exploits the heterogeneous contextual information (e.g., daily activities) captured and learned from IoT devices, exposes the contexts as services to be integrated into personalized care management processes, and supports automatic and better decision making in an effective and user-friendly manner. The practical experience gained from this project provides insights into developing real-world IoT applications.
Wang, X, Zhang, Y, Zhang, W & Lin, X 2016, 'Distance-aware influence maximization in geo-social network', 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016, IEEE International Conference on Data Engineering, IEEE, Helsinki, Finland, pp. 1-12.View/Download from: Publisher's site
© 2016 IEEE. Influence maximization is a key problem in viral marketing. Given a social network G and a positive integer k, it aims to identify a seed set of k nodes in G that maximizes the expected influence spread under a certain propagation model. With the proliferation of geo-social networks, location-aware product promotion is becoming more necessary in real applications. However, the importance of the distance between users and the promoted location is underestimated in existing models. For instance, when opening a restaurant downtown and promoting it online, the owner may expect to influence customers who are close to the restaurant rather than people far away from it. In this paper, we formally define the distance-aware influence maximization problem, which finds a seed set that maximizes the expected influence over users who are more likely to be potential customers of the promoted location. To efficiently calculate the influence spread, we adopt the maximum influence arborescence (MIA) model for influence approximation. To speed up the search, we propose three pruning strategies that exclude unpromising nodes from expensive evaluation and enable potential early termination in each iteration without sacrificing the final result's approximation ratio. In addition, novel index structures are developed to compute the bounds used in the three pruning strategies. By integrating these pruning strategies, we propose a priority-based algorithm that searches users in order of their influence and achieves an approximation ratio of 1 − 1/e under the MIA model. Finally, comprehensive experiments over real datasets demonstrate the efficiency and effectiveness of the proposed algorithms and pruning strategies.
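The sketch below shows the plain greedy skeleton that underlies such algorithms: seeds are added one at a time by the marginal gain of a distance-weighted spread, which yields the 1 − 1/e guarantee for monotone submodular spread functions. The one-hop spread estimator and the exponential distance decay are illustrative stand-ins for the paper's MIA-based estimation, pruning strategies, and indexes.

import math

def weighted_spread(seeds, prob, users, dist, alpha=1.0):
    """Toy one-hop spread: activation probability times a distance-decay weight."""
    total = 0.0
    for u in users:
        p_not = 1.0
        for s in seeds:
            p_not *= 1.0 - prob.get((s, u), 0.0)   # independent direct edges
        total += (1.0 - p_not) * math.exp(-alpha * dist[u])
    return total

def greedy_seeds(users, prob, dist, k):
    seeds = set()
    for _ in range(k):   # marginal-gain greedy: (1 - 1/e) for submodular spread
        best = max((u for u in users if u not in seeds),
                   key=lambda u: weighted_spread(seeds | {u}, prob, users, dist))
        seeds.add(best)
    return seeds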
Wang, X, Sheng, QZ, Fang, XS, Li, X, Xu, X & Yao, L 2015, 'Approximate truth discovery via problem scale reduction', International Conference on Information and Knowledge Management, Proceedings, pp. 503-512.View/Download from: Publisher's site
© 2015 ACM. Many real-world applications rely on multiple data sources to provide information on the items of interest. Due to noise and uncertainty in the data, the information from different sources may conflict for a given item. To make reliable decisions based on such data, it is important to identify the trustworthy information by resolving these conflicts, i.e., the truth discovery problem. Current solutions detect the veracity of each value jointly with the reliability of each source for every data item. As a result, the efficiency of truth discovery is strictly confined by the problem scale, which in turn prevents truth discovery algorithms from being applied at large scale. To address this issue, we propose an approximate truth discovery approach that divides sources and values into groups according to a user-specified approximation criterion. The groups are then used for efficient inter-value influence computation to improve accuracy. Our approach is applicable to most existing truth discovery algorithms. Experiments on real-world datasets show that our approach improves efficiency over existing algorithms while achieving similar or even better accuracy. Its scalability is further demonstrated by experiments on large synthetic datasets.
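As a toy illustration of the scale-reduction idea (the paper's grouping of sources and values is more general), the sketch below clusters near-identical numeric claims under a user-specified tolerance so that subsequent truth discovery can operate on groups rather than raw values; the function name and tolerance semantics are assumptions.

def group_values(values, tol):
    """Greedy 1-D grouping: a value joins the last group if it is within
    `tol` of that group's smallest member, otherwise it starts a new group."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][0] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups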
Kotamarthi, K, Wang, X, Grossmann, G, Sheng, QZ & Indrakanti, S 2015, 'A Framework towards model driven business process compliance and monitoring', Proceedings of the 2015 IEEE 19th International Enterprise Distributed Object Computing Conference Workshops and Demonstrations, EDOCW 2015, pp. 24-32.View/Download from: Publisher's site
© 2015 IEEE. Currently available business process monitoring solutions usually rely on the applications' business logic, which hampers the separation between business logic and business rules. Existing solutions address the issue only from the modeling perspective at design time. However, modern business process systems are required to adapt to constant changes of business rules at runtime. In this paper, we propose a comprehensive Model-Driven Business Process Compliance and Monitoring (MDBPCM) framework that allows for (1) modeling and monitoring of functional and non-functional requirements of business processes, (2) compliance validation at both design time and runtime, (3) dynamic adaptation of business rules, and (4) separation of business rules and business logic. Our framework has been successfully implemented to support the design, execution and monitoring of BPEL processes.
Fang, XS, Wang, X & Sheng, QZ 2015, 'Ontology augmentation via attribute extraction from multiple types of sources', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 16-27.View/Download from: Publisher's site
© Springer International Publishing Switzerland 2015. A comprehensive ontology can ease the discovery, maintenance and popularization of knowledge in many domains. As a means to enhance existing ontologies, attribute extraction has attracted tremendous research attention. However, most existing attribute extraction techniques focus on a single type of source, such as structured (e.g., relational databases), semi-structured (e.g., Extensible Markup Language (XML)) or unstructured sources (e.g., Web texts, images), which leads to poor coverage of knowledge bases (KBs). This paper presents a framework for ontology augmentation that extracts attributes from four types of sources, namely existing KBs, query streams, Web texts, and Document Object Model (DOM) trees. In particular, we use query streams and two major KBs, DBpedia and Freebase, to seed the attribute extraction from Web texts and DOM trees. We focus especially on the extraction technique for DOM trees, which has rarely been studied in previous work. Algorithms and a series of filters are developed. Experiments show the capability of our approach in augmenting existing KB ontologies.
Wang, X, Sheng, QZ, Fang, XS, Yao, L, Xu, X & Li, X 2015, 'An integrated Bayesian approach for effective multi-truth discovery', International Conference on Information and Knowledge Management, Proceedings, pp. 493-502.View/Download from: Publisher's site
© 2015 ACM. Truth-finding is the fundamental technique for corroborating reports from multiple sources in both data integration and collective intelligence applications. Traditional truth-finding methods assume a single true value for each data item and therefore cannot deal with multiple true values (i.e., the multi-truth-finding problem). So far, existing approaches handle the multi-truth-finding problem in the same way as the single-truth-finding problem. Unfortunately, the multi-truth-finding problem has unique features, such as the involvement of sets of values in claims, different implications of inter-value mutual exclusion, and larger source profiles. Considering these features provides new opportunities for obtaining more accurate truth-finding results. Based on this insight, we propose an integrated Bayesian approach to the multi-truth-finding problem that takes these features into account. To improve truth-finding efficiency, we reformulate the problem model based on the mappings between sources and (sets of) values. New mutual exclusion relations are defined to reflect the possible co-existence of multiple true values. A finer-grained copy detection method is also proposed to deal with sources with large profiles. Experimental results on three real-world datasets show the effectiveness of our approach.
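A hedged sketch of the Bayesian scoring intuition follows: a source that reports on an item but omits a value is treated as implicitly voting against it, and positive and negative votes are combined in log-odds form. The fixed true-positive and false-positive rates are illustrative stand-ins for the source quality the paper learns, and the function name is hypothetical.

import math

def value_posterior(value, item, claims, tp_rate=0.8, fp_rate=0.2):
    """claims: dict (source, item) -> set of values; returns P(value is true)."""
    log_odds = 0.0                       # uniform prior over true/false
    for (source, i), values in claims.items():
        if i != item:
            continue
        if value in values:              # explicit positive claim
            log_odds += math.log(tp_rate / fp_rate)
        else:                            # implicit negative claim
            log_odds += math.log((1 - tp_rate) / (1 - fp_rate))
    return 1.0 / (1.0 + math.exp(-log_odds))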
Yao, L, Wang, X, Sheng, QZ, Ruan, W & Zhang, W 2015, 'Service Recommendation for Mashup Composition with Implicit Correlation Regularization', Proceedings - 2015 IEEE International Conference on Web Services, ICWS 2015, pp. 217-224.View/Download from: Publisher's site
© 2015 IEEE. In this paper, we explore service recommendation and selection in the reusable composition context. The goal is to aid developers in finding the most appropriate services for their composition tasks. We specifically focus on mashups, a domain that increasingly targets people without sophisticated programming knowledge. We propose a probabilistic matrix factorization approach with implicit correlation regularization to solve this problem. In particular, we advocate that the co-invocation of services in mashups is driven by both explicit textual similarity and implicit correlation of services, and we therefore develop a latent variable model to uncover the latent connections between services by analyzing their co-invocation patterns. We crawled a real dataset from ProgrammableWeb and extensively evaluated the effectiveness of our proposed approach.
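To illustrate the regularization idea (not the paper's exact objective), the snippet below computes a penalty that pulls the latent vectors of correlated services together; it would be added to a standard matrix factorization loss. The correlation matrix `corr` and the weight `lam` are assumed inputs.

import numpy as np

def correlation_penalty(V, corr, lam=0.1):
    """lam * sum_{i,j} corr[i, j] * ||V[i] - V[j]||^2 over service factor rows."""
    diff = V[:, None, :] - V[None, :, :]              # pairwise row differences
    return lam * float(np.sum(corr * np.sum(diff ** 2, axis=-1)))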
Yao, L, Sheng, QZ, Qin, Y, Wang, X, Shemshadi, A & He, Q 2015, 'Context-aware point-of-interest recommendation using Tensor Factorization with social regularization', SIGIR 2015 - Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1007-1010.View/Download from: Publisher's site
© 2015 ACM. Point-of-Interest (POI) recommendation is a new type of recommendation task that has come along with the prevalence of location-based social networks in recent years. Compared with traditional tasks, it focuses more on personalized, context-aware recommendation results to provide a better user experience. To address this new challenge, we propose a Collaborative Filtering method based on Nonnegative Tensor Factorization, a generalization of the Matrix Factorization approach that exploits a high-order tensor instead of the traditional User-Location matrix to model multi-dimensional contextual information. The factorization of this tensor leads to a compact model of the data that is especially suitable for context-aware POI recommendations. In addition, we fuse users' social relations as regularization terms of the factorization to improve recommendation accuracy. Experimental results on real-world datasets demonstrate the effectiveness of our approach.
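The scoring side of a CP-style tensor model can be sketched as follows: a (user, POI, context) score is the sum of element-wise products of three latent factor vectors. Factor learning, nonnegativity constraints, and the social regularization terms are omitted; all names are hypothetical.

import numpy as np

def poi_score(U, P, C, user, poi, context):
    """Score = sum_r U[user, r] * P[poi, r] * C[context, r] (CP inner product)."""
    return float(np.sum(U[user] * P[poi] * C[context]))

def top_k_pois(U, P, C, user, context, k=5):
    scores = (U[user] * C[context]) @ P.T   # score every POI in one pass
    return np.argsort(scores)[::-1][:k]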
Wang, X, Zhuo, X, Yang, B, Meng, FJ, Jin, P, Huang, W, Young, CC, Zhang, C, Xu, JM & Montinarelli, M 2013, 'A novel service composition approach for application migration to cloud', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 667-674.View/Download from: Publisher's site
Migrating business applications to the cloud can be costly, labor-intensive, and error-prone due to the complexity of business applications, the constraints of clouds, and the limitations of existing migration techniques provided by migration service vendors. However, the emerging software-as-a-service offering model for migration services makes it possible to combine multiple migration services for a single migration task. In this paper, we propose a novel migration service composition approach to achieve cost-effective migration solutions. In particular, we first formalize the migration service composition problem as an optimization model. Then, we present an algorithm to determine the optimal composition solution for a given migration task. Finally, using synthetic trace-driven simulations, we validate the effectiveness and efficiency of the proposed optimization model and algorithm. © 2013 Springer-Verlag.
Wang, X, Wang, Z & Xu, X 2013, 'An improved artificial bee colony approach to QoS-aware service selection', Proceedings - IEEE 20th International Conference on Web Services, ICWS 2013, pp. 395-402.View/Download from: Publisher's site
As the number of services available on the Internet grows, the QoS-aware service selection problem (SSP) becomes increasingly difficult. Since the Artificial Bee Colony (ABC) algorithm has been successful in solving many problems as a simple implementation of swarm intelligence, its application to SSP is promising. However, ABC was initially designed for numerical optimization, and its effectiveness strongly depends on what we call the optimality continuity property of the solution space, i.e., similar variable values (or neighboring solutions) yield similar objective values (or evaluation results). We show that SSP does not possess this property, and we propose an approximation approach for ABC based on greedy search strategies to overcome the problem. In this approach, neighboring solutions of a composition are generated greedily from the neighboring services of its component services. Two algorithms with different neighborhood measures are presented based on this approach. The resulting neighborhood structure of the proposed algorithms is analogous to that of continuous functions, so the advantages of ABC can be fully leveraged in solving SSP. The algorithms are also purely online and as simple as canonical ABC. The rationale of the proposed approach is discussed and the complexity of the algorithms is analyzed. Experiments against canonical ABC indicate that the proposed algorithms achieve better optimality within limited time. © 2013 IEEE.
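The greedy neighborhood idea can be sketched as follows: a neighbor of a composition is produced by swapping one component service for its most similar candidate in the same slot, so that neighboring solutions tend to have similar objective values. The `similarity` function and the per-slot candidate lists are assumed inputs; this is not the paper's exact neighborhood measure.

import random

def neighbor(composition, candidates, similarity):
    """composition: one chosen service per task slot; candidates: per-slot lists."""
    slot = random.randrange(len(composition))
    current = composition[slot]
    alternatives = [s for s in candidates[slot] if s != current]
    new = list(composition)
    # Greedy step: replace with the most similar alternative in the same slot.
    new[slot] = max(alternatives, key=lambda s: similarity(current, s))
    return new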
Traditional service composition approaches face the significant challenge of dealing with massive individualized requirements, including how to reach a tradeoff between one generalized solution and multiple customized ones, and how to balance the costs and benefits of a composition solution. A service network is a feasible way to cope with these challenges: it interconnects distributed services to form a dynamic network that operates as a persistent infrastructure and satisfies the massive individualized requirements of many customers. When a requirement arrives, the service network is dynamically customized and transformed into a specific composite solution; in this way, massive requirements are fulfilled cost-effectively. This paper presents the conceptual architecture and the mechanisms that facilitate mass customization, and proposes a competency assessment framework to evaluate the architecture's mass customization and cost-effectiveness capacities. © 2013 Springer-Verlag Berlin Heidelberg.
Wang, X, Wang, Z & Xu, X 2012, 'Analytic profit optimization of service-based systems', Proceedings - 2012 IEEE 19th International Conference on Web Services, ICWS 2012, pp. 359-367.View/Download from: Publisher's site
Service computing has become a dominant paradigm for building complex service-oriented systems that aim to deliver business added value. Because these systems are inevitably based on uncontrollable services over the unpredictable Internet, it is important to find effective ways of maximizing the profit of service-oriented systems in such an unreliable environment. In this paper, we propose an analytic approach that employs a build-time analysis of the runtime dynamics of service execution to maximize the net profit of delivering composite services under full probabilistic uncertainty. We also present methods for improving optimization efficiency, including reusing intermediate computation results and adopting specialized profit optimization algorithms. The superiority of the proposed approach is both theoretically proved and empirically demonstrated through experiments. © 2012 IEEE.
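As a minimal illustration of the profit objective (the paper's build-time analysis of runtime dynamics is considerably richer), the function below computes the expected net profit of a sequential composite service from per-component success probabilities and fees, assuming revenue is earned only on end-to-end success and a penalty is paid otherwise; all names are hypothetical.

def expected_profit(price, penalty, components):
    """components: (success_prob, fee) per component of a sequential flow."""
    p_success, total_fee = 1.0, 0.0
    for p, fee in components:
        p_success *= p                   # all components must succeed
        total_fee += fee                 # component fees are paid regardless
    return price * p_success - penalty * (1.0 - p_success) - total_fee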
Wang, X, Wang, Z & Xu, X 2011, 'Semi-empirical service composition: A clustering based approach', Proceedings - 2011 IEEE 9th International Conference on Web Services, ICWS 2011, pp. 219-226.View/Download from: Publisher's site
Service composition can construct coarse-grained solutions by dynamically aggregating a set of services to satisfy complex requirements, but it suffers from a dramatic decrease in the efficiency of determining the best composition solution when large numbers of candidate services are available. Most current approaches look for the optimal composition solution by real-time computation, so composition efficiency depends heavily on the adopted algorithms. To eliminate this deficiency, this paper proposes a semi-empirical composition approach with two stages: periodical clustering and real-time composition. The former partitions the candidate services and historical requirements into clusters based on similarity measurement, and then identifies the probabilistic correspondences between service clusters and requirement clusters by statistical analysis. The latter handles a new requirement by first finding its most similar requirement cluster and the corresponding service clusters via Bayesian inference; a set of concrete services is then optimally selected from this reduced solution space to constitute the final composition solution. Rather than relying solely on historical data exploration or on pure real-time computation, our approach differs from traditional methods by combining the two perspectives. Experiments demonstrate the advantages of this approach. © 2011 IEEE.
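The real-time stage can be sketched as a straightforward Bayesian cluster lookup: the new requirement is assigned to the requirement cluster with the highest unnormalized posterior, and only the service clusters linked to it are searched. The priors, likelihood functions, and links are assumed to come from the offline clustering stage; the data shapes are assumptions.

def best_service_clusters(req_features, clusters):
    """clusters: list of (prior, likelihood_fn, linked_service_clusters)."""
    # Unnormalized posterior: prior * likelihood of the new requirement.
    prior, likelihood, services = max(
        clusters, key=lambda c: c[0] * c[1](req_features))
    return services   # the reduced solution space for real-time selection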
Wang, X, Wang, Z & Xu, X 2011, 'Price heuristics for highly efficient profit optimization of service composition', Proceedings - 2011 IEEE International Conference on Services Computing, SCC 2011, pp. 378-385.View/Download from: Publisher's site
Service composition follows a three-party paradigm: a broker mediates between service providers and service consumers to select and compose a set of distributed services so that the requirements raised by consumers are satisfied by the composite service on demand. As the de facto provider of composite services, the broker charges the consumers; on the other hand, it pays the providers whose services are involved in the composite services. Besides traditional quality-oriented optimization from the consumers' point of view, the profit that a broker can earn from the composition is another objective to be optimized. Just like quality optimization, however, service selection for profit optimization suffers from a dramatic efficiency decline as the number of candidate services grows. On the premise that the expected quality is guaranteed, this paper presents a "divide and select" approach for highly efficient profit optimization, with price as the heuristic. The approach applies to both static and dynamic pricing scenarios of service composition. Experiments demonstrate its feasibility. © 2011 IEEE.
Wang, X, Wang, Z, Xu, X, Liu, A & Chu, D 2010, 'A service composition approach for the fulfillment of temporally sequential requirements', Proceedings - 2010 6th World Congress on Services, Services-1 2010, pp. 559-565.View/Download from: Publisher's site
Traditional service composition approaches focus on selecting and composing multiple service components to fulfill a single requirement. In most real-world scenarios, however, multiple requirements raised by multiple consumers form a discrete and uneven flow (i.e., a temporal sequence). Due to the limited number of available services and their limited capacities, ensuring equilibrium among the satisfaction degrees of these temporally sequential requirements becomes an important issue. This paper proposes an equilibrium-oriented service composition approach that takes into account both the limited service capacity and the utilization of historical data. The temporally sequential requirements are gradually divided into length-flexible time segments, one by one. Based on this segmentation, service capacity is preserved proportionally for the estimated future requirements, and multiple requirements within one segment are ensured relatively equal chances of being satisfied with relatively equal quality. Experiments reveal improved sustainability and superior temporal stability of service quality compared with applying traditional methods to this scenario. © 2010 IEEE.
A novel service mode, namely bilateral resource integration, is proposed for the rapid establishment of connections between customers and appropriate service providers. Both sides of the service participants are treated as valuable resources, and a virtualization mechanism concerning customer requirements and service resources is put forward. The concept of a virtualized service resource (VSR) is emphasized for facilitating rapid integration and scaled personalization, and a lifecycle view of the VSR is given, with establishment, selection, usage, discard and management as the key stages. The advantage of this solution is that service integration is done at an abstract level and in a more flexible way, with accumulative recommendation, requirement-resource mapping and dynamic service composition reasonably combined, and personalized characteristics easily extracted from VSR-based structural knowledge. © 2010 Springer-Verlag.
Wang, Z, Xu, X, Chu, D & Wang, X 2010, 'The bundling of multiple requirements for maximizing the utilization of service resources', Proceedings - IEEE International Conference on E-Business Engineering, ICEBE 2010, pp. 206-213.View/Download from: Publisher's site
We present a service resource selection and scheduling approach capable of maximizing the resource utilization rate (RUT) and the requirement satisfaction degree (RSD) by bundling multiple customer requirements (CRs). In traditional approaches, each CR is optimally satisfied by independently selecting a set of candidate service resources, which can lead to low RUT and low RSD. In our approach, multiple CRs raised within a certain time period are bundled, and a virtual service resource (VSR) is constructed to satisfy these requirements simultaneously by making full use of the sharing nature of resources. Specifically, each CR is first decomposed into a set of atomic requirements, which are then re-aggregated according to their requested resources. For four types of service-resource sharing patterns, we present corresponding greedy algorithms that construct the VSR and its schedule. The goals of our method are (1) maximizing the satisfaction degree of CRs and (2) maximizing the RUT of service resources. The effectiveness of our approach is demonstrated in an experiment on a typical ocean transportation service scenario. © 2010 IEEE.
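A minimal sketch of the bundling step follows, assuming each CR can be decomposed into (resource, demand) atoms: atoms from different CRs are re-aggregated by the resource they request, giving the groups from which a shared VSR would be constructed and scheduled; the data shapes and names are assumptions.

from collections import defaultdict

def bundle(customer_requirements):
    """customer_requirements: dict cr_id -> list of (resource, demand) atoms."""
    vsr_groups = defaultdict(list)
    for cr_id, atoms in customer_requirements.items():
        for resource, demand in atoms:
            vsr_groups[resource].append((cr_id, demand))  # re-aggregate by resource
    return vsr_groups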
Meng, Y, Wang, Z, Xu, X & Wang, X 2010, 'A generalized service resource management framework', Proceedings - 2010 International Conference on Service Science, ICSS 2010, pp. 329-334.View/Download from: Publisher's site
This paper presents a service resource management system. It is oriented to both single and integrated service resources and efficiently manages every stage of the service resource lifecycle, including registration, publication, usage and destruction. To support a specific service area, a developer only needs to configure the templates provided by the system, which greatly shortens the development cycle and increases development efficiency. The paper is organized as follows: first, it presents the concept of service resources and introduces how to classify and describe them; second, it analyzes the requirements of service resource management using UML use cases; third, it puts forward the design of the service resource management system; finally, it briefly introduces how to apply the system to a specific service area. © 2010 IEEE.