Ming Liu is a Research Fellow at the Connected Intelligence Centre (CIC). CIC is an innovation centre for UTS, building the capacity of staff and students to gain insights from educational data science applications.
Prior to joining UTS, Ming was Associate Professor of Educational Technology at Southwest University in China, a key university for research on learning theory and educational technology for future teachers.
Ming has a background in Computer Science (B.Com & M.IT, Tasmania) and Artificial Intelligence in Education (Ph.D., Sydney). This informs his computing perspective on how to provide better support for learning, engagement and collaboration in education.
Can supervise: YES
His research interests include human language technologies for writing and reading, and learning analytics, especially the use of technology and data to enhance learning, engagement and collaboration. His research findings have appeared in IEEE Transactions on Learning Technologies, Journal of Internet and Higher Education, Educational Technology & Society, Intelligent Tutoring Systems and other reputable educational technology journals and conferences.
Ming lectures at postgraduate and undergraduate levels on System Design & Analysis, Web Application Framework and Educational Data Mining.
Liu, M, Liu, L & Liu, L 2018, 'Group awareness increases student engagement in online collaborative writing', Internet and Higher Education, vol. 38, pp. 1-8.
© 2018. Online Collaborative Writing (OCW) tools such as Google Docs provide an efficient way for students to perform collaborative writing tasks. However, when teachers include OCW in their teaching, they often report less positive student engagement. This paper proposes a novel OCW tool called Cooperpad, with a group awareness functionality, which continuously gathers group members’ writing behavior, then analyzes and visualizes their engagement intensity so that group members can compare their participation with that of others. Using direct observations and a post-test-only design with an experimental group (N = 72) and a control group (N = 48), we examined whether students with access to Cooperpad's group awareness function showed more engagement in a group-writing task than students without access to the tool. Results of direct observation indicate that Cooperpad with the group awareness support increases students’ behavioral engagement, compared with a common synchronous OCW tool (without visualization support). In addition, the results show that the quality of writing in the experimental group is significantly greater than that of the control group when performing difficult tasks.
Liu, M, Rus, V & Liu, L 2018, 'Automatic Chinese Multiple Choice Question Generation Using Mixed Similarity Strategy', IEEE Transactions on Learning Technologies, vol. 11, no. 2, pp. 193-202.
© 2008-2011 IEEE. Automatic question generation can help teachers save the time needed to construct examination papers. Several approaches have been proposed to automatically generate multiple-choice questions for vocabulary assessment or grammar exercises. However, most of these studies focused on generating questions in English with a certain similarity strategy. This paper presents a mixed similarity strategy which generates Chinese multiple choice distractors with a statistical regression model including orthographic, phonological and semantic features, i.e., features that were shown in previous psycholinguistics studies to contribute to character recognition. In a first experiment, we evaluated the predictive power of the proposed features in measuring Chinese character similarity. One of the significant experimental results showed that the combination of the four proposed categories of features (structure, semantic radical, stroke and meaning) accounts for 62.5 percent of the variance in the human judgments of character similarity. In the second experiment, a user study was conducted to evaluate the quality of system-generated questions using a test item analysis method. Two hundred ninety-six Chinese primary school students (10 to 11 years old) participated in this study. We compared the mixed strategy with three other common distractor generation strategies: orthographic strategy, semantic strategy, and phonological strategy. One important finding suggested that the mixed strategy significantly outperformed the other three strategies in terms of distractor usefulness and had the highest discrimination power among the four strategies.
© 2018 - IOS Press and the authors. All rights reserved. Automatically identifying Chinese characters that are similar in their glyph, pronunciation and meaning is important for building smart question generation tools in a computer-assisted language-learning environment. Previous research on Chinese character similarity measurement focused on character glyph (e.g. structures, strokes and radicals) with heuristic algorithms whose parameters have preset values. This article presents a machine learning (regression) approach to measure the similarity between two Chinese characters, based on information which includes not only the glyph, but also pronunciation (pinyin) and semantic meaning derived from HowNet. We evaluated various regression models using a testing set consisting of 2586 pairs of characters selected from elementary Chinese textbooks used. The study results showed that four regression models (M5, Support Vector Machine, Gaussian Process and Linear Regression) have similar results (0.617 ≤ Mean Absolute Error ≤ 0.641, 0.772 ≤ Root Mean Square Error ≤ 0.790). In addition, the study implied that the performance of the regression model could be influenced by the character frequency. Moreover, we evaluated the regression model on a well-known Chinese language learning resource, called 100 pairs of the most confusing Chinese characters. The experiment results indicated that this approach has potential in the recognition and generation of confusing Chinese character pairs.
Liu, L, Wang, S, Su, G, Huang, ZG & Liu, M 2017, 'Towards complex activity recognition using a Bayesian network-based probabilistic generative framework', Pattern Recognition, vol. 68, pp. 295-309.
© 2017 Elsevier Ltd Complex activity recognition is challenging since a complex activity can be performed in different ways, with each having its own configuration of primitive events and their temporal dependencies. To address such temporal relational variabilities in complex activity recognition, we propose a Bayesian network-based probabilistic generative framework that employs Allen's interval relation network to represent local temporal dependencies in a generative way. By employing the Chinese restaurant process and introducing relation generation constraints, our framework can characterize these unique internal configurations of a particular complex activity as a joint distribution. Three concrete models are implemented based on our framework. Specifically, in this paper we improve two of our previous models and provide an enhanced model to handle temporal relational variabilities in complex activities more efficiently. Empirical evaluations on three benchmark datasets demonstrate the competitiveness of our framework. In particular, it is shown that our models are rather robust against errors caused by the low-level predictions from raw signals.
Liu, M, Li, Y, Xu, W & Liu, L 2017, 'Automated essay feedback generation and its impact on revision', IEEE Transactions on Learning Technologies, vol. 10, no. 4, pp. 502-513.
© 2016 IEEE. Writing an essay is a very important skill for students to master, but a difficult task for them to overcome. This is particularly true for English as a Second Language (ESL) students in China. It would be very useful if students could receive timely and effective feedback about their writing. Automatic essay feedback generation is a challenging task, which requires understanding the relationship between the text features of the essay and feedback. In this study, we first analyzed 1,290 teacher comments on their 327 English-major students and annotated the feedback on seven aspects of writing, including the grammar, spelling, sentence diversity, structure, organization, supporting ideas, coherence, and conclusion, for each paper. Then, an automatic feedback classification experiment was conducted with the machine learning approach. Finally, we investigated the impact of the system-generated indirect corrective feedback (ICF) and human teachers' direct corrective feedback (DCF) in two English writing classes (N = 56 in ICF class; N = 54 in DCF class) at a key Chinese university through a web-based assignment management system. The study results indicated the feasibility of this approach: system-generated ICF can be as useful as direct comments made by the teachers in terms of improving the quality of the content regarding the structure, organization, supporting ideas, coherence, and conclusion, and encouraging students to spend more time on self-correction.
Liu, M, Pardo, A & Liu, L 2017, 'Using learning analytics to support engagement in collaborative writing', International Journal of Distance Education Technologies, vol. 15, no. 4, pp. 79-99.
© 2017, IGI Global. Online collaborative writing tools provide an efficient way to complete a writing task. However, existing tools only focus on technological affordances and ignore the importance of social affordances in a collaborative learning environment. This article describes a learning analytic system that analyzes writing behaviors, and creates visualizations incorporating individual engagement awareness and group ranking awareness (social affordance), and review writing behaviour history (technological affordance), to support student engagement. Studies examined the performance of the system used by university students in two collaborative writing activities: Collaboratively writing a project proposal (N = 41) and writing tutorial discussion answers (N = 25). Results show that students agreed with what the visualization conveys and visualizations enhance their engagement in a collaborative writing activity. In addition, students stated that the visualizations were useful to help them reflect on the writing process and support the assessment of individual contributions.
© 2017 IEEE. Question generation is an emerging research area of artificial intelligence in education. Question authoring tools are important in educational technologies, e.g., intelligent tutoring systems, as well as in dialogue systems. Approaches to generate factual questions, i.e., questions that have concrete answers, mainly make use of the syntactic and semantic information in a declarative sentence, which is then transformed into questions. Recently, some research has been conducted to investigate Chinese factual question generation with some limited success. Reported performance is poor due to unavoidable errors (e.g., sentence parsing, named entity recognition, and rule-based question transformation errors) and the complexity of long Chinese sentences. This article introduces a novel Chinese question generation system based on three stages (sentence simplification, question generation and ranking) to address the challenge of automatically generating factual questions in Chinese. The proposed approach and system have been evaluated on sentences from the New Practical Chinese Reader corpus. Experimental results show that ranking raises the proportion of questions rated as acceptable by annotators by more than 20 percentage points, from 65 percent of all questions to 87 percent of the top-ranked 25 percent of questions.
Liu, M, Rus, V, Liao, Q & Liu, L 2017, 'Encoding and ranking similar Chinese characters', Journal of Information Science and Engineering, vol. 33, no. 5, pp. 1195-1211.
© 2017 Institute of Information Science. All Rights Reserved. Automatically detecting similar Chinese characters is useful in many areas, such as building intelligent authoring tools (e.g. automatic multiple choice question generation) in the area of computer assisted language learning. Previous work on the computation of Chinese character similarity focused on detecting character glyph similarity while ignoring the importance of other character features, such as pronunciation and meaning. In this article, we present a way of encoding 4,500 simplified Chinese characters in terms of character glyph, pronunciation and meaning, annotating similar Chinese characters and automatically ranking similar characters based on the learning-to-rank approach. The experiment results indicated that this approach could be useful for ranking and recognizing similar Chinese characters in terms of glyph, pinyin and semantic meaning. Moreover, it was found that the Listwise learning-to-rank method (ListNet) was more effective than the Pointwise (MART) and Pairwise (RankNet) methods.
Liu, M, Wang, Y, Xu, W & Liu, L 2017, 'Automated scoring of Chinese engineering students' English essays', International Journal of Distance Education Technologies, vol. 15, no. 1, pp. 52-68.
© 2017 IGI Global. The number of Chinese engineering students has increased greatly since 1999. Rating the quality of these students' English essays has thus become time-consuming and challenging. This paper presents a novel automatic essay scoring algorithm called PSO-SVR, based on a machine learning algorithm, Support Vector Machine for Regression (SVR), and a computational intelligence algorithm, Particle Swarm Optimization, which optimizes the parameters of SVR kernel functions. Three groups of essays, written by chemical, electrical and computer science engineering majors respectively, were used for evaluation. The results show that PSO-SVR outperforms traditional essay scoring algorithms, such as multiple linear regression, support vector machine for regression and the K Nearest Neighbor algorithm. This indicates that PSO-SVR is more robust in predicting irregular datasets, because the repeated use of simple content words may result in a low essay score even when the system detects higher cohesion and no spelling errors.
Wang, Q, Liu, L, Wang, S, Wang, JZ & Liu, M 2017, 'Predicting Beijing's tertiary industry with an improved grey model', Applied Soft Computing Journal, vol. 57, pp. 482-494.
© 2017 Elsevier B.V. In the context of the growth slowdown in China, it is important to accurately forecast the future economic trend to guide policy-makers in adjusting their current economic policies. In this paper, we aim to predict Beijing's tertiary industry, whose datasets are small, irregular and non-stationary, which makes it difficult to build an accurate prediction model. To this end, we present an improved grey model, named PRGM(1,1), which extends the grey prediction model by integrating two techniques, i.e., the particle swarm optimization algorithm for parameter optimization and the exponential preprocessing method for data cleaning. The experimental results show that PRGM(1,1) outperforms other variants of the grey prediction model in predicting Beijing's tertiary industry, and can make reasonable predictions over short, fluctuating economic data sequences. In addition, we employ PRGM(1,1) in the economic prediction of Beijing's tertiary industry in the next five years, and conclude that the growth rate will decelerate. Our prediction result seems to be in line with the economic slowdown in China this year.
Liu, L, Chen, X, Liu, M, Jia, Y, Zhong, J, Gao, R & Zhao, Y 2016, 'An influence power-based clustering approach with PageRank-like model', Applied Soft Computing Journal, vol. 40, pp. 17-32.
© 2015 Elsevier B.V. In this paper, we present a clustering method called clustering by sorting influence power, which incorporates the concept of influence power as measurement among points. In our method, clustering is performed in an efficient tree-growing fashion exploiting both the hypothetical influence powers of data points and the distances among data points. Since influence powers among data points evolve over time, we adopt a PageRank-like algorithm to calculate them iteratively to avoid the issue of improper initial exemplar preference. The experimental results show that our proposed method outperforms four well-known clustering methods across seven complex and non-isotropic datasets. Moreover, our simple clustering method can be easily applied to several practical clustering problems. We evaluate the effectiveness of our algorithm on two real-world datasets, i.e. an open dataset of Alzheimers disease protein-protein interaction network and a dataset for race walking recognition collected by ourselves, and we find our method outperforms other methods reported in the literature.
Liu, L, Peng, Y, Wang, S, Liu, M & Huang, Z 2016, 'Complex activity recognition using time series pattern dictionary learned from ubiquitous sensors', Information Sciences, vol. 340-341, pp. 41-57.
© 2016 Elsevier Inc. All rights reserved. Sensor-based human activity recognition has become an important research field within pervasive and ubiquitous computing. Techniques for recognizing atomic activities such as gestures or actions are mature for now, but complex activity recognition still remains a challenging issue. In this paper, we address the problem of complex activity recognition using time series extracted from multiple sensors. We first build a dictionary of time series patterns, called shapelets, to represent atomic activities, then present three shapelet-based models to recognize sequential, concurrent, and generic complex activities. We use the datasets collected from three different labs to evaluate our shapelet-based approach and the results show that our approach can handle complex activity recognition effectively. Our experimental results also show that the shapelet-based approach outperforms other competing approaches in terms of recognition accuracy and system usage.
Liu, L, Wang, Q, Wang, J & Liu, M 2016, 'A Rolling Grey Model Optimized by Particle Swarm Optimization in Economic Prediction', Computational Intelligence, vol. 32, no. 3, pp. 391-419.
© 2014 Wiley Periodicals, Inc. Grey system theory has been widely used to forecast economic data that are often nonlinear, irregular, and nonstationary. Current forecasting models based on grey system theory can adapt to various economic time series data. However, these models ignore the importance of model parameter optimization and the use of recent data, which leads to poor forecasting accuracy. In this article, we propose a novel forecasting model, called particle swarm optimization rolling grey model (PSO-RGM(1,1)), based on a rolling mechanism GM with parameters optimized by the particle swarm optimization algorithm. The simple model is shown to be very effective in forecasting the tertiary industry data sequences, which are short and noisy but regular in secular trend. The experimental results show that PSO-RGM(1,1) outperforms other commonly used forecasting models on three real economic data sets. Our empirical study shows that PSO is the best overall algorithm for optimizing the parameters of RGM compared with other well-known metaheuristics. Furthermore, we evaluated other PSO variants and found that single-particle PSO outperforms the others overall in terms of prediction accuracy, convergence speed, and degree of certainty.
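The rolling grey model idea in the abstract above can be illustrated with a minimal sketch. This is the textbook GM(1,1) fitted by ordinary least squares, not the authors' PSO-RGM(1,1): the particle swarm parameter optimization is omitted, and the function names `gm11_next` and `rolling_forecast`, the window size, and the choice to feed forecasts back into the window are all assumptions made for illustration.

```python
import math


def gm11_next(x0):
    """Fit a textbook GM(1,1) to the sequence x0 and return the one-step-ahead forecast."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least four points")
    # Accumulated generating operation (1-AGO): running sums of the raw data.
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least-squares fit of y = a * (-z) + b
    # (a: development coefficient, b: grey input).
    u = [-v for v in z]
    m = len(u)
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(p * q for p, q in zip(u, y))
    a = (m * suy - su * sy) / (m * suu - su * su)
    b = (sy - a * su) / m
    if abs(a) < 1e-12:
        return b  # degenerate case: an (almost) flat series

    def x1_hat(k):
        # Fitted accumulated value at 0-based index k.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Differencing the accumulated forecast restores the original scale.
    return x1_hat(n) - x1_hat(n - 1)


def rolling_forecast(series, window=5, steps=3):
    """Forecast `steps` values ahead, refitting on the latest `window` points each time."""
    history = list(series)
    forecasts = []
    for _ in range(steps):
        nxt = gm11_next(history[-window:])
        forecasts.append(nxt)
        history.append(nxt)  # newest value enters the window, the oldest drops out
    return forecasts
```

For example, `rolling_forecast([100, 110, 121, 133.1, 146.41], window=5, steps=2)` refits the model twice, each time on the five most recent values. A full rolling grey model as described in the abstract would additionally tune the model parameters, which this sketch does not attempt.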
Liu, L, Wang, S, Peng, Y, Huang, Z, Liu, M & Hu, B 2016, 'Mining intricate temporal rules for recognizing complex activities of daily living under uncertainty', Pattern Recognition, vol. 60, pp. 1015-1028.
© 2016 Elsevier Ltd Daily living activity recognition can be exploited to benefit mobile and ubiquitous computing applications. Techniques so far are mature to recognize simple actions. Due to the characteristics of diversity and uncertainty in daily living applications, most existing complex activity recognition approaches have notable limitations. First, graphical model-based approaches still lack sufficient expressive power to model rich temporal relations among activities. Second, it would be rather difficult for graphical model-based approaches to build a unified model for achieving multiple types of tasks. Third, current semantic-based approaches often fail to capture uncertainties. Fourth, formulae in these semantic-based approaches are often manually encoded. Meanwhile, it is impractical to handcraft each formula accurately in daily living scenarios where temporal relations among activities are intricate. To address these issues, we present a probabilistic semantic-based framework that combines Markov logic network with 15 temporal and hierarchical relations to explicitly perform diverse inference tasks of daily living in a unified manner. Advanced pattern mining techniques are introduced to automatically learn the propositional logic rules of intricate relations as well as their weights. Experimental results show that by logical reasoning with the mined temporal dependencies under uncertainty, the proposed model leads to an improved performance, particularly when recognizing complex activities involving the incomplete or incorrect observations of atomic actions.
Liu, L, Luo, D, Liu, M, Zhong, J, Wei, Y & Sun, L 2015, 'A Self-Adaptive Hidden Markov Model for Emotion Classification in Chinese Microblogs', Mathematical Problems in Engineering, vol. 2015.
© 2015 Li Liu et al. Microblogging is increasingly becoming one of the most popular online social media for people to express ideas and emotions. The amount of socially generated content from this medium is enormous. Text mining techniques have been intensively applied to discover the hidden knowledge and emotions from this huge dataset. In this paper, we propose a modified version of hidden Markov model (HMM) classifier, called self-adaptive HMM, whose parameters are optimized by Particle Swarm Optimization algorithms. Since manually labeling large-scale dataset is difficult, we also employ the entropy to decide whether a new unlabeled tweet shall be contained in the training dataset after being assigned an emotion using our HMM-based approach. In the experiment, we collected about 200,000 Chinese tweets from Sina Weibo. The results show that the F-score of our approach gets 76% on happiness and fear and 65% on anger, surprise, and sadness. In addition, the self-adaptive HMM classifier outperforms Naive Bayes and Support Vector Machine on recognition of happiness, anger, and sadness.
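The entropy-based selection step mentioned in the abstract above could look roughly like the following sketch. It assumes that the classifier outputs a probability for each emotion and that low-entropy (confident) predictions are the ones added to the training set; the abstract does not state the exact rule, and the names `shannon_entropy` and `select_confident` and the threshold value are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_confident(predictions, max_entropy=0.8):
    """Keep (text, label) pairs whose predicted distribution has entropy <= max_entropy.

    predictions: list of (text, {label: probability}) pairs.
    max_entropy: hypothetical threshold; a real system would tune it.
    """
    selected = []
    for text, dist in predictions:
        if shannon_entropy(dist.values()) <= max_entropy:
            # The most probable emotion becomes the pseudo-label.
            label = max(dist, key=dist.get)
            selected.append((text, label))
    return selected
```

A prediction such as `{"happiness": 0.9, "anger": 0.05, "sadness": 0.05}` has low entropy and would be kept, while a near-uniform distribution would be left out of the training set.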
Liu, L, Peng, Y, Liu, M & Huang, Z 2015, 'Sensor-based human activity recognition system with a multilayered model using time series shapelets', Knowledge-Based Systems, vol. 90, pp. 138-152.
© 2015 Elsevier B.V. All rights reserved. Human activity recognition can be exploited to benefit ubiquitous applications using sensors. Current research on sensor-based activity recognition is mainly using data-driven or knowledge-driven approaches. In terms of complex activity recognition, most data-driven approaches suffer from portability, extensibility and interpretability problems, whilst knowledge-driven approaches are often weak in handling intricate temporal data. To address these issues, we exploit time series shapelets for complex human activity recognition. In this paper, we first describe the association between activity and time series transformed from sensor data. Then, we present a recursively defined multilayered activity model to represent four types of activities and employ a shapelet-based framework to recognize various activities represented in the model. A prototype system was implemented to evaluate our approach on two public datasets. We also conducted two real-world case studies for system evaluation: daily living activity recognition and basketball play activity recognition. The experimental results show that our approach is capable of handling complex activity effectively. The results are interpretable and accurate, and our approach is fast and energy-efficient in real-time.
Liu, M, Calvo, RA, Pardo, A & Martin, A 2015, 'Measuring and visualizing students' behavioral engagement in writing activities', IEEE Transactions on Learning Technologies, vol. 8, no. 2, pp. 215-224.
© 2014 IEEE. Engagement is critical to the success of learning activities such as writing, and can be promoted with appropriate feedback. Current engagement measures rely mostly on data collected by observers or self-reported by the participants. In this paper, we describe a learning analytic system called Tracer, which derives behavioral engagement measures and creates visualizations of behavioral patterns of students writing on a cloud-based application. The tool records the intermediate stages of document development and uses this data to measure learners' behavioral engagement and derive three visualizations. Writers (N = 23 University students) participated in a controlled one-hour writing session in which they post-facto self-reported their level of behavioral engagement. Results show that the level of behavioral engagement automatically estimated by the system correlates with the level reported by the participants. Additionally, users stated that the visualizations were coherent with their writing activity and were useful to help them reflect on the writing process.
© 2015 Elsevier B.V. Many existing clustering approaches have difficulty clustering non-convex or non-isotropic shapes whose centroids are not highly distinguishable. In addition, most of these approaches are sensitive to outliers and background noise. To this end, we propose a novel clustering approach called K-PRSCAN, where the PageRank algorithm is adopted to estimate the importance of data points in K clusters. The importance reflects both intra-cluster and inter-cluster relations of a data point, enabling our method to distinguish both globular and non-globular clusters. It can also reduce the negative effect of noisy points, whose importance tends to be small. The experimental results show that our proposed approach outperforms several well-known clustering approaches across seven complex and non-isotropic datasets. We also evaluate the effectiveness of our algorithm on two real-world datasets, i.e. a public dataset of digit handwriting recognition and a dataset for race walking recognition collected by ourselves, and find our approach outperforms other existing algorithms in most aspects.
Liu, L, Wang, Q, Liu, M & Li, L 2014, 'An intelligence optimized rolling grey forecasting model fitting to small economic dataset', Abstract and Applied Analysis, vol. 2014.
Grey system theory has been widely used to forecast economic data that are often highly nonlinear, irregular, and nonstationary. The size of these economic datasets is often very small. Many models based on grey system theory can be adapted to various economic time series data. However, some of these models do not consider the impact of recent data or the effective model parameters that can improve forecast accuracy. In this paper, we propose the PRGM(1,1) model, a rolling-mechanism-based grey model optimized by particle swarm optimization, in order to improve the forecast accuracy. The experiment shows that PRGM(1,1) achieves much better forecast accuracy than other widely used grey models on three actual economic datasets. © 2014 Li Liu et al.
Liu, M, Calvo, RA & Rus, V 2014, 'Automatic generation and ranking of questions for critical review', Educational Technology and Society, vol. 17, no. 2, pp. 333-346.
Critical review skill is one important aspect of academic writing. Generic trigger questions have been widely used to support this activity. When students have a concrete topic in mind, trigger questions are less effective if they are too general. This article presents a learning-to-rank based system which automatically generates specific trigger questions from citations for critical review support. The performance of the proposed question ranking models was evaluated and the quality of generated questions is reported. Experimental results showed an accuracy of 75.8% on the top 25% ranked questions. These top ranked questions are as useful for self-reflection as questions generated by human tutors and supervisors. A qualitative analysis was also conducted using an information seeking question taxonomy in order to further analyze the questions generated by humans. The analysis revealed that explanation and association questions are the most frequent question types and that the explanation questions are considered the most valuable by student writers. © International Forum of Educational Technology & Society (IFETS).
Most clustering algorithms were designed to cluster data in a convex spherical sample space, and they perform poorly on more complex structures. In the past few years, several spectral clustering algorithms were proposed to cluster arbitrarily shaped data in various real applications including image processing and web analysis. However, most of these algorithms were based on k-means, which is a randomized algorithm and makes them prone to falling into local optima. Hierarchical methods can handle local optima well because they organize data into different groups at different levels. In this paper, we propose a novel clustering algorithm called spectral clustering algorithm based on hierarchical clustering (SCHC), which combines the advantages of hierarchical clustering and spectral clustering algorithms to avoid the local optimum issues. The experiments on both synthetic data sets and real data sets show that SCHC outperforms six other popular clustering algorithms. The method is simple but is shown to be efficient in clustering both convex shaped data and arbitrarily shaped data.
Liu, L, Chen, X, Luo, D, Lu, Y, Xu, G & Liu, M 2013, 'HSC: A spectral clustering algorithm combined with hierarchical method', Neural Network World, vol. 23, no. 6, pp. 499-521.
Most traditional clustering algorithms perform poorly on structures more complex than the convex spherical sample space. In the past few years, several spectral clustering algorithms were proposed to cluster arbitrarily shaped data in various real applications. However, spectral clustering relies on the dataset having clusters that are approximately well separated to a certain extent. When a cluster has an obvious inflection point within a non-convex space, the spectral clustering algorithm would mistakenly recognize one cluster as several different clusters. In this paper, we propose a novel spectral clustering algorithm called HSC, combined with a hierarchical method, which obviates this disadvantage of spectral clustering by not using the misleading information of the noisy neighboring data points. The simple clustering procedure is applied to eliminate the misleading information, and thus the HSC algorithm can cluster both convex shaped data and arbitrarily shaped data more efficiently and accurately. The experiments on both synthetic data sets and real data sets show that HSC outperforms other popular clustering algorithms. Furthermore, we observed that HSC can also be used for the estimation of the number of clusters.
Liu, M, Calvo, RA, Aditomo, A & Pizzato, LA 2012, 'Using Wikipedia and conceptual graph structures to generate questions for academic writing support', IEEE Transactions on Learning Technologies, vol. 5, no. 3, pp. 251-263.
In this paper, we present a novel approach for semiautomatic question generation to support academic writing. Our system first extracts key phrases from students' literature review papers. Each key phrase is matched with a Wikipedia article and classified into one of five abstract concept categories: Research Field, Technology, System, Term, and Other. Using the content of the matched Wikipedia article, the system then constructs a conceptual graph structure representation for each key phrase, and questions are generated based on that structure. To evaluate the quality of the computer-generated questions, we conducted a version of the Bystander Turing test, which involved 20 research students who had written literature reviews for an IT methods course. The pedagogical value of the generated questions was evaluated using a semiautomated process. The results indicate that the students had difficulty distinguishing between computer-generated and supervisor-generated questions. Computer-generated questions were also rated as being as pedagogically useful as supervisor-generated questions, and more useful than generic questions. The findings also suggest that the computer-generated questions were more useful for first-year students than for second- or third-year students. © 2008-2011 IEEE.
Liu, M & Calvo, RA 2009, 'An automatic question generation tool for supporting sourcing and integration in students' essays', ADCS 2009 - Proceedings of the Fourteenth Australasian Document Computing Symposium, pp. 90-97.
This paper presents a domain-independent Automatic Question Generation (AQG) tool that generates questions which students can use as support when revising their essays. The focus here is on generating questions based on semantic and syntactic information acquired from citations. The semantic information includes the author's name, the citation type (describing the aim of the cited study, its results or an opinion), the author's expressed sentiment, and the syntactic information of the citation. Pedagogically, the question templates are designed using Bloom's learning taxonomy, with the questions reaching the Analysis Level. We used 40 undergraduate students' essays for our experiment, and the Named Entity Recognition component was trained on 20 essays. The results of our experiment show that the question coverage is 96% and the accuracy of generated questions can reach 78%. This AQG tool will be integrated into our peer review system to scaffold feedback from peers.
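The template-filling step described in this abstract can be pictured with a minimal Python sketch. It assumes the citation features (author name, citation type, sentiment) have already been extracted; the `Citation` class, the template wording and the sentiment follow-ups are all hypothetical illustrations, not the templates used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Features assumed to be extracted from one citation (hypothetical schema)."""
    author: str
    citation_type: str  # "aim", "result" or "opinion"
    sentiment: str      # "positive", "negative" or "neutral"


# Hypothetical templates, one per citation type; not taken from the paper.
TEMPLATES = {
    "aim": "Why did {author} conduct this study, and how does its aim relate to your argument?",
    "result": "What evidence does {author} give for this result, and how reliable is it?",
    "opinion": "Why does {author} hold this opinion, and do other researchers agree?",
}

# Hypothetical follow-up questions keyed by the sentiment expressed in the essay.
SENTIMENT_FOLLOW_UPS = {
    "positive": "What are the limitations of {author}'s work?",
    "negative": "What evidence supports your criticism of {author}'s work?",
}


def generate_questions(citation):
    """Fill the matching templates with the cited author's name."""
    questions = []
    template = TEMPLATES.get(citation.citation_type)
    if template is not None:
        questions.append(template.format(author=citation.author))
    follow_up = SENTIMENT_FOLLOW_UPS.get(citation.sentiment)
    if follow_up is not None:
        questions.append(follow_up.format(author=citation.author))
    return questions


example = Citation(author="Smith", citation_type="result", sentiment="positive")
for question in generate_questions(example):
    print(question)
```

Keeping the templates in plain dictionaries means new question types can be added without touching the generation logic; an unknown citation type simply yields no question.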
Chen, M, Zhao, J & Liu, M 2019, 'Using Multiple Encoders for Chinese Neural Question Generation from the Knowledge Base', IOP Conference Series: Materials Science and Engineering.
© 2019 Institute of Physics Publishing. All rights reserved. Question generation is an important task in the fields of natural language processing and intelligent tutoring systems. Previous work on Chinese question generation focused on the rule-based approach, which requires a large amount of human effort to develop the question generation rules. With the recent success of deep neural networks in natural language processing, especially the encoder-decoder neural network framework in machine translation, this study explored the effectiveness of the encoder-decoder network in Chinese question generation, where a triple from the knowledge base is encoded as the input and a question is decoded as the output. More importantly, the traditional encoder-decoder network is extended to have multiple encoders that can capture more diverse features to represent the triple. The study results showed that the model with multiple encoders outperformed the traditional encoder-decoder neural network by 1.78 BLEU points.
Liu, M, Shum, SB, Mantzourani, E & Lucas, C 2019, 'Evaluating machine learning approaches to classify pharmacy students’ reflective statements', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 20th International Conference, Artificial Intelligence in Education, Chicago, IL, USA, pp. 220-230.
© Springer Nature Switzerland AG 2019. Reflective writing is widely acknowledged to be one of the most effective learning activities for promoting students’ self-reflection and critical thinking. However, manually assessing and giving feedback on reflective writing is time consuming, and known to be challenging for educators. There is little work investigating the potential of automated analysis of reflective writing, and even less on machine learning approaches which offer potential advantages over rule-based approaches. This study reports progress in developing a machine learning approach for the binary classification of pharmacy students’ reflective statements about their work placements. Four common statistical classifiers were trained on a corpus of 301 statements, using emotional, cognitive and linguistic features from the Linguistic Inquiry and Word Count (LIWC) analysis, in combination with affective and rhetorical features from the Academic Writing Analytics (AWA) platform. The results showed that the Random-forest algorithm performed well (F-score = 0.799) and that AWA features, such as emotional and reflective rhetorical moves, improved performance.
Liu, L, Huang, Z, Peng, Y & Liu, M 2015, 'A hierarchical pachinko allocation model for social sentiment mining', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 299-311.
© Springer International Publishing Switzerland 2015. Existing topic models for mining sentiments from articles often ignore the fact that intra-topic correlations are common and useful for uncovering a large number of fine-grained and tightly coherent topics. This paper is concerned with the problem of social sentiment mining by modeling topic correlations. We aim not only to discover the connections between sentiments and topics, but also to reveal the deeper relationships among topics, where some topics may co-occur more frequently than others in articles. More specifically, we join sentiment mining with the hierarchical pachinko allocation model to represent topic correlations by a hierarchy. In our model, the hierarchical pachinko allocation is employed to generate the latent hierarchical topic variables and sentiment variables. Experimental results on a collected news corpus show that our model can effectively identify latent topics in a hierarchical structure, and outperforms competing sentiment-topic models, such as Latent Dirichlet Allocation based models, in sentiment prediction.
Sun, L, Liu, L, Wei, Y, Zhong, J, Luo, D, Liu, M & Monkaresi, H 2014, 'Apply autocorrelation and forward difference to measure vital signs using ordinary camera', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 150-159.
Measuring heart rate with portable equipment is becoming more and more popular. Current methods such as wavelets, the fast Fourier transform and peak detection have been used to analyze heart rate. However, in some cases these methods are ineffective. For example, wavelet denoising is unnecessary in some cases. One of the main challenges when using peak detection is determining an effective sliding window size for heart rate detection. In addition, the time complexity of the fast Fourier transform is high, which increases processing time and is undesirable for real-time heart rate detection systems. In this paper, we introduce autocorrelation and forward difference to count heart rate based on the features of the cardiac cycle. The results show that our method is good enough to be applied to non-invasive health state detection, and that its time complexity is satisfactory. © 2014 Springer International Publishing.
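One plausible reading of how autocorrelation and forward difference combine is sketched below in Python: the autocorrelation of a periodic signal peaks at a lag of one cardiac cycle, and the forward difference locates that peak as the point where the slope turns from positive to non-positive. The abstract does not give the authors' exact procedure, so the function names, the synthetic sine signal and the 30 frames-per-second rate are all assumptions made for illustration.

```python
import math


def autocorrelation(signal, max_lag):
    """Unnormalised autocorrelation of the mean-centred signal for lags 0..max_lag."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    return [
        sum(centred[i] * centred[i + lag] for i in range(len(centred) - lag))
        for lag in range(max_lag + 1)
    ]


def first_peak_lag(acf):
    """Return the first local maximum after lag 0, found with the forward difference.

    A peak sits at lag k when acf[k] - acf[k - 1] > 0 and acf[k + 1] - acf[k] <= 0.
    """
    diffs = [acf[k + 1] - acf[k] for k in range(len(acf) - 1)]
    for k in range(1, len(diffs)):
        if diffs[k - 1] > 0 and diffs[k] <= 0:
            return k
    return None


def estimate_bpm(signal, fps):
    """Estimate beats per minute from the lag of the first autocorrelation peak."""
    lag = first_peak_lag(autocorrelation(signal, len(signal) // 2))
    if lag is None:
        return None
    return 60.0 * fps / lag


# Synthetic stand-in for a camera-derived pulse signal: 1.2 Hz (72 bpm), 10 s at 30 fps.
fps = 30
signal = [math.sin(2 * math.pi * 1.2 * t / fps) for t in range(300)]
print(estimate_bpm(signal, fps))
```

This direct autocorrelation costs O(n * max_lag) operations, so on long recordings it would normally be applied to a short sliding window; real camera signals would also need detrending or smoothing before the peak search.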
Liu, M, Calvo, RA & Pardo, A 2013, 'Tracer: A tool to measure and visualize student engagement in writing activities', Proceedings - 2013 IEEE 13th International Conference on Advanced Learning Technologies, ICALT 2013, pp. 421-425.
Learning analytic techniques are allowing the observation of complex learning activities that were hidden until now. Writing is a task in which behavioral patterns can be observed to measure the level of engagement. Previous studies relied mostly on data collected by observers. This paper describes Tracer, a novel learning analytic system that visualizes students' behavioral patterns while writing and measures their engagement. The tool combines and analyzes the information obtained from document revisions and website logs while students work on a writing assignment, and provides visualizations and measurements of the level of engagement. A user study was conducted in a software engineering course where students wrote and submitted a project proposal using Google Docs. Tracer generated a graphical view of the gauged engagement and an engagement time for each student. The results show that the engagement time gauged by Tracer was moderately correlated with the time reported by the students. © 2013 IEEE.
Liu, M & Calvo, RA 2012, 'Using information extraction to generate trigger questions for academic writing support', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 358-367.
Automated question generation approaches have been proposed to support reading comprehension. However, these approaches are not suitable for supporting writing activities. We present a novel approach to generate different forms of trigger questions (directive and facilitative) aimed at supporting deep learning. Useful semantic information is extracted from Wikipedia articles and linked to the key phrases in a student's literature review, with a particular focus on extracting information containing three types of relations (Kind of, Similar-to and Different-to) by using syntactic pattern matching rules. We collected literature reviews from 23 Engineering research students, and evaluated the quality of 306 computer-generated questions and 115 generic questions. Facilitative questions are more useful when it comes to deep learning about the topic, while directive questions are clearer and useful for improving the composition. © 2012 Springer-Verlag.
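To give a feel for rule-based extraction of the three relation types, here is a minimal Python sketch. It swaps in surface-level regular expressions for the syntactic pattern matching rules mentioned in the abstract, which likely operate on parsed sentences; the patterns and the `extract_relation` function are hypothetical and cover only a few simple phrasings.

```python
import re

# Hypothetical surface patterns, one per relation type named in the abstract.
PATTERNS = [
    ("Kind of", re.compile(
        r"^(?P<x>.+?) is an? (?:kind|type|form) of (?P<y>.+?)\.?$", re.IGNORECASE)),
    ("Similar-to", re.compile(
        r"^(?P<x>.+?) is similar to (?P<y>.+?)\.?$", re.IGNORECASE)),
    ("Different-to", re.compile(
        r"^(?P<x>.+?) (?:is different from|differs from) (?P<y>.+?)\.?$", re.IGNORECASE)),
]


def extract_relation(sentence):
    """Return (concept, relation, concept) for the first matching pattern, else None."""
    for label, pattern in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            return (match.group("x"), label, match.group("y"))
    return None


print(extract_relation("Latent semantic analysis is a kind of vector space model."))
print(extract_relation("Supervised learning differs from unsupervised learning"))
```

An extracted triple could then feed a question template, for example asking the writer how the two concepts differ; that step is left out here because the abstract does not specify the templates.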
Liu, M, Calvo, RA & Rus, V 2012, 'Hybrid question generation approach for critical review writing support', Proceedings of the 20th International Conference on Computers in Education, ICCE 2012, pp. 109-111.
Research towards automated feedback can build on the work in other areas. In this paper we explore question generation techniques. Most research in question generation has focused on generating content specific questions that help students comprehend a set of documents that they must read. However, this approach is not so useful in writing activities, as students would generally understand the document that they themselves wrote. The aim of our project is to build a system which automatically generates feedback questions for academic writing support, particularly for critical review support. This paper presents our question generation system which relies on both syntax-based and template-based approaches, and uses Wikipedia as background knowledge.
Liu, L, Fan, D, Liu, M, Xu, G, Chen, S, Zhou, Y, Wang, Q, Wei, Y & Chen, X 2012, 'A MapReduce-Based Parallel Clustering Algorithm for Large Protein-Protein Interaction Networks', Lecture Notes in Computer Science, International Conference on Advanced Data Mining and Applications, Springer, Nanjing, China, pp. 138-148.
Clustering proteins or identifying functionally related proteins in Protein-Protein Interaction (PPI) networks is one of the most computation-intensive problems in the proteomic community. Most research has focused on improving the accuracy of the clustering algorithms. However, the high computation cost of these clustering algorithms, such as Girvan and Newman's clustering algorithm, has been an obstacle to their use on large-scale PPI networks. In this paper, we propose an algorithm, called Clustering-MR, to address the problem. Our solution can effectively parallelize Girvan and Newman's clustering algorithm, which is based on edge betweenness, using MapReduce. We evaluated the performance of our Clustering-MR algorithm in a cloud environment with different sizes of testing datasets and different numbers of worker nodes. The experimental results show that our Clustering-MR algorithm can achieve high performance for large-scale PPI networks with more than 1000 proteins or 5000 interactions.
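The edge-betweenness computation at the heart of Girvan and Newman's algorithm decomposes naturally into a map step per source node and a reduce step per edge, which is what makes it a candidate for MapReduce. The single-process Python sketch below illustrates that decomposition using Brandes-style accumulation on an unweighted graph; it is not the Clustering-MR implementation, and the function names and the toy graph are assumptions.

```python
from collections import defaultdict, deque


def map_source(graph, source):
    """Map step: one BFS from `source`, emitting (edge, credit) pairs."""
    dist = {source: 0}
    sigma = defaultdict(float)  # number of shortest paths from source to each node
    sigma[source] = 1.0
    preds = defaultdict(list)   # predecessors on shortest paths
    order = []                  # nodes in BFS visiting order
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = defaultdict(float)  # dependency accumulated from farther nodes
    for w in reversed(order):
        for v in preds[w]:
            credit = sigma[v] / sigma[w] * (1.0 + delta[w])
            delta[v] += credit
            yield (tuple(sorted((v, w))), credit)


def reduce_credits(pairs):
    """Reduce step: sum the credits per edge; halve because each path is seen from both ends."""
    totals = defaultdict(float)
    for edge, credit in pairs:
        totals[edge] += credit
    return {edge: total / 2.0 for edge, total in totals.items()}


def edge_betweenness(graph):
    pairs = (pair for source in graph for pair in map_source(graph, source))
    return reduce_credits(pairs)


# Toy undirected graph: two triangles joined by the bridge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = edge_betweenness(graph)
print(max(scores, key=scores.get), scores[("c", "d")])
```

Girvan and Newman's method would remove the highest-scoring edge (here the bridge) and repeat the computation on the remaining graph. In a distributed setting each call to `map_source` could run on a different worker, with the graph shared among them.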
Liu, L, Zhou, Y, Liu, M, Xu, G, Chen, X, Fan, D & Wang, Q 2012, 'Preemptive Hadoop Jobs Scheduling under a Deadline', Proceedings of Eighth International Conference on Semantics, Knowledge and Grids, International Conference on Semantics Knowledge and Grid, IEEE Computer Society, Beijing, China, pp. 72-79.
Liu, M & Calvo, RA 2011, 'Question taxonomy and implications for automatic question generation', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 504-506.
Many Automatic Question Generation (AQG) approaches have been proposed focusing on reading comprehension support; however, none of them has addressed academic writing. We conducted a large-scale case study with 25 supervisors and 36 research students enrolled in an Engineering Research Method course. We investigated trigger questions, as a form of feedback, produced by supervisors, and how they support these students' literature review writing. In this paper, we identify the most frequent question types according to Graesser and Person's Question Taxonomy and discuss how human experts generate such questions from the source text. Finally, we propose a more practical Automatic Question Generation Framework for supporting academic writing in engineering education. © 2011 Springer-Verlag Berlin Heidelberg.
Liu, M, Calvo, RA & Rus, V 2010, 'Automatic question generation for literature review writing support', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 45-54.
This paper presents a novel Automatic Question Generation (AQG) approach that generates trigger questions as a form of support for students' learning through writing. The approach first automatically extracts citations from students' compositions together with key content elements. Next, the citations are classified using a rule-based approach, and questions are generated based on a set of templates and the content elements. A pilot study using the Bystander Turing Test investigated differences in writers' perception between questions generated by our AQG system and by humans (Human Tutor, Lecturer, or Generic Question). We found that the human evaluators had moderate difficulty distinguishing questions generated by the proposed system from those produced by humans (F-score=0.43). Moreover, further results show that our system significantly outscores Generic Question on overall quality measures. © Springer-Verlag Berlin Heidelberg 2010.