Raza, M, Hussain, FK, Hussain, OK, Zhao, M & Rehman, ZU 2019, 'A comparative analysis of machine learning models for quality pillar assessment of SaaS services by multi-class text classification of users' reviews', Future Generation Computer Systems, vol. 101, pp. 341-371.
© 2019 Elsevier B.V. Software as a Service (SaaS) has emerged as the most widely used of all the current software delivery models. With the growth of edge computing, as SaaS services increasingly become distributed, selecting the best SaaS provider from those available is challenging but critically important. In the recent past, well-known cloud service providers such as Amazon Web Services and Microsoft have developed frameworks and service quality pillars for cloud applications. However, there are currently no mechanisms for product users to know whether, and to what extent, a service satisfies a defined service pillar. Having such information would enable users to form trustworthy associations in edge computing. In this paper, we address this drawback by systematically analysing customer reviews of SaaS products and ascertaining which service quality pillar each review refers to. We use eleven traditional machine learning classification approaches and a weighted voting ensemble of these classifiers to achieve this task, and we test the performance of each of them. Since the dataset is unbalanced in terms of sample distribution per class, we use 10-fold cross-validation on the training dataset to determine the best parameters for each machine learning algorithm to achieve optimal performance. The Friedman test and Nemenyi's post hoc test are applied to identify significant differences in the classifiers' performance during cross-validation. Based on the experimental results, a comparative analysis is conducted to identify the best performing machine learning classification model on the SaaS reviews. The results show that the logistic regression model performs best among the individual classifiers and that the weighted voting ensemble yields only a minimal improvement in overall performance.
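The weighted voting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_vote`, the classifier names, the pillar labels, and the weights are all hypothetical, and the abstract does not say how the paper derives its weights (a per-classifier cross-validation score would be one plausible choice).

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single label.

    predictions: dict mapping classifier name -> predicted label
    weights:     dict mapping classifier name -> non-negative weight
                 (assumed here to reflect each classifier's reliability)
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        # Each classifier adds its weight to the label it voted for.
        scores[label] += weights[name]
    # Iterating labels in sorted order makes tie-breaking deterministic:
    # among equally scored labels, the alphabetically first one wins.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical predictions for a single review (labels are illustrative only).
predictions = {"logreg": "security", "svm": "reliability", "nb": "reliability"}

# With a dominant weight on one classifier, its vote outweighs the other two.
skewed = {"logreg": 0.9, "svm": 0.4, "nb": 0.4}
print(weighted_vote(predictions, skewed))

# With equal weights, the scheme reduces to plain majority voting.
equal = {"logreg": 1.0, "svm": 1.0, "nb": 1.0}
print(weighted_vote(predictions, equal))
```

With equal weights the scheme is ordinary majority voting, so any gain from an ensemble of this kind depends on how much the individual classifiers disagree and on how the weights are chosen.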