Predicting numeric ratings for Google apps using text features and ensemble learning |
Umer, Muhammad (Department of Computer Science, Khawaja Freed University)
Ashraf, Imran (Department of Information and Communication Engineering, Yeungnam University)
Mehmood, Arif (Department of Computer Science and Information Technology, The Islamia University of Bahawalpur)
Ullah, Saleem (Department of Computer Science, Khawaja Freed University)
Choi, Gyu Sang (Department of Information and Communication Engineering, Yeungnam University) |
1 | Statista, Number of available applications in the Google Play store from December 2009 to March 2019, https://www.statista.com/statistics/266210/number-of-available-applications-in-the-google-play-store/, Online: accessed 22 May 2019. |
2 | Statista, Number of mobile app downloads worldwide in 2017, 2018 and 2020 (in billions), https://www.statista.com/statistics/271644/worldwide-free-and-paid-mobile-app-store-downloads/, Online: accessed 22 May 2019. |
3 | J. Horrigan, Online shopping, Pew Internet and American Life Project, Washington, DC, 2018, http://www.pewinternet.org/Reports/2008/Online-Shopping/01-Summary-of-Findings.aspx, Online: accessed 8 Aug. 2014. |
4 | D. Pagano and W. Maalej, User feedback in the appstore: An empirical study, in Proc. IEEE Int. Requirements Eng. Conf. (Rio de Janeiro, Brazil), July 2013, pp. 125-134. |
5 | T. Chumwatana, Using sentiment analysis technique for analyzing Thai customer satisfaction from social media, 2015. |
6 | T. Thiviya et al., Mobile apps' feature extraction based on user reviews using machine learning, 2019. |
7 | H. Hanyang et al., Studying the consistency of star ratings and reviews of popular free hybrid Android and iOS apps, Empirical Softw. Eng. 24 (2019), no. 7, 7-32. |
8 | N. Kumari and S. Narayan Singh, Sentiment analysis on e-commerce application by using opinion mining, in Proc. Int. Conf.-Cloud Syst. Big Data Eng. (Noida, India), Jan. 2016, pp. 320-325. |
9 | R. M. Duwairi and I. Qarqaz, Arabic sentiment analysis using supervised classification, in Proc. Int. Conf. Future Internet Things Cloud (Barcelona, Spain), Aug. 2014, pp. 579-583. |
10 | H. S. Le, T. V. Le, and T. V. Pham, Aspect analysis for opinion mining of Vietnamese text, in Proc. Int. Conf. Adv. Comput. Applicat. (Ho Chi Minh, Vietnam), Nov. 2015, pp. 118-123. |
11 | H. Wang, L. Yue, and C. Zhai, Latent aspect rating analysis on review text data: A rating regression approach, in Proc. ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining (Washington, D.C., USA), July 2010, pp. 783-792. |
12 | K. Dave, S. Lawrence, and D. M. Pennock, Mining the peanut gallery: Opinion extraction and semantic classification of product reviews, in Proc. Int. Conf. World Wide Web (New York, USA), 2003, pp. 519-528. |
13 | A. Buche, D. Chandak, and A. Zadgaonkar, Opinion mining and analysis: A survey, arXiv preprint arXiv:1307.3336, 2013. |
14 | B. Pang, L. Lee, and S. Vaithyanathan, Thumbs up?: Sentiment classification using machine learning techniques, in Proc. ACL-02 Conf. Empirical Methods Natural Language Process. (Stroudsburg, PA, USA), 2002, pp. 79-86. |
15 | C. Cardie et al., Combining low-level and summary representations of opinions for multi-perspective question answering, New directions in question answering, 2003, pp. 20-27. |
16 | H. Takamura, T. Inui, and M. Okumura, Extracting semantic orientations of words using spin model, in Proc. Annu. Meeting Association Comput. Linguistics (Ann Arbor, MI, USA), 2005, pp. 133-140. |
17 | M. Suleman, A. Malik, and S. S. Hussain, Google play store app ranking prediction using machine learning algorithm, Urdu News Headline, Text Classification by Using Different Machine Learning Algorithms, 2019. |
18 | F. Sarro et al., Customer rating reactions can be predicted purely using app features, in Proc. IEEE Int. Requirements Eng. Conf. (Banff, Canada), Aug. 2018, pp. 76-87. |
19 | S. Aslam and I. Ashraf, Data mining algorithms and their applications in education data mining, Int. J. Adv. Res. Computer Sci. Manag. Studies 2 (2014), no. 7, 50-56. |
20 | D. Martens and T. Johann, On the emotion of users in app reviews, in Proc. IEEE/ACM Int. Workshop Emotion Awareness Softw. Eng. (Buenos Aires, Argentina), May 2017, pp. 8-14. |
21 | G. Hackeling, Mastering machine learning with scikit-learn, Packt Publishing Ltd, 2017. |
22 | Scikit learn, Scikit-learn classification and regression models, http://scikitlearn.org/stable/supervised_learning.html#supervised-learning/, Online: accessed 10 Apr. 2019 |
23 | Z. Hailong, G. Wenyan, and J. Bo, Machine learning and lexicon based methods for sentiment classification: A survey, in Proc. Web Inf. Syst. Applicat. Conf. (Tianjin, China), Sept. 2014, pp. 262-265. |
24 | O. Araque et al., Enhancing deep learning sentiment analysis with ensemble techniques in social applications, Expert Syst. Appl. 77 (2017), 236-246. |
25 | J. Hartmann et al., Comparing automated text classification methods, Int. J. Res. Mark. 36 (2019), 20-38. |
26 | O. Aziz et al., A comparison of accuracy of fall detection algorithms (threshold-based vs. machine learning) using waist-mounted tri-axial accelerometer signals from a comprehensive set of falls and non-fall trials, Med. Biol. Eng. Comput. 55 (2017), no. 1, 45-55. |
27 | L. Breiman, Random forests, Mach. Learn. 45 (2001), no. 1, 5-32. |
28 | R. E. Schapire and Y. Singer, Improved boosting algorithms using confidence-rated predictions, Mach. Learn. 37 (1999), no. 3, 297-336. |
29 | A. Natekin and A. Knoll, Gradient boosting machines, a tutorial, Frontiers Neurorobotics 7 (2013), 21. |
30 | T. Chen and C. Guestrin, XGBoost: A scalable tree boosting system, in Proc. ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining (San Francisco, CA, USA), Aug. 2016, pp. 785-794. |
31 | P. Geurts, D. Ernst, and L. Wehenkel, Extremely randomized trees, Mach. Learn. 63 (2006), no. 1, 3-42. |
32 | R. Feldman and J. Sanger, The text mining handbook: Advanced approaches in analyzing unstructured data, Cambridge University Press, 2007. |
33 | B. Sriram et al., Short text classification in Twitter to improve information filtering, in Proc. Int. ACM SIGIR Conf. Res. Development Inf. Retrieval (Geneva, Switzerland), July 2010, pp. 841-842. |
34 | I. Ashraf, S. Hur, and Y. Park, Blocate: A building identification scheme in GPS denied environments using smartphone sensors, Sensors 18 (2018), no. 11, 3862. |
35 | Scikit learn, Scikit-learn feature extraction with countvectorizer, https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.Count/, Online: accessed 5 Apr. 2019 |
36 | Scikit learn, Scikit-learn feature extraction with tf/idf, https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.Tfidf/, Online: accessed 5 Apr. 2019 |
37 | J. Han, J. Pei, and M. Kamber, Data mining: Concepts and techniques, Elsevier, 2011. |
38 | S. Loria, textblob documentation, Release 0.15 2 (2018). |
39 | P. Geurts and G. Louppe, Learning to rank with extremely randomized trees, JMLR: Workshop Conf. Proc. 14 (2011), 49-61. |
40 | X. Z. Fern and C. E. Brodley, Boosting lazy decision trees, in Proc. Int. Conf. Mach. Learn., 2003, pp. 178-185. |
41 | L. Breiman, Randomizing outputs to increase prediction accuracy, Mach. Learn. 40 (2000), no. 3, 229-242. |