http://dx.doi.org/10.6109/jicce.2019.17.1.41

An Improved Text Classification Method for Sentiment Classification  

Wang, Guangxing (Department of Information Technology Center, Jiujiang University)
Shin, Seong Yoon (School of Computer Information & Communication Engineering, Kunsan National University)
Abstract
Sentiment analysis has become a popular research topic in recent years, and its results have proved valuable in practical applications such as Amazon's book recommendation system and the North American movie box office evaluation system. Analyzing big data on user preferences and evaluations, and then recommending best-selling books and highly rated movies to users in a targeted manner, greatly improves book sales and movie attendance [1, 2]. However, traditional machine learning-based sentiment analysis methods such as the Classification and Regression Tree (CART), Support Vector Machine (SVM), and k-nearest neighbor (kNN) classification have shown poor accuracy. In this paper, an improved kNN classification method is proposed; accuracy is improved through the modified method together with data normalization. The three classification algorithms and the improved algorithm are then compared on experimental data. Experiments show that the improved method performs best in the kNN classification method, with an accuracy rate of 11.5% and a precision rate of 20.3%.
Keywords
Sentiment Analysis; Machine Learning; Text Classification; k-Nearest Neighbor Method
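The abstract credits part of the accuracy gain to data normalization but does not spell out the improved kNN algorithm itself. The following is therefore only a minimal sketch of plain kNN with min-max normalization, not the authors' method. The two features (a count of positive words and a document length), the toy data, and the function names `min_max_normalize` and `knn_predict` are all assumptions made for illustration.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0.

    Returns the scaled rows and a function that applies the same
    scaling to a new row (for example, a query document).
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]

    def scale(row):
        return [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]

    return [scale(r) for r in rows], scale


def knn_predict(train_x, train_y, query, k=3):
    """Plain kNN: majority vote among the k nearest points (Euclidean)."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [positive-word count, document length in words].
train_x = [[8, 120], [7, 300], [1, 150], [0, 280]]
train_y = ["pos", "pos", "neg", "neg"]
query = [6, 290]

# Without normalization, document length dominates the distance.
raw_pred = knn_predict(train_x, train_y, query, k=3)

# With normalization, both features contribute on the same scale.
scaled_x, scale = min_max_normalize(train_x)
norm_pred = knn_predict(scaled_x, train_y, scale(query), k=3)

print(raw_pred, norm_pred)
```

On this toy data the two predictions differ, because the unscaled document length swamps the word-count feature. That is the general motivation for normalizing before a distance-based classifier; it says nothing about the size of the effect reported in the paper.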
References
1 H. Yigit, "A weighting approach for KNN classifier," in Proceeding of 2013 International Conference on Electronics, Computer and Computation (ICECCO), pp. 228-231, 2013. DOI: 10.1109/ICECCO.2013.6718270.
2 B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proceeding of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, pp. 79-86, 2002.
3 P. D. Turney and M. L. Littman, "Measuring praise and criticism: Inference of semantic orientation from association," ACM Transactions on Information Systems, vol. 21, no. 4, pp. 315-346, 2003.
4 S. Taneja, C. Gupta, S. Aggarwal, and V. Jindal, "MFZ-KNN-A modified fuzzy based K nearest neighbor algorithm," in Proceeding of 2015 International Conference on Cognitive Computing and Information Processing (CCIP), pp. 1-5, 2015. DOI: 10.1109/CCIP.2015.7100689.
5 J. Huang, Y. Wei, J. Yi, and M. Liu, "An improved kNN based on class contribution and feature weighting," in Proceeding of 2018 10th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), pp. 313-316, 2018. DOI: 10.1109/ICMTMA.2018.00083.
6 S. Tan, Y. Li, H. Sun, Z. Guan, and X. Yan, "Interpreting the Public Sentiment Variations on Twitter," IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 5, pp. 1158-1170, 2014. DOI: 10.1109/TKDE.2013.116.
7 B. Smith and G. Linden, "Two decades of recommender systems at Amazon.com," IEEE Internet Computing, vol. 21, no. 3, pp. 12-18, 2017. DOI: 10.1109/MIC.2017.72.
8 S. Halder, Md. Samiullah, A. M. Jehad Sarkar, and Y.-K. Lee, "Movie swarm: Information mining technique for movie recommendation system," in Proceeding of 2012 7th International Conference on Electrical and Computer Engineering, pp. 462-465, 2013. DOI: 10.1109/ICECE.2012.6471587.
9 P. Chen and X. Fu, "Research on sentiment classification of texts based on SVM," Journal of Guangdong University of Technology, vol. 31, no. 3, pp. 95-101, 2014. DOI: 10.3969/j.issn.1007-7162.2014.03.017.
10 N. Arunachalam, S. J. Sneka, and G. MadhuMathi, "A Survey on text classification techniques for sentiment polarity detection," in Proceeding of 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), pp. 1-5, 2017. DOI: 10.1109/IPACT.2017.8245127.
11 J. M. Desai and S. R. Andhariya, "Sentiment analysis approach to adapt a shallow parsing based sentiment lexicon," in Proceeding of 2015 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), pp. 1-4, 2015. DOI: 10.1109/ICIIECS.2015.7193160.
12 Q. Li, S. Shah, R. Fang, A. Nourbakhsh, and X. Liu, "Tweet sentiment analysis by incorporating sentiment-specific word embedding and weighted text features," in Proceeding of 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pp. 568-571, 2016. DOI: 10.1109/WI.2016.0097.
13 R. Izmailov, V. Vapnik, and A. Vashist, "Multidimensional splines with infinite number of knots as SVM kernels," in Proceeding of the 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1-7, 2013. DOI: 10.1109/IJCNN.2013.6706860.
14 THUCNews DataSet, [Online] Available: http://thuctc.thunlp.org/.
15 C. Yu, "Adaptive Japanese teaching optimization based on classification and regression tree," in Proceeding of 2017 International Conference on Robots & Intelligent System (ICRIS), pp. 15-18, 2017. DOI: 10.1109/ICRIS.2017.12.
16 R. Li, X. Zhao, X. Yu, J. Li, N. Cheng, and J. Zhang, "Incident duration model on urban freeways using three different algorithms of decision tree," in Proceeding of 2010 International Conference on Intelligent Computation Technology and Automation, pp. 526-528, 2010. DOI: 10.1109/ICICTA.2010.602.
17 L. Zhou, L. Wang, X. Ge, and Q. Shi, "A clustering-Based KNN improved algorithm CLKNN for text classification," in Proceeding of 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010), pp. 212-215, 2010. DOI: 10.1109/CAR.2010.5456668.