Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong (School of Mathematics and Computer Science, Hezhou University)
  • Received : 2018.12.06
  • Accepted : 2019.04.22
  • Published : 2020.04.30

Abstract

In recent years, sentiment classification of text has become one of the essential research topics in natural language processing. It is widely used for sentiment analysis of reviews of commodities and services such as hotels, and of other comment corpora. This paper proposes an improved weighted latent Dirichlet allocation (W-LDA) topic model to address the shortcomings of the traditional LDA topic model. During Gibbs sampling of the topic of each word, and in computing the expected topic-word distribution, W-LDA applies an average weighted value so that topic-related words are not submerged by high-frequency words, which improves the distinctiveness of the topics. The extracted high-quality document-topic distributions and topic-word vectors are then combined with a support vector machine (SVM) classification algorithm. The result is an efficient integrated method for extracting and analyzing emotional words, computing topic distributions, and classifying sentiment. Tests on real teaching evaluation data and on a test set drawn from a public comment set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.
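The weighted sampling idea described in the abstract can be illustrated with a minimal sketch of a collapsed Gibbs sampler in which every count update is scaled by a per-word weight. The abstract does not give the paper's weighting formula, so `word_weights` below is a hypothetical stand-in (words more frequent than average receive a weight below 1), and the function names, hyperparameters `alpha` and `beta`, and iteration count are likewise assumptions made for illustration only.

```python
import random
from collections import Counter


def word_weights(docs):
    """Hypothetical weighting (NOT the paper's formula): words that occur
    more often than the average word get a weight below 1."""
    counts = Counter(w for doc in docs for w in doc)
    mean = sum(counts.values()) / len(counts)
    return {w: min(1.0, mean / c) for w, c in counts.items()}


def weighted_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA with weighted count updates.
    Returns (theta, phi): document-topic and topic-word distributions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    weight = word_weights(docs)
    n_vocab = len(vocab)

    # Weighted counts replace the integer counts of standard LDA.
    doc_topic = [[0.0] * n_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0.0) for _ in range(n_topics)]
    topic_total = [0.0] * n_topics
    assign = []

    def update(d, w, z, sign):
        g = sign * weight[w]
        doc_topic[d][z] += g
        topic_word[z][w] += g
        topic_total[z] += g

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(n_topics) for _ in doc]
        for w, z in zip(doc, zs):
            update(d, w, z, +1)
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # remove current token
                probs = [
                    max(doc_topic[d][k] + alpha, 1e-12)
                    * max(topic_word[k][w] + beta, 1e-12)
                    / max(topic_total[k] + n_vocab * beta, 1e-12)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=probs)[0]
                assign[d][i] = z
                update(d, w, z, +1)  # add it back under the new topic

    # Expected distributions computed from the weighted counts.
    theta = [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in doc_topic
    ]
    phi = [
        {w: (tw[w] + beta) / (topic_total[k] + n_vocab * beta) for w in vocab}
        for k, tw in enumerate(topic_word)
    ]
    return theta, phi
```

Each row of `theta` is a document-topic vector that could serve as the feature vector for a downstream classifier such as an SVM; that classification step is not shown here.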
