Multi-Topic Sentiment Analysis using LDA for Online Review

Multi-Topic Sentiment Analysis of Online Reviews Using LDA: Focusing on the Case of TripAdvisor

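  • 홍태호 (Department of Business Administration, Pusan National University) ;
  • 니우한잉 (Department of Business Administration, Pusan National University) ;
  • 임강 (Department of Business Administration, Pusan National University) ;
  • 박지영 (BK21 Plus Project Team, Graduate School of Business IT, Kookmin University)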
  • Received : 2018.03.06
  • Accepted : 2018.03.23
  • Published : 2018.03.31

Abstract

Purpose Customer reviews contain a great deal of information, but finding the key information in a large volume of text is not easy, and business decision makers need a model that addresses this problem. In this study, we propose a multi-topic sentiment analysis approach that applies Latent Dirichlet Allocation (LDA) to user-generated content (UGC).

Design/methodology/approach We collected a total of 104,039 hotel reviews covering seven of the world's top tourist destinations from TripAdvisor (www.tripadvisor.com) and used an LDA model to extract 30 hotel-related topics from all of the customer reviews. From these 30 topics we selected six major dimensions: value, cleanliness, rooms, service, location, and sleep quality. The analysis was carried out in R.

Findings This study contributes a lexicon-based sentiment analysis approach that is applied to the sentences within a review that contain keywords related to the six dimensions. We evaluated the proposed model by comparing the sentiment analysis result for each topic with the actual attribute ratings provided by the platform. The results show that the model performs well, with high accuracy and recall. The proposed model is expected to make it possible to analyze customers' sentiments on different topics for reviews that lack detailed attribute ratings.
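
The lexicon-based step described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity rather than the R used in the study, and it is not the authors' implementation. The keyword sets, the toy sentiment lexicon, and the function names (`aspect_sentiments`, `accuracy_and_recall`) are all hypothetical; in the study the keywords would come from the LDA topics, and only three of the six dimensions are shown here.

```python
import re
from collections import defaultdict

# Hypothetical keyword sets for three of the six dimensions (assumption:
# in the study these would be derived from the extracted LDA topics).
ASPECT_KEYWORDS = {
    "cleanliness": {"clean", "dirty", "spotless", "dusty"},
    "location": {"location", "walk", "subway", "downtown"},
    "service": {"staff", "service", "reception"},
}

# Toy sentiment lexicon (assumption: a real lexicon would be far larger).
POSITIVE = {"clean", "spotless", "great", "friendly", "helpful", "good"}
NEGATIVE = {"dirty", "dusty", "rude", "slow", "bad", "noisy"}


def split_sentences(review):
    """Split a review into sentences on ., ! and ? (a crude heuristic)."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def aspect_sentiments(review):
    """Label each aspect mentioned in the review as positive, negative, or neutral.

    A sentence contributes to an aspect only if it contains one of that
    aspect's keywords; its polarity is (#positive words - #negative words).
    """
    scores = defaultdict(int)
    for sentence in split_sentences(review):
        tokens = tokenize(sentence)
        polarity = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if keywords.intersection(tokens):
                scores[aspect] += polarity
    return {
        aspect: "positive" if s > 0 else "negative" if s < 0 else "neutral"
        for aspect, s in scores.items()
    }


def accuracy_and_recall(predicted, actual, target="positive"):
    """Compare predicted labels with reference labels for one aspect."""
    pairs = list(zip(predicted, actual))
    correct = sum(p == a for p, a in pairs)
    true_pos = sum(p == target and a == target for p, a in pairs)
    actual_pos = sum(a == target for _, a in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


review = "The room was spotless and clean. The staff were rude and slow. Great location near the subway."
labels = aspect_sentiments(review)
```

For the evaluation, the reference labels would have to be derived from the platform's attribute ratings. How a numeric rating maps to a positive or negative label is not stated in the abstract, so any threshold used with `accuracy_and_recall` would be a further assumption.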
