Development of a Deep Learning Model for Detecting Fake Reviews Using Author Linguistic Features

  • Received: 2022.07.29
  • Accepted: 2022.11.11
  • Published: 2022.12.31

Abstract

Purpose: This study proposes a deep learning-based fake review detection model that combines authors' linguistic features with the semantic information of reviews.

Design/methodology/approach: We used 358,071 Yelp reviews to develop the fake review detection model. We employed Linguistic Inquiry and Word Count (LIWC) to extract 24 linguistic features of authors, then used deep learning architectures such as the multilayer perceptron (MLP), long short-term memory (LSTM), and the transformer to learn the linguistic and semantic features for fake review detection.

Findings: The detection models that used both linguistic and semantic features outperformed models that used a single type of feature. In addition, the study confirmed that the differences in linguistic features between fake reviewers and authentic reviewers are significant. That is, linguistic features complement the semantic information of reviews and further enhance the predictive power of the fake review detection model.
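
To make the idea of combining author-level linguistic features with review-level semantic features concrete, the following is a minimal sketch in Python. It is not the authors' implementation. The abstract does not say how the two feature types are fused, so plain concatenation is an assumption here. The three-category `TOY_LEXICON` is a hypothetical stand-in for the proprietary LIWC dictionary (the study extracts 24 features), and the hashed bag-of-words in `semantic_features` is a stand-in for a learned MLP, LSTM, or transformer encoder. All function names are illustrative.

```python
import re
import zlib

# Hypothetical mini-lexicon. The real LIWC dictionary is proprietary and
# the study uses 24 features; three toy categories are enough to illustrate.
TOY_LEXICON = {
    "first_person": {"i", "me", "my", "we", "our"},
    "positive_emotion": {"great", "love", "amazing", "good"},
    "negative_emotion": {"bad", "terrible", "awful", "hate"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def linguistic_features(author_reviews):
    """Share of an author's tokens in each lexicon category,
    pooled over all of that author's reviews."""
    tokens = [t for review in author_reviews for t in tokenize(review)]
    total = len(tokens) or 1  # avoid division by zero for empty input
    return [
        sum(t in words for t in tokens) / total
        for words in TOY_LEXICON.values()
    ]


def semantic_features(review, dim=8):
    """Stand-in for a learned text encoder: a normalized hashed
    bag-of-words vector of length `dim`. crc32 keeps it deterministic."""
    tokens = tokenize(review)
    vec = [0.0] * dim
    for t in tokens:
        vec[zlib.crc32(t.encode("utf-8")) % dim] += 1.0
    total = len(tokens) or 1
    return [v / total for v in vec]


def fuse(review, author_reviews, dim=8):
    """Assumed fusion step: concatenate the semantic vector of one review
    with the linguistic vector of its author."""
    return semantic_features(review, dim) + linguistic_features(author_reviews)


if __name__ == "__main__":
    history = ["I love my food", "We hate it"]
    print(fuse("Great food", history))
```

In a full model, the fused vector would feed a classifier that outputs a fake-versus-authentic prediction. Concatenation is only one option; the two inputs could also be encoded by separate branches and merged at a later layer.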
