Detecting Fake News about COVID-19 Infodemic Using Deep Learning and Content Analysis

  • Olga Chernyaeva (College of Business Administration, Pusan National University)
  • Taeho Hong (College of Business Administration, Pusan National University)
  • YongHee Kim (College of Business Administration, Pusan National University)
  • YoungKi Park (School of Business, George Washington University)
  • Gang Ren (School of Business, Hefei University of Technology)
  • Jisoo Ock (College of Business Administration, Pusan National University)
  • Received : 2022.09.15
  • Accepted : 2022.12.08
  • Published : 2022.12.31

Abstract

With the widespread use of social media, online social platforms such as Twitter have become a place where information, both accurate and inaccurate, spreads rapidly. Since the COVID-19 outbreak, an overabundance of fake information and rumours about the pandemic has spread through society via these platforms as quickly as the virus itself. As a result, fake news poses a significant threat to an effective response to the virus by reducing people's willingness to follow proper public health guidelines and protocols, which makes identifying fake information on online platforms a matter of public interest. In this research, we introduce an approach to detecting fake news using deep learning techniques, which achieve 93.1% accuracy and outperform traditional machine learning techniques. We then investigate the content differences between real and fake news by applying topic modeling and linguistic analysis. Our results show that topics on Politics and Government services are the most common in fake news. In addition, we find that fake news has lower analytic and authenticity scores than real news. Based on these findings, we discuss important academic and practical implications of the study.

Acknowledgement

This research was supported by the 2021 BK21 FOUR Program of Pusan National University.
