Sentiment Analysis and Data Visualization of U.S. Public Companies' Disclosures using BERT

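  • 김효곤 (Department of Management of Technology, Gyeongsang National University) ;
  • 유동희 (Department of Management Information Systems and Research Institute of Business and Economics, Gyeongsang National University)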
  • Received : 2022.04.20
  • Accepted : 2022.08.02
  • Published : 2022.09.30

Abstract

Purpose: This study quantifies companies' views on the COVID-19 pandemic through sentiment analysis of U.S. public companies' disclosures. It aims to provide timely insights to shareholders, investors, and consumers by analyzing and visualizing how sentiment changed over time and how it was similar or different across industries.

Design/methodology/approach: From more than fifty thousand Form 10-K and Form 10-Q filings published between 2020 and 2021, we extracted over one million texts related to the COVID-19 pandemic. We conducted sentiment analysis of these texts using FinBERT, a language model fine-tuned for the finance domain, and classified each text as positive, negative, or neutral. We then presented the results with several visualization techniques to make the information easy to understand.

Findings: The results indicate that the overall sentiment of U.S. public companies changed as the COVID-19 pandemic progressed. Positive sentiment gradually increased and negative sentiment tended to decrease over time, while neutral sentiment showed no trend. In the comparison by industry, the amounts of positive and negative sentiment and their changes over time followed similar patterns in all industries, whereas neutral sentiment differed among industries.
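The time-series view described in the abstract can be illustrated with a minimal sketch. It assumes each extracted text already carries a filing date and one of the three labels produced by the classifier. The function names, the quarterly grouping, and the sample records are hypothetical and chosen only for illustration; they are not the authors' implementation or data.

```python
from collections import Counter, defaultdict
from datetime import date

# The three classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def quarter_key(d: date) -> str:
    """Map a filing date to a label such as '2020Q2' (hypothetical grouping)."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


def sentiment_shares(records):
    """Return the share of each label per quarter.

    records: iterable of (filing_date, label) pairs, where label is assumed
    to come from a sentiment classifier such as FinBERT.
    """
    counts = defaultdict(Counter)
    for filing_date, label in records:
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        counts[quarter_key(filing_date)][label] += 1

    shares = {}
    for quarter in sorted(counts):
        total = sum(counts[quarter].values())
        shares[quarter] = {lab: counts[quarter][lab] / total for lab in LABELS}
    return shares


# Made-up records for illustration only, not data from the study.
sample = [
    (date(2020, 5, 1), "negative"),
    (date(2020, 5, 8), "negative"),
    (date(2020, 6, 2), "neutral"),
    (date(2020, 6, 30), "positive"),
    (date(2021, 11, 3), "positive"),
    (date(2021, 11, 9), "neutral"),
]

shares = sentiment_shares(sample)
for quarter, dist in shares.items():
    print(quarter, dist)
```

Shares are used here rather than raw counts so that quarters with different numbers of filings stay comparable; grouping the same records by industry instead of quarter would give the cross-industry comparison.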
