Developing the Automated Sentiment Learning Algorithm to Build the Korean Sentiment Lexicon for Finance

재무분야 감성사전 구축을 위한 자동화된 감성학습 알고리즘 개발

  • Su-Ji Cho (School of Business Administration, Dankook University)
  • Ki-Kwang Lee (School of Business Administration, Dankook University)
  • Cheol-Won Yang (School of Business Administration, Dankook University)
  • Received : 2022.12.15
  • Accepted : 2023.01.10
  • Published : 2023.03.31

Abstract

With recent advances in big data analysis technology, many studies in finance have sought to extract sentiment from text and verify its information content. A number of prior studies use pre-defined sentiment dictionaries or machine learning methods to extract sentiment from financial documents. However, both approaches are labor-intensive and subjective because they require a manual sentiment learning process. In this study, we develop a financial sentiment dictionary that automatically learns sentiment from the body text of analyst reports using a modified Bayes rule, and we verify its performance with a binary classification model that predicts actual stock price movements. The results show that the proposed dictionary has about 4% better predictive power for actual stock price movements than the representative financial dictionary of Loughran and McDonald (2011). The proposed sentiment extraction method enables efficient and objective judgment because it automatically learns the sentiment of words using both the change in target price and the cumulative abnormal returns. In addition, the dictionary can be easily updated by recalculating the conditional probabilities. The approach is expected to extend readily beyond analyst reports to other financial texts such as performance reports, IR reports, press articles, and social media.
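The idea of learning word sentiment from conditional probabilities can be illustrated with a minimal sketch. This is not the modified Bayes rule of the paper, whose details are not given in the abstract; it applies plain Bayes rule with add-one smoothing. The function name `learn_word_sentiment`, the `min_count` threshold, the toy tokens, and the way each report is labeled (1 for positive, 0 for negative, for example from the signs of the target price change and the cumulative abnormal return) are all assumptions made for illustration.

```python
from collections import Counter


def learn_word_sentiment(reports, min_count=2):
    """Estimate P(positive | word) with plain Bayes rule (illustrative only).

    `reports` is a list of (tokens, label) pairs. The label is assumed to be
    1 for a report treated as positive and 0 for one treated as negative.
    """
    n_pos = sum(1 for _, label in reports if label == 1)
    n_neg = len(reports) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in reports:
        # Count each word once per report (document frequency).
        for word in set(tokens):
            (pos_df if label == 1 else neg_df)[word] += 1

    prior_pos = n_pos / len(reports)
    prior_neg = n_neg / len(reports)
    lexicon = {}
    for word in set(pos_df) | set(neg_df):
        # Skip rare words whose estimates would be unreliable.
        if pos_df[word] + neg_df[word] < min_count:
            continue
        # Likelihoods P(word | class) with add-one smoothing.
        like_pos = (pos_df[word] + 1) / (n_pos + 2)
        like_neg = (neg_df[word] + 1) / (n_neg + 2)
        evidence = like_pos * prior_pos + like_neg * prior_neg
        # Posterior P(positive | word) by Bayes rule.
        lexicon[word] = like_pos * prior_pos / evidence
    return lexicon


# Hypothetical toy corpus; real input would be tokenized analyst reports.
reports = [
    (["earnings", "growth", "upgrade"], 1),
    (["growth", "recovery"], 1),
    (["loss", "downgrade", "earnings"], 0),
    (["loss", "risk"], 0),
]
lexicon = learn_word_sentiment(reports)
```

In this sketch, scores above 0.5 mark words that lean positive and scores below 0.5 mark words that lean negative. Updating the dictionary amounts to calling the function again on the extended corpus, which recomputes every conditional probability.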

Acknowledgement

This paper was supported by the research fund of the National Research Foundation of Korea (NRF-2019S1A5A2A03038389).

References

  1. Bannier, C., Pauls, T. and Walter, A., Content Analysis of Business Communication: Introducing a German Dictionary, Journal of Business Economics, 2019, Vol. 89, No. 1, pp. 79-123. https://doi.org/10.1007/s11573-018-0914-8
  2. Bollen, J., Mao, H. and Zeng, X., Twitter Mood Predicts the Stock Market, Journal of Computational Science, 2011, Vol. 2, No. 1, pp. 1-8. https://doi.org/10.1016/j.jocs.2010.12.007
  3. Brockman, P., Li, X. and Price, S.M., Conference Call Tone and Stock Returns: Evidence from the Stock Exchange of Hong Kong, Asia-Pacific Journal of Financial Studies, 2017, Vol. 46, No. 5, pp. 667-685. https://doi.org/10.1111/ajfs.12186
  4. Buehlmaier, M.M. and Whited, T.M., Are Financial Constraints Priced? Evidence from Textual Analysis, The Review of Financial Studies, 2018, Vol. 31, No. 7, pp. 2693-2728. https://doi.org/10.1093/rfs/hhy007
  5. Cambria, E., Schuller, B., Xia, Y. and Havasi, C., New Avenues in Opinion Mining and Sentiment Analysis, IEEE Intelligent Systems, 2013, Vol. 28, No. 2, pp. 15-21. https://doi.org/10.1109/MIS.2013.30
  6. Chen, H., De, P., Hu, Y.J. and Hwang, B.H., Wisdom of Crowds: The Value of Stock Opinions Transmitted Through Social Media, The Review of Financial Studies, 2014, Vol. 27, No. 5, pp. 1367-1403. https://doi.org/10.1093/rfs/hhu001
  7. Cho, S.J., Kim, H.K. and Lee, K.K., Optimization of Investment Decision Making by Using Analysts' Target Prices, Journal of Society of Korea Industrial and Systems Engineering, 2020, Vol. 43, No. 4, pp. 229-235. https://doi.org/10.11627/jkise.2020.43.4.229
  8. Cho, S.J., Kim, H.K. and Yang, C.W., Building the Korean Sentiment Lexicon for Finance (KOSELF), Korean Journal of Financial Studies, 2021, Vol. 50, No. 2, pp. 135-170. https://doi.org/10.26845/KJFS.2021.04.50.2.135
  9. Conley, C. and Tosti-Kharas, J., Crowdsourcing Content Analysis for Managerial Research, Management Decision, 2014, Vol. 52, No. 4, pp. 675-688. https://doi.org/10.1108/MD-03-2012-0156
  10. Das, S. and Chen, M., Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web, Management Science, 2007, Vol. 53, No. 9, pp. 1375-1388. https://doi.org/10.1287/mnsc.1070.0704
  11. Deng, S., Mitsubuchi, T., Shioda, K., Shimada, T. and Sakurai, A., Combining Technical Analysis with Sentiment Analysis for Stock Price Prediction, In Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference, 2011, pp. 800-807.
  12. García, D., Sentiment During Recessions, The Journal of Finance, 2013, Vol. 68, No. 3, pp. 1267-1300. https://doi.org/10.1111/jofi.12027
  13. Guo, H., Wang, Y., Wang, B. and Ge, Y., Does Prospectus AE Affect IPO Underpricing? A Content Analysis of the Chinese Stock Market, International Review of Economics and Finance, 2022, Vol. 82, pp. 1-12. https://doi.org/10.1016/j.iref.2022.06.001
  14. Heidari, M. and Felden, C., Financial Footnote Analysis: Developing a Text Mining Approach, In Proceedings of International Conference on Data Mining (DMIN), 2015, pp. 10-16.
  15. Huang, A.H., Zang, A.Y. and Zheng, R., Evidence on the Information Content of Text in Analyst Reports, Accounting Review, 2014, Vol. 89, No. 6, pp. 2151-2180. https://doi.org/10.2308/accr-50833
  16. Kim, H.S. and Kim, C.S., An Analysis for IT Proposal Evaluation Results using Big Data-based Opinion Mining, Journal of Society of Korea Industrial and Systems Engineering, 2018, Vol. 41, No. 1, pp. 1-10. https://doi.org/10.11627/jkise.2018.41.1.001
  17. Kim, Y. and Joh, S.W., Text Analysis for IPO Firms in Korea: Analysis of Korean Texts in Registration Statements via Machine Learning, Korean Journal of Financial Studies, 2019, Vol. 48, No. 2, pp. 215-235. https://doi.org/10.26845/KJFS.2019.04.48.2.215
  18. Lee, E. and Park, C.G., Does Adoption of K-IFRS Increase Upward Bias in Analysts' Earnings Forecasts?, The Korean Journal of Financial Management, 2019, Vol. 36, No. 1, pp. 179-205. https://doi.org/10.22510/kjofm.2019.36.1.007
  19. Li, F., The Information Content of Forward-Looking Statements in Corporate Filings-A Naive Bayesian Machine Learning Approach, Journal of Accounting Research, 2010, Vol. 48, No. 5, pp. 1049-1102. https://doi.org/10.1111/j.1475-679X.2010.00382.x
  20. Li, F., Lundholm, R. and Minnis, M., A Measure of Competition Based on 10-K Filings, Journal of Accounting Research, 2013, Vol. 51, No. 2, pp. 399-436. https://doi.org/10.1111/j.1475-679X.2012.00472.x
  21. Liang, D., Pan, Y., Du, Q. and Zhu, L., The Information Content of Analysts' Textual Reports and Stock Returns: Evidence from China, Finance Research Letters, 2022, Vol. 46, Part B, pp. 1-6. https://doi.org/10.1016/j.frl.2022.102817
  22. Loughran, T. and McDonald, B., When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks, The Journal of Finance, 2011, Vol. 66, No. 1, pp. 35-65. https://doi.org/10.1111/j.1540-6261.2010.01625.x
  23. Schumaker, R.P. and Chen, H., A Quantitative Stock Prediction System Based on Financial News, Information Processing & Management, 2009, Vol. 45, No. 5, pp. 571-583.
  24. Shmueli, G., To Explain or to Predict?, Statistical Science, 2010, Vol. 25, No. 3, pp. 289-310. https://doi.org/10.1214/10-STS330
  25. Tetlock, P.C., Giving Content to Investor Sentiment: The Role of Media in the Stock Market, The Journal of Finance, 2007, Vol. 62, No. 3, pp. 1139-1168. https://doi.org/10.1111/j.1540-6261.2007.01232.x
  26. Tetlock, P.C., Saar-Tsechansky, M. and Macskassy, S., More Than Words: Quantifying Language to Measure Firms' Fundamentals, The Journal of Finance, 2008, Vol. 63, No. 3, pp. 1437-1467. https://doi.org/10.1111/j.1540-6261.2008.01362.x
  27. Yang, C.W., Information Content of Analyst Report Title: Focusing on the Tone of Text, The Korean Journal of Financial Management, 2021, Vol. 38, No. 3, pp. 1-38.
  28. Yu, J.D. and Lee, I.S., A Prediction of Stock Price Through the Big-data Analysis, Journal of Society of Korea Industrial and Systems Engineering, 2018, Vol. 41, No. 3, pp. 154-161. https://doi.org/10.11627/jkise.2018.41.3.154