Stock prediction using combination of BERT sentiment Analysis and Macro economy index

  • Jang, Euna (Dept. of Industrial Engineering, Korea University) ;
  • Choi, HoeRyeon (Dept. of Industrial Engineering, Korea University) ;
  • Lee, HongChul (Dept. of Industrial Engineering, Korea University)
  • Received : 2020.05.04
  • Accepted : 2020.05.26
  • Published : 2020.05.29

Abstract

The stock index serves not only as an economic indicator for a country but also as a basis for investment decisions, which is why research on predicting it continues. A stock price index reflects technical, economic, and psychological factors, so accurate prediction requires considering these factors together. A prediction model should therefore select and incorporate the factors that drive fluctuations in the index. Most existing studies use either news information or macroeconomic indicators that move the market, or reflect only a few combinations of indicators. In this paper, we propose an effective combination of news sentiment analysis and various macroeconomic indicators for predicting the US Dow Jones Index (DJI). We crawled two years of business news from the New York Times, more than 93,000 articles, and analyzed their sentiment using BERT, a recent natural language processing technique, and NLTK. The sentiment results, together with five macroeconomic indicators, gold prices, oil prices, and five foreign exchange rates affecting the US economy, were combined and fed to the LSTM prediction algorithm, which is known to be well suited to combining numeric and text information. Among the combinations tested, DJI, NLTK, BERT, OIL, GOLD, and EURUSD yielded the smallest MSE for DJI prediction.

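Because the stock index is used not only as an economic indicator for a country but also as a basis for investment decisions, research on predicting it has continued steadily. A stock index reflects technical, economic, and psychological factors, so multiple factors must be considered together to predict it accurately. A stock index prediction model that selects and incorporates the factors affecting index fluctuations is therefore needed. Most prior studies used either news information or macroeconomic indicators, both of which drive market fluctuations, on their own, or reflected only a few combinations of indicators. This study therefore aims to present an effective combination of indicators for predicting the US Dow Jones Index by considering sentiment analysis of news information together with various macroeconomic indicators. Sentiment analysis of the news was performed with BERT, a recent natural language processing technique, and NLTK VADER, and the deep learning model LSTM, which is known to be suitable for stock price prediction, was applied as the prediction model to identify the most effective combination of indicators.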
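
The combination experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the feature names are taken from the abstract, but the data are random toy numbers, the helper names (`mse`, `make_evaluator`, `search_combinations`) are hypothetical, and a simple nearest-neighbour predictor stands in for the LSTM so that the sketch runs with the Python standard library alone. It assumes all series are already scaled to a comparable range.

```python
import random
from itertools import combinations


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def make_evaluator(data, target, train_frac=0.7):
    """Build a scoring function: feature names -> test MSE for next-day target.

    Placeholder model (an assumption, NOT the paper's LSTM): for each test day,
    find the training day with the closest feature vector and predict the
    target value that followed that training day.
    """
    n = len(data[target])
    split = int(n * train_frac)

    def evaluate(features):
        rows = [tuple(data[f][t] for f in features) for t in range(n - 1)]
        next_value = [data[target][t + 1] for t in range(n - 1)]
        truth, preds = [], []
        for t in range(split, n - 1):
            nearest = min(
                range(split),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(rows[i], rows[t])),
            )
            preds.append(next_value[nearest])
            truth.append(next_value[t])
        return mse(truth, preds)

    return evaluate


def search_combinations(target, candidates, evaluate):
    """Score every subset of candidate features, always keeping the target series.

    Returns a list of (features, score) pairs sorted by ascending score.
    """
    names = sorted(candidates)
    results = []
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            features = (target,) + subset
            results.append((features, evaluate(features)))
    results.sort(key=lambda pair: pair[1])
    return results


# Toy data only: random values in [-1, 1], not real market or sentiment data.
rng = random.Random(0)
series_names = ["DJI", "NLTK", "BERT", "OIL", "GOLD", "EURUSD"]
data = {name: [rng.uniform(-1, 1) for _ in range(60)] for name in series_names}

ranking = search_combinations("DJI", series_names[1:], make_evaluator(data, "DJI"))
best_features, best_mse = ranking[0]
print("lowest-MSE combination on toy data:", best_features, best_mse)
```

With five candidate series the search covers 32 combinations, so exhaustive enumeration is cheap; with an actual LSTM as the evaluator, each combination would require a separate training run, and the ranking on toy data here carries no meaning about the paper's result.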
