Prediction of the Movement Directions of Index and Stock Prices Using Extreme Gradient Boosting

  • 김형도 (Department of Management Information Systems, Hanyang Cyber University)
  • Received : 2018.08.10
  • Accepted : 2018.09.18
  • Published : 2018.09.28

Abstract

Investors and researchers both pay close attention to predicting the direction of stock price movements, because accurate predictions can play an important role in strategic trading decisions. Taken together, previous studies show that the variables considered vary with the stock market and the prediction period. This paper analyzes which data mining techniques perform better, by prediction period, on representative index and stock price datasets from the Korean stock market. In particular, it applies extreme gradient boosting, a technique whose strength has been demonstrated in recent open competitions, to the problem and compares it with data mining techniques that earlier studies reported to perform well at predicting stock price movement directions: random forests, support vector machines (SVM), and artificial neural networks. Experiments with 12 years of index and price data, predicting the movement direction 1 to 5 days ahead, show that the selected variables differ by prediction period and by stock, and that extreme gradient boosting performs best for predictions 1 to 4 days ahead, although it is partially equivalent to the other techniques.
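The prediction target described above can be illustrated with a minimal sketch. The abstract does not state how the direction labels or the train/test split were constructed, so the rule used here (label 1 when the closing price rises after the given number of trading days, otherwise 0), the function names `direction_labels` and `chronological_split`, the 80% training fraction, and the sample prices are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def direction_labels(closes: Sequence[float], horizon: int) -> List[int]:
    """Label day t as 1 if the close `horizon` trading days later is higher, else 0.

    Assumed labeling rule; the last `horizon` days get no label because
    their future close is unknown.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [
        1 if closes[t + horizon] > closes[t] else 0
        for t in range(len(closes) - horizon)
    ]


def chronological_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split time-ordered rows without shuffling, so training data precedes test data."""
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


# Hypothetical closing prices, oldest first.
closes = [100.0, 101.5, 101.0, 102.0, 101.2, 103.0, 102.5]

# One label series per prediction horizon, 1 to 5 days ahead.
labels_by_horizon = {h: direction_labels(closes, h) for h in range(1, 6)}
```

Each label series could then be paired with features and passed to any of the classifiers being compared. Keeping the split chronological avoids training on data that comes after the test period.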
