Performance Comparison of Reinforcement Learning Algorithms for Futures Scalping

  • 정득교 (Department of Artificial Intelligence, Kyungpook National University);
  • 이세훈 (Department of Artificial Intelligence, Kyungpook National University);
  • 강재모 (Department of Artificial Intelligence, Kyungpook National University)
  • Received : 2022.08.26
  • Accepted : 2022.09.09
  • Published : 2022.09.30

Abstract

Due to the recent economic downturn caused by COVID-19 and the unstable international situation, many investors are choosing the derivatives market as a means of investment. However, the derivatives market carries greater risk than the stock market, and market participants' research on it remains insufficient. With recent advances in artificial intelligence, machine learning has been widely applied to the derivatives market. In this paper, reinforcement learning, a machine learning technique, is applied to analyze scalping, a strategy that trades futures on a minute-by-minute basis. The data set was built by selecting four products among the futures traded at a brokerage firm and consists of 21 attributes derived from the closing prices, moving averages, and Bollinger Band indicators of 1-minute and 3-minute bars over six months. In the experiments, a deep neural network (DNN) model and three reinforcement learning algorithms, DQN (Deep Q-Network), A2C (Advantage Actor-Critic), and A3C (Asynchronous Advantage Actor-Critic), were trained on a training data set and validated on a test data set. For scalping, the agent chooses either a buy or a sell action, and the ratio of the portfolio value resulting from the action is used as the reward. Experimental results show that energy-sector products such as Heating Oil and Crude Oil yield relatively high cumulative returns compared with index-sector products such as the Mini Russell 2000 and the Hang Seng Index.
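
The paper does not include source code, so the following Python sketch only illustrates the kind of pipeline the abstract describes: moving-average and Bollinger Band features computed from minute bars, and a reward defined by the ratio of portfolio value after each action. The column layout, the 20-bar window, and the 2-sigma band width are illustrative assumptions, not the authors' exact 21-attribute configuration.

    import pandas as pd

    def make_features(close: pd.Series, window: int = 20) -> pd.DataFrame:
        """Moving-average and Bollinger Band features for one bar interval.

        The window length and band width are assumptions; the paper states
        only that closing-price, moving-average, and Bollinger Band
        indicators of 1-minute and 3-minute bars are used.
        """
        ma = close.rolling(window).mean()   # moving average line
        sd = close.rolling(window).std()    # rolling standard deviation
        upper = ma + 2.0 * sd               # upper Bollinger Band
        lower = ma - 2.0 * sd               # lower Bollinger Band
        return pd.DataFrame({
            "close": close,
            "ma": ma,
            "bb_upper": upper,
            "bb_lower": lower,
            "bb_pct": (close - lower) / (upper - lower),  # position within the bands
        })

    def reward(pv_now: float, pv_prev: float) -> float:
        """Reward from the change in portfolio value after a buy/sell action.

        The abstract says only that the ratio of portfolio value is used as
        the reward; expressing it as a net return is one common choice.
        """
        return pv_now / pv_prev - 1.0

Under these assumptions, the full 21-attribute state would combine such features from both the 1-minute and 3-minute series before being fed to the DQN, A2C, or A3C agent.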

Acknowledgement

This research was supported by the Ministry of Trade, Industry and Energy (MOTIE) and the Korea Institute of Energy Technology Evaluation and Planning (KETEP) (No. 20224000000150). This paper was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Government of Korea (Ministry of Education) in 2022 (No. 2020R1I1A3073651).
