R-Trader: An Automatic Stock Trading System based on Reinforcement learning


Journal of KIISE:Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 29, Issue 11 / pp. 785-794 / 2002 / pISSN 1229-6848

Korean Institute of Information Scientists and Engineers (한국정보과학회)


R-Trader: 강화 학습에 기반한 자동 주식 거래 시스템

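Jae Won Lee (이재원; School of Computer and Information, Sungshin Women's University) ;
Sung-Dong Kim (김성동; School of Computer and Information, Hansung University) ;
Jongwoo Lee (이종우; School of Computer Engineering, Kwangwoon University) ;
Jinseok Chae (채진석; Department of Computer Engineering, University of Incheon)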

Published : 2002.12.01


Abstract

An automatic stock trading system should solve several optimization problems, such as market trend prediction, stock selection, and trading strategy design, within a unified framework. Most previous trading systems based on supervised learning, however, are limited in their ultimate performance because they pay little attention to integrating these subproblems. This paper proposes a stock trading system called R-Trader, based on reinforcement learning, which treats the process of stock price changes as a Markov decision process (MDP). Reinforcement learning is well suited to the joint optimization of predictions and trading strategies. R-Trader adopts two popular reinforcement learning algorithms, temporal-difference (TD) learning and Q-learning, for selecting stocks and for optimizing other trading parameters, respectively. Technical analysis is used to design the system's input features, and the value functions are approximated by feedforward neural networks. Experimental results on the Korean stock market show that the proposed system outperforms both the market average and a simple trading system trained by supervised learning, in terms of both profit and risk management.

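An automatic stock trading system must be able to solve a wide variety of optimization problems in an integrated way, including market trend prediction, selection of stocks to invest in, and trading strategy. Existing trading systems based on supervised learning, however, have placed little weight on combining these optimization elements effectively, which limits the ultimate performance of such systems. Under the assumption that the process of stock price changes is a Markov decision process (MDP), this paper proposes R-Trader, an automatic stock trading system based on reinforcement learning. Reinforcement learning is a learning method suited to the integrated learning of prediction and trading strategy. R-Trader uses two well-known reinforcement learning algorithms, TD (temporal-difference) and Q, to select stocks and to optimize other trading parameters. It also designs the system's input features on the basis of technical analysis and uses artificial neural networks to approximate the value functions. Experiments on data from the Korean stock market confirm that the proposed system can achieve returns exceeding the market average and that it outperforms a trading system based on supervised learning in terms of both return and risk management.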
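
As a rough illustration of the two algorithm families named in the abstract, the following sketch shows the textbook TD(0) and Q-learning update rules in tabular form. It is not R-Trader's implementation: the paper approximates value functions with feedforward neural networks, whereas this sketch uses plain dictionaries, and the state names, action names, rewards, step size `alpha`, and discount factor `gamma` are all hypothetical values chosen for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular TD(0) step: move V(state) toward reward + gamma * V(next_state)."""
    old = values.get(state, 0.0)
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] = old + alpha * (target - old)
    return values[state]


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: bootstrap from the best action in next_state."""
    old = q.get((state, action), 0.0)
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


# Hypothetical usage: states and actions are placeholders, not the paper's features.
state_values = {}
v = td0_update(state_values, "s0", reward=1.0, next_state="s1")

q_table = {("s1", "buy"): 2.0}
q = q_update(q_table, "s0", "hold", reward=0.5, next_state="s1",
             actions=["buy", "hold"], alpha=0.5, gamma=0.9)
```

In a setting like the one the abstract describes, the table lookups would be replaced by a function approximator's predictions, and the same temporal-difference error would drive the approximator's weight updates.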


