A Dynamic Asset Allocation Method Based on Reinforcement Learning Exploiting Local Traders


  • Jangmin O (School of Computer Science and Engineering, Seoul National University);
  • Jongwoo Lee (Department of Multimedia Science, Sookmyung Women's University);
  • Byoung-Tak Zhang (School of Computer Science and Engineering, Seoul National University)
  • Published: 2005.08.01

Abstract

Given local traders equipped with pattern-based multi-predictors of stock prices, we study a dynamic asset allocation method that maximizes trading performance. To optimize the proportion of the asset allocated to each recommendation of the predictors, we design an asset allocation strategy, called the meta policy, in the reinforcement learning framework. We utilize both the number of recommendations from each predictor and the ratio of the stock fund to the total asset to describe the state space efficiently. Experimental results on the Korean stock market show that the trading system with the proposed meta policy outperforms systems with fixed asset allocation methods. This suggests that reinforcement learning can bring synergy to decision-making problems by exploiting supervised-learned predictors.
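The meta policy described above combines two state variables, the number of stocks the local predictors currently recommend and the ratio of stock holdings to the total asset, and learns how much capital to allocate per recommendation. A minimal tabular Q-learning sketch of such a meta policy is shown below; all names, discretization bucket sizes, the action set, and the reward signal are illustrative assumptions, not the authors' exact design:

```python
import random
from collections import defaultdict

# Candidate allocation fractions: how much of the free cash to deploy
# across the predictors' current recommendations (an assumed action set).
ALLOCATION_ACTIONS = [0.0, 0.25, 0.5, 1.0]

def encode_state(n_recommendations, stock_ratio):
    """Discretize (recommendation count, stock-fund ratio) into a table key."""
    rec_bucket = min(n_recommendations, 5)        # cap the recommendation count
    ratio_bucket = min(int(stock_ratio * 4), 3)   # 4 buckets over [0, 1]
    return (rec_bucket, ratio_bucket)

class MetaPolicy:
    """Tabular Q-learning over the discretized meta-policy state space."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action index) -> Q-value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        """Epsilon-greedy choice of an allocation fraction index."""
        if random.random() < self.epsilon:
            return random.randrange(len(ALLOCATION_ACTIONS))
        values = [self.q[(state, a)] for a in range(len(ALLOCATION_ACTIONS))]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        """One-step Q-learning backup on the observed trading reward."""
        best_next = max(self.q[(next_state, a)]
                        for a in range(len(ALLOCATION_ACTIONS)))
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In a trading simulation, each step would observe the predictors' recommendations and the current stock-fund ratio, pick an allocation fraction with `choose`, execute the trades, and feed the resulting portfolio return back through `update` as the reward.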


