http://dx.doi.org/10.5391/JKIIS.2015.25.4.319

Investigations on data-driven stochastic optimal control and approximate-inference-based reinforcement learning methods  

Park, Jooyoung (Department of Control and Instrumentation Engineering, Korea University)
Ji, Seunghyun (Department of Control and Instrumentation Engineering, Korea University)
Sung, Keehoon (Department of Control and Instrumentation Engineering, Korea University)
Heo, Seongman (Department of Control and Instrumentation Engineering, Korea University)
Park, Kyungwook (School of Business Administration, Korea University)
Publication Information
Journal of the Korean Institute of Intelligent Systems / v.25, no.4, 2015, pp. 319-326
Abstract
Recently, in the fields of stochastic optimal control (SOC) and reinforcement learning (RL), there has been a great deal of research effort on the problem of finding data-based sub-optimal control policies. The conventional theory for finding optimal controllers via value-function-based dynamic programming was established for solving stochastic optimal control problems on a solid theoretical foundation. However, it can be successfully applied only to extremely simple cases. Hence, the modern data-based approach, which tries to find sub-optimal solutions by utilizing relevant data such as state-transition and reward signals instead of rigorous mathematical analyses, is particularly attractive for practical applications. In this paper, we consider a couple of methods that combine modern SOC strategies and approximate inference with machine-learning-based data treatment methods. We also apply the resulting methods to a variety of application domains, including financial engineering, and observe their performance.
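To make the contrast between model-based dynamic programming and the data-based approach concrete, the following is a minimal sketch (not the authors' method) of learning a sub-optimal policy purely from sampled state-transition and reward data via tabular Q-learning on a hypothetical toy environment; the environment, its sizes, and all parameter values are illustrative assumptions.

```python
# Minimal sketch: a data-driven update uses only sampled
# (state, action, reward, next_state) tuples, whereas value-function-based
# dynamic programming would need an explicit transition model.
import random

N_STATES, N_ACTIONS = 5, 2          # hypothetical toy chain-world sizes
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1  # discount, step size, exploration rate

def step(s, a):
    """Toy stochastic environment: action 1 tends to move right; reaching
    the rightmost state pays a reward of 1. Purely illustrative."""
    move = 1 if (a == 1 and random.random() < 0.8) else -1
    s_next = min(max(s + move, 0), N_STATES - 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
s = 0
for _ in range(20000):
    # epsilon-greedy action selection from the current value estimates
    a = random.randrange(N_ACTIONS) if random.random() < EPS \
        else max(range(N_ACTIONS), key=lambda a: Q[s][a])
    s_next, r = step(s, a)
    # sample-based Bellman update: no transition model is ever used
    target = r + GAMMA * max(Q[s_next])
    Q[s][a] += ALPHA * (target - Q[s][a])
    s = s_next

policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)]
print("greedy policy per state:", policy)
```

The same pattern (replace exact Bellman recursions with updates driven by observed samples) underlies the approximate-inference-based methods discussed in the paper, which additionally bring in machine-learning-based treatment of the collected data.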
Keywords
Data-driven methods; Stochastic optimal control; Approximate inference; Machine learning; Financial engineering;
Citations & Related Records
연도 인용수 순위