
Model-free $H_{\infty}$ Control of Linear Discrete-time Systems using Q-learning and LMI Based on I/O Data  

Kim, Jin-Hoon (College of Electronics and Information, Chungbuk National University)
Lewis, F.L. (Department of Electrical Engineering, University of Texas at Arlington (UTA), USA)
Publication Information
The Transactions of The Korean Institute of Electrical Engineers / v.58, no.7, 2009, pp. 1411-1417
Abstract
In this paper, we consider the design of $H_{\infty}$ controllers for linear discrete-time systems for which no mathematical model is available. The basic approach is Q-learning, a reinforcement learning method based on the actor-critic structure. Instead of a mathematical model of the system, the model-free design uses only measured information on states and inputs. The resulting iterative algorithm is expressed as linear matrix inequalities (LMIs) in the data measured from the system states and inputs. It is shown that, for a sufficiently rich disturbance, the algorithm converges to the standard $H_{\infty}$ control solution obtained from the exact system model. A simple numerical example illustrates the usefulness of the result in practical applications.
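For a concrete picture of the scheme described above, the sketch below illustrates model-free Q-learning for the discrete-time zero-sum game that underlies $H_{\infty}$ control. It is a minimal illustration under stated assumptions, not the paper's algorithm: the paper casts each iteration as an LMI in the measured data, whereas this sketch fits the quadratic Q-function kernel by ordinary least squares, the standard Q-learning variant for such games. The plant matrices A, B, E, the cost weights, and the attenuation level gamma are all illustrative assumptions; A, B, E serve only to generate data and never enter the learning step.

```python
# Minimal sketch: model-free Q-learning for the zero-sum game behind
# H-infinity control of x_{k+1} = A x_k + B u_k + E w_k. Hypothetical
# numbers throughout; the plant (A, B, E) is a data generator only.
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0.9, 0.1], [0.0, 0.8]])   # unknown to the learner
B = np.array([[0.0], [1.0]])             # control channel
E = np.array([[1.0], [0.5]])             # disturbance channel
n, m, q = 2, 1, 1

# Stage cost x'Qx x + u'Ru u - gamma^2 w'w; gamma must exceed the
# achievable H-infinity level for the game to be well posed.
Qx, Ru, gamma2 = np.eye(n), np.eye(m), 5.0**2

def quad_basis(z):
    """Unique terms of z z^T, so that theta . basis(z) = z^T H z."""
    zz = np.outer(z, z)
    idx = np.triu_indices(len(z))
    scale = np.where(idx[0] == idx[1], 1.0, 2.0)  # off-diagonals count twice
    return scale * zz[idx]

def unpack(theta, N):
    """Rebuild the symmetric kernel H from upper-triangular parameters."""
    H = np.zeros((N, N))
    H[np.triu_indices(N)] = theta
    return H + H.T - np.diag(np.diag(H))

N = n + m + q
K = np.zeros((m, n))   # control policy u = K x
L = np.zeros((q, n))   # worst-case disturbance policy w = L x

for it in range(20):
    Phi, y = [], []
    x = rng.standard_normal(n)
    for k in range(200):
        # exploration noise keeps the data sufficiently rich
        u = K @ x + 0.5 * rng.standard_normal(m)
        w = L @ x + 0.5 * rng.standard_normal(q)
        x_next = A @ x + B @ u + E @ w
        r = x @ Qx @ x + u @ Ru @ u - gamma2 * (w @ w)
        z = np.concatenate([x, u, w])
        z_next = np.concatenate([x_next, K @ x_next, L @ x_next])
        # Bellman equation for the current policies: z'Hz = r + z_next'Hz_next
        Phi.append(quad_basis(z) - quad_basis(z_next))
        y.append(r)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = unpack(theta, N)

    # policy improvement: joint stationarity of Q in (u, w)
    Hux, Hwx = H[n:n+m, :n], H[n+m:, :n]
    Huu, Huw = H[n:n+m, n:n+m], H[n:n+m, n+m:]
    Hwu, Hww = H[n+m:, n:n+m], H[n+m:, n+m:]
    blk = np.block([[Huu, Huw], [Hwu, Hww]])
    KL = -np.linalg.solve(blk, np.vstack([Hux, Hwx]))
    K, L = KL[:m], KL[m:]

print("learned state-feedback gain K:\n", K)
```

Under sufficiently exciting noise, K should tend toward the gain given by the game algebraic Riccati equation for the true (A, B, E), mirroring the convergence claim in the abstract; the paper's contribution is to replace the least-squares step with an LMI in the same measured data.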
Keywords
Model-free $H_{\infty}$ control; Linear discrete-time system; I/O data; Q-learning; LMI