[KSCI] Korea Science Citation Index Service

Model-free $H_{\infty}$ Control of Linear Discrete-time Systems using Q-learning and LMI Based on I/O Data

Kim, Jin-Hoon (College of Electrical and Computer Engineering, Chungbuk National University)
Lewis, F.L. (Dept. of Electrical Engineering, UTA, USA)

Publication Information

The Transactions of The Korean Institute of Electrical Engineers, v.58, no.7, 2009, pp. 1411-1417

Abstract

In this paper, we consider the design of $H_{\infty}$ control for linear discrete-time systems for which no mathematical model is available. The basic approach is to use Q-learning, a reinforcement learning method based on the actor-critic structure. The model-free design uses measured information on the states and inputs rather than a mathematical model of the system. As a result, the derived iterative algorithm is expressed as linear matrix inequalities (LMIs) in the data measured from the system states and inputs. It is shown that, for a sufficiently rich disturbance, this algorithm converges to the standard $H_{\infty}$ control solution obtained using the exact system model. A simple numerical example is given to show the usefulness of the result in practical applications.
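To make the quantities in the abstract concrete, the sketch below computes the model-based reference solution for a scalar plant, using a Q-function formulation that is common for discrete-time zero-sum games. It is not the paper's LMI algorithm: the kernel `H` is built here from known plant parameters, whereas a model-free scheme would estimate it from measured states and inputs. The plant `x[k+1] = a*x[k] + b*u[k] + e*w[k]`, the stage cost `q*x^2 + r*u^2 - gamma^2*w^2`, all numeric values, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: scalar plant, hypothetical numbers, known model.

def q_kernel(P, a, b, e, q, r, gamma):
    """Kernel H of Q(x, u, w) = z'Hz with z = [x, u, w], given value V(x) = P*x^2."""
    return [
        [q + a * a * P, a * b * P,     a * e * P],
        [a * b * P,     r + b * b * P, b * e * P],
        [a * e * P,     b * e * P,     e * e * P - gamma ** 2],
    ]

def gains_from_kernel(H):
    """Solve dQ/du = 0 and dQ/dw = 0 for u = L*x, w = K*x (2x2 Cramer's rule)."""
    hux, hwx = H[1][0], H[2][0]
    huu, huw, hwu, hww = H[1][1], H[1][2], H[2][1], H[2][2]
    det = huu * hww - huw * hwu
    L = -(hww * hux - huw * hwx) / det
    K = -(huu * hwx - hwu * hux) / det
    return L, K

def value_from_kernel(H, L, K):
    """Evaluate P = [1, L, K] H [1, L, K]' for the current policies."""
    v = [1.0, L, K]
    return sum(v[i] * H[i][j] * v[j] for i in range(3) for j in range(3))

def solve_hinf_scalar(a, b, e, q, r, gamma, iters=200):
    """Value iteration on the game value P, starting from P = 0."""
    P = 0.0
    for _ in range(iters):
        H = q_kernel(P, a, b, e, q, r, gamma)
        L, K = gains_from_kernel(H)
        P = value_from_kernel(H, L, K)
    # Recompute the gains from the final kernel so they match the returned P.
    L, K = gains_from_kernel(q_kernel(P, a, b, e, q, r, gamma))
    return P, L, K

# Hypothetical example values (not taken from the paper).
P, L, K = solve_hinf_scalar(a=0.9, b=1.0, e=0.5, q=1.0, r=1.0, gamma=2.0)
print("P =", P, "control gain L =", L, "worst-case disturbance gain K =", K)
```

In this formulation the control gain `L` and the disturbance gain `K` depend only on the blocks of `H`, which is why estimating `H` from data is enough to recover the controller without knowing `a`, `b`, or `e`.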

Keywords

Model-free $H_{\infty}$ control; Linear discrete-time system; I/O data; Q-learning; LMI

Citations & Related Records

Times Cited By SCOPUS : 0

