1 |
A. Al-Tamimi, M. Abu-Khalaf and F.L. Lewis, 'Model-Free Q-Learning Designs for Discrete-Time Zero-Sum Games with Application to H-Infinity Control,' Automatica, vol.43, no.3, pp.473-482, 2007
|
2 |
P. Werbos, 'Neural networks for control and system identification', Proc. of CDC, 1989
|
3 |
R.S. Sutton and A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, 1998
|
4 |
S.J. Bradtke, B.E. Ydstie and A.G. Barto, 'Adaptive Linear Quadratic Control Using Policy Iteration,' Proc. of ACC, pp.3475-3476, 1994
|
5 |
M. Abu-Khalaf and F.L. Lewis, 'Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach,' Automatica, vol.41, no.5, pp.779-791, 2005
|
6 |
J. Si, A. Barto, W. Powell and D. Wunsch, Handbook of Learning and Approximate Dynamic Programming, John Wiley, New Jersey, 2004
|
7 |
K. Zhou and J.C. Doyle. Essentials of robust control, Prentice-Hall, 1997
|
8 |
S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan, Linear matrix inequalities in systems and control theory, Philadelphia, PA: SIAM, 1994
|
9 |
D.P. Bertsekas and J.N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, MA, 1996
|
10 |
R.A. Howard, Dynamic Programming and Markov Processes, MIT Press, Cambridge, 1960
|
11 |
C.J. Watkins. Learning from delayed rewards, Ph.D. Thesis, University of Cambridge, England, 1989
|
12 |
G. Saridis and C.S. Lee, 'An Approximation Theory of optimal Control for Trainable Manipulators,' IEEE Trans. Systems, Man, Cybernetics, vol.9, no.3, pp.152-159, 1979
|
13 |
P.J. Werbos, 'Approximate dynamic programming for real-time control and neural modeling,' Handbook of Intelligent Control, edited by D.A. White and D.A. Sofge, New York: Van Nostrand Reinhold, 1992
|