References
- M. Abu-Khalaf and F.L. Lewis, 'Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach,' Automatica, vol. 41, no. 5, pp. 779-791, 2005. https://doi.org/10.1016/j.automatica.2004.11.034
- A. Al-Tamimi, M. Abu-Khalaf and F.L. Lewis, 'Model-Free Q-Learning Designs for Discrete-Time Zero-Sum Games with Application to H-Infinity Control,' Automatica, vol. 43, no. 3, pp. 473-482, 2007. https://doi.org/10.1016/j.automatica.2006.09.019
- S.J. Bradtke, B.E. Ydstie and A.G. Barto, 'Adaptive Linear Quadratic Control Using Policy Iteration,' Proc. of ACC, pp. 3475-3476, 1994. https://doi.org/10.1109/ACC.1994.735224
- G. Saridis and C.S. Lee, 'An Approximation Theory of Optimal Control for Trainable Manipulators,' IEEE Trans. Systems, Man, Cybernetics, vol. 9, no. 3, pp. 152-159, 1979. https://doi.org/10.1109/TSMC.1979.4310171
- P.J. Werbos, 'Approximate dynamic programming for real-time control and neural modeling,' Handbook of Intelligent Control, edited by D.A. White and D.A. Sofge, New York: Van Nostrand Reinhold, 1992.
- R.A. Howard, Dynamic Programming and Markov Processes, MIT Press, Cambridge, 1960.
- P. Werbos, 'Neural networks for control and system identification,' Proc. of CDC, 1989.
- C.J. Watkins, Learning from delayed rewards, Ph.D. Thesis, University of Cambridge, England, 1989.
- S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan, Linear matrix inequalities in systems and control theory, Philadelphia, PA: SIAM, 1994
- D.P. Bertsekas and J.N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, MA, 1996.
- K. Zhou and J.C. Doyle, Essentials of robust control, Prentice-Hall, 1997.
- R.S. Sutton and A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, 1998.
- J. Si, A. Barto, W. Powell and D. Wunsch, Handbook of Learning and Approximate Dynamic Programming, John Wiley, New Jersey, 2004.