References
- Q. Yang, J. B. Vance, and S. Jagannathan, 'Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks,' IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 994-1001, 2008 https://doi.org/10.1109/TSMCB.2008.926607
- J. Valasek, J. Doebbler, M. D. Tandale, and A. J. Meade, 'Improved adaptive-reinforcement learning control for morphing unmanned air vehicles,' IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 1014-1020, 2008 https://doi.org/10.1109/TSMCB.2008.922018
- K.-H. Park, Y.-J. Kim, and J.-H. Kim, 'Modular Q-learning based multi-agent cooperation for robot soccer,' Robotics and Autonomous Systems, vol. 35, no. 2, pp. 109-122, 2001
- J. Moody and M. Saffell, 'Learning to trade via direct reinforcement,' IEEE Transactions on Neural Networks, vol. 12, no. 4, pp. 875-889, 2001 https://doi.org/10.1109/72.935097
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998
- H. R. Berenji and D. Vengerov, 'A convergent actor-critic-based RFL algorithm with application to power management of wireless transmitters,' IEEE Transactions on Fuzzy Systems, vol. 11, no. 4, August 2003
- X. Xu, H. He, and D. Hu, 'Efficient reinforcement learning using recursive least-squares methods,' Journal of Artificial Intelligence Research, vol. 16, pp. 259-292, 2002
- R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, 'Policy gradient methods for reinforcement learning with function approximation,' Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063, 2000
- V. Konda and J. N. Tsitsiklis, 'Actor-critic algorithms,' SIAM Journal on Control and Optimization, vol. 42, no. 4, pp. 1143-1166, 2003 https://doi.org/10.1137/S0363012901385691
- J. Peters, S. Vijayakumar, and S. Schaal, 'Reinforcement learning for humanoid robotics,' in Proceedings of the Third IEEE-RAS International Conference on Humanoid Robots, 2003
- J. Park, J. Kim, and D. Kang, 'An RLS-based natural actor-critic algorithm for locomotion of a two-linked robot arm,' Lecture Notes in Artificial Intelligence, vol. 3801, pp. 65-72, December 2005
- H. Kimura, K. Miyazaki, and S. Kobayashi, 'Reinforcement learning in POMDPs with function approximation,' in Proceedings of the 14th International Conference on Machine Learning (ICML 1997), pp. 152-160, 1997
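- J.-H. Kim (김종호), A Study on System Control Using Reinforcement Learning Algorithms (강화학습 알고리즘을 이용한 시스템 제어에 대한 연구), M.S. thesis, Department of Control and Instrumentation Engineering, Korea University, 2005 (in Korean)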
- L. X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis, Prentice-Hall, 1994
- J.-J. Park and G.-S. Choi (박종진, 최규석), Fuzzy Control Systems (퍼지 제어 시스템), 교우사, 2001 (in Korean)
- T. Takagi and M. Sugeno, 'Fuzzy identification of systems and its applications to modeling and control,' IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, pp. 116-132, 1985
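- J. Park, G.-B. Jeong, and Y.-J. Moon (박주영, 정규백, 문영준), 'Performance comparison of crawling robots trained by reinforcement learning' (강화학습에 의해 학습된 기는 로봇의 성능 비교), 한국 퍼지 및 지능시스템학회 논문집 (papers of the Korean society for fuzzy and intelligent systems), vol. 17, no. 1, pp. 33-36, 2007 (in Korean)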