References
- A. G. Barto, R. S. Sutton, C. W. Anderson, "Neuronlike elements that can solve difficult learning control problems," IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 835-846, 1983.
- H. R. Berenji, D. Vengerov, "A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters," IEEE Transactions on Fuzzy Systems, vol. 11, pp. 478-485, 2003. https://doi.org/10.1109/TFUZZ.2003.814834
- H. Kimura, S. Kobayashi, "An analysis of actor/critic algorithms using eligibility traces: Reinforcement learning with imperfect value function," In Proceedings of the Fifteenth International Conference on Machine Learning, pp. 111-116, 1998.
- J. Park, J. Kim, D. Kang, "An RLS-based natural actor-critic algorithm for locomotion of a two-linked robot arm," Lecture Notes in Artificial Intelligence, vol. 3801, pp. 65-72, 2005.
- J. Peters, S. Vijayakumar, S. Schaal, "Reinforcement learning for humanoid robotics," In Proceedings of the Third IEEE-RAS International Conference on Humanoid Robots (Humanoids2003), 2003.
- P. Thomas, M. Branicky, N. Kobori, K. Suzuki, P. Hartono, S. Hashimoto, "Learning to control a joint driven double inverted pendulum using nested actor/critic algorithm," In Proceedings of the 9th International Conference on Neural Information Processing, 2002.
- J. Park, D. Kang, J. Lee, D. Nam, "An actor-critic algorithm using kernel-based least-squares estimation: An application to robot locomotion," In Proceedings of 2009 CACS International Automatic Control Conference, 2009.
- R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, 1998.
- B. Schölkopf, A. J. Smola, Learning with Kernels, MIT Press, Cambridge, 2002.
- J. Park, D. Nam, J. Lee, "Some observations on kernel-based function approximation steps for actor-critic methods," In Proceedings of KIIS Fall Conference, vol. 19, no. 2, pp. 79-82, 2009.
- J. A. Boyan, "Technical update: Least-squares temporal difference learning," Machine Learning, vol. 49, pp. 233-246, 2002. https://doi.org/10.1023/A:1017936530646
- S. V. Vaerenbergh, J. Vía, I. Santamaría, "Nonlinear system identification using a new sliding-window kernel RLS algorithm," Journal of Communications, vol. 2, no. 3, pp. 1-8, 2007.
- R. S. Sutton, D. McAllester, S. Singh, Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063, 1999.
- H. Kimura, K. Miyazaki, S. Kobayashi, "Reinforcement learning in POMDPs with function approximation," In Proceedings of the Fourteenth International Conference on Machine Learning, pp. 152-160, 1997.
- B. Chu, D. Kim, D. Hong, J. Park, J. T. Chung, T.-H. Kim, "Tunnel ventilation control using reinforcement learning methodology," JSME International Series C, vol. 47, no. 4, pp. 939-945, 2006.
- D. Hong, B. Chu, W. D. Kim, J. T. Chung, T.-H. Kim, "Pollution level estimation for tunnel ventilation," JSME International Series B, vol. 46, no. 2, pp. 278-286, 2003. https://doi.org/10.1299/jsmeb.46.278
- D. Kim, B. Chu, D. Hong, J. T. Chung, T.-H. Kim, "Design of alternating operation algorithm for tunnel ventilation systems," In Proceedings of the Society of Airconditioning and Refrigerating Engineering of Korea 2005 Summer Conference, pp. 872-877, 2005.