Q-Learning Policy Design to Speed Up Agent Training
Yong, Sung-jung (Department of Computer Science and Engineering, Korea University of Technology and Education)
Park, Hyo-gyeong (Department of Computer Science and Engineering, Korea University of Technology and Education)
You, Yeon-hwi (Department of Computer Science and Engineering, Korea University of Technology and Education)
Moon, Il-young (Department of Computer Science and Engineering, Korea University of Technology and Education)
1 | V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with deep reinforcement learning," Proceedings of the 2013 Conference on Neural Information Processing Systems Deep Learning Workshop, California: USA, 2013. |
2 | X. Wang, L. Jin, and H. Wei, "The shortest path planning based on reinforcement learning," Journal of Physics: Conference Series, vol. 1584, 012006, 2020. |
3 | R. S. Sutton and A. G. Barto, "Reinforcement learning: an introduction," MIT Press Cambridge, vol. 135, 1998. |
4 | C. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, pp. 279-292, May 1992. |
5 | J. Clifton and E. Laber, "Q-learning: theory and applications," Annual Review of Statistics and Its Application, vol. 7, pp. 279-301, 2020. |
6 | G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, "OpenAI Gym," Jun. 2016, arXiv [Online]. Available: https://arxiv.org/abs/1606.01540v1. |