[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.14702/JPEE.2022.219

Q-Learning Policy Design to Speed Up Agent Training

Yong, Sung-jung (Department of Computer Science and Engineering, Korea University of Technology and Education)
Park, Hyo-gyeong (Department of Computer Science and Engineering, Korea University of Technology and Education)
You, Yeon-hwi (Department of Computer Science and Engineering, Korea University of Technology and Education)
Moon, Il-young (Department of Computer Science and Engineering, Korea University of Technology and Education)

Publication Information

Journal of Practical Engineering Education / v.14, no.1, 2022, pp. 219-224

Abstract

Q-Learning is widely used as a basic algorithm for reinforcement learning. It trains the agent to maximize reward through greedy actions: in the current state, the agent selects the action with the largest estimated value among the actions available to it. In this paper, we study a policy that speeds up agent training with Q-Learning in the Frozen Lake 8×8 grid environment. We also compare the training results of the standard Q-Learning algorithm with those of an algorithm that assigns a 'direction' attribute to agent movement. The analysis indicates that the proposed Q-Learning policy significantly improves both accuracy and training speed compared with the standard algorithm.
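For readers unfamiliar with the baseline, the greedy selection and value update described in the abstract can be sketched as follows. This is a minimal illustration of standard tabular Q-Learning only, not the policy proposed in the paper: the abstract does not specify how the 'direction' attribute is used, so it is not reproduced here. The 4×4 deterministic grid, the hyperparameters (alpha, gamma, epsilon), and all function names are assumptions chosen for illustration, and the sketch does not use OpenAI Gym or the Frozen Lake 8×8 environment.

```python
import random

# Hypothetical 4x4 deterministic grid (NOT the paper's Frozen Lake 8x8 setup):
# S = start, F = frozen (safe), H = hole (episode ends, reward 0), G = goal (reward 1).
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
SIZE = 4
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up as (d_row, d_col)


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), SIZE - 1)  # moves into a wall leave the agent in place
    col = min(max(col + d_col, 0), SIZE - 1)
    cell = GRID[row][col]
    return row * SIZE + col, (1.0 if cell == "G" else 0.0), cell in "HG"


def greedy_action(q_row, rng):
    """Pick the action with the largest Q-value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Standard tabular Q-Learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))   # explore
            else:
                action = greedy_action(q[state], rng)  # exploit
            next_state, reward, done = step(state, action)
            # Q-Learning target: reward plus discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_rollout(q, max_steps=50):
    """Follow the learned greedy policy from the start; True if the goal is reached."""
    rng = random.Random(0)
    state = 0
    for _ in range(max_steps):
        state, reward, done = step(state, greedy_action(q[state], rng))
        if done:
            return reward == 1.0
    return False
```

Calling `greedy_rollout(train())` trains the table and then checks whether the purely greedy policy reaches the goal from the start cell. Any modified policy, such as one that adds a movement attribute, would change how `action` is chosen or how `reward` is shaped inside the training loop.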

Keywords

OpenAI Gym; Q-Learning; Reinforcement Learning; Reward Policy; Training;

