http://dx.doi.org/10.7583/JKGS.2021.21.5.17

Comparison of Learning Performance by Reinforcement Learning Agent Visibility Information Difference  

Kim, Chan Sub (School of Games, Hongik University)
Jang, Si-Hwan (Contents Research Division, ETRI)
Yang, Seong-Il (Contents Research Division, ETRI)
Kang, Shin Jin (School of Games, Hongik University)
Abstract
Reinforcement learning, in which an artificial intelligence improves itself to find the best solution to a problem, is a valuable technology in many fields. The game field in particular offers reinforcement learning agents a virtual environment for problem-solving, and agents solve problems in that environment by using observations to gather information about their situation and surroundings. In this experiment, a simplified version of the instant dungeon environment of an RPG was built, and various observation variables related to the field of view were assigned to the agent. The experiment showed how much each variable affects learning speed, and these results can serve as a reference for research on reinforcement learning in RPG games.
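
The abstract does not say how the field-of-view variables were encoded. As a purely illustrative sketch, assuming a toy grid dungeon and a square view window (the cell codes, the function name `local_observation`, and the `view_radius` parameter are all hypothetical and not taken from the paper), a visibility setting can change the observation an agent receives like this:

```python
from typing import List, Tuple

# Hypothetical cell codes for a toy grid dungeon (assumed, not from the paper).
EMPTY, WALL, ENEMY = 0, 1, 2


def local_observation(grid: List[List[int]],
                      agent_pos: Tuple[int, int],
                      view_radius: int) -> List[int]:
    """Return a flattened (2r+1) x (2r+1) window centred on the agent.

    Positions that fall outside the map are reported as WALL. Cells beyond
    view_radius are simply not part of the observation.
    """
    rows, cols = len(grid), len(grid[0])
    agent_row, agent_col = agent_pos
    obs: List[int] = []
    for r in range(agent_row - view_radius, agent_row + view_radius + 1):
        for c in range(agent_col - view_radius, agent_col + view_radius + 1):
            if 0 <= r < rows and 0 <= c < cols:
                obs.append(grid[r][c])
            else:
                obs.append(WALL)  # treat out-of-bounds as solid wall
    return obs


if __name__ == "__main__":
    dungeon = [
        [EMPTY, EMPTY, EMPTY, EMPTY, EMPTY],
        [EMPTY, WALL,  EMPTY, EMPTY, EMPTY],
        [EMPTY, EMPTY, EMPTY, ENEMY, EMPTY],
        [EMPTY, EMPTY, EMPTY, EMPTY, EMPTY],
        [EMPTY, EMPTY, EMPTY, EMPTY, ENEMY],
    ]
    # A wider view radius yields a longer observation that reveals more enemies.
    for radius in (1, 2):
        obs = local_observation(dungeon, (2, 2), radius)
        print(radius, len(obs), obs.count(ENEMY))
```

In this sketch a larger radius gives the policy more information per step but also a larger input, which is the kind of trade-off a comparison of visibility settings would measure.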
Keywords
Reinforcement learning; Proximal policy optimization (PPO); Game agent