Comparison of Learning Performance by Reinforcement Learning Agent Visibility Information Difference
Kim, Chan Sub (School of Games, Hongik University)
Jang, Si-Hwan (Contents Research Division, ETRI)
Yang, Seong-Il (Contents Research Division, ETRI)
Kang, Shin Jin (School of Games, Hongik University)
1 | Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., ... & Silver, D. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350-354. |
2 | Sutton, R. S., McAllester, D. A., Singh, S. P., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (pp. 1057-1063). |
3 | Unity Engine, https://www.unity.com/ |
4 | Marzian, F., & Qamal, M. (2017). Game RPG "The Royal Sword" Berbasis Desktop Dengan Menggunakan Metode Finite State Machine (FSM). Jurnal Sistem Informasi, 1(2). |
5 | Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359. |
6 | Sangbin Moon, "Generation of progamer level Bimu AI using reinforcement learning in Blade and Soul", NDC, last modified Jul 24, 2019, accessed May 26, 2021, |
7 | Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009, June). Curriculum learning. In Proceedings of the 26th annual international conference on machine learning (pp. 41-48). |
8 | Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347. |
9 | Soo Yeong Jang, et al. Deep reinforcement learning technology trends, ETRI Electronics and Telecommunications Trends, 34.4 (2019):1-14. |
10 | PyTorch library, https://pytorch.org/ |
11 | ZeroMQ library, https://zeromq.org/ |
12 | TensorBoard, https://www.tensorflow.org/tensorboard |
13 | Taehoon Kim, "Implementing Cookie Run AI that is better than me with deep learning and reinforcement learning", SlideShare, last modified Oct 25, 2016, accessed May 24, 2021, https://www.slideshare.net/carpedm20/ai-67616630. |
14 | Stable Baselines 3, https://github.com/DLR-RM/stable-baselines3 |
15 | Schulman, J., Levine, S., Abbeel, P., Jordan, M., & Moritz, P. (2015, June). Trust region policy optimization. In International conference on machine learning (pp. 1889-1897). PMLR. |