Hyper-parameter Optimization for Monte Carlo Tree Search using Self-play

  • Lee, Jin-Seon (Department of Information and Security, Woosuk University)
  • Oh, Il-Seok (Division of Computer Science and Engineering, Jeonbuk National University)
  • Received : 2020.09.10
  • Accepted : 2020.10.25
  • Published : 2020.12.31

Abstract

Monte Carlo tree search (MCTS) is a popular method for implementing intelligent game programs. It has several hyper-parameters that must be optimized to achieve the best performance. Because MCTS is stochastic, this hyper-parameter optimization is difficult. This paper uses the self-play capability of an MCTS-based game program to optimize the hyper-parameters. The method searches for a winner path through the hyper-parameter space during self-play, and the top-q longest winners on that path then compete to determine the final winner. An experiment on the 15-15-5 game (known as Omok in Korean) showed promising results.
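The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail below is an assumption made for illustration: the single hyper-parameter `c` (imagined as an exploration constant), the `strength` function, and `play_match`, which replaces real MCTS self-play with a biased coin flip. "Longest winner" is interpreted here as the configuration that survived the most consecutive challenges, which the abstract does not spell out.

```python
import random
from itertools import combinations


def strength(c):
    # Hypothetical hidden quality of a configuration; peaks at c = 1.4.
    return 1.0 / (1.0 + (c - 1.4) ** 2)


def play_match(a, b, rng, games=21):
    # Toy stand-in for MCTS self-play (assumption): each game is a biased
    # coin flip, and the match goes to whoever wins the majority of games.
    p_a = strength(a) / (strength(a) + strength(b))
    wins_a = sum(rng.random() < p_a for _ in range(games))
    return a if wins_a * 2 > games else b


def winner_path(candidates, rng, steps=200):
    # Walk through the hyper-parameter space: the current champion faces a
    # random challenger, and the match winner becomes the next champion.
    champion = rng.choice(candidates)
    path = [[champion, 0]]  # entries are [config, consecutive wins]
    for _ in range(steps):
        challenger = rng.choice(candidates)
        if challenger == champion:
            continue
        if play_match(champion, challenger, rng) == champion:
            path[-1][1] += 1
        else:
            champion = challenger
            path.append([champion, 1])
    return path


def final_winner(path, q, rng):
    # Keep the q configurations with the longest reigns on the path, then
    # let them play a round-robin; the most match wins decides the result.
    best_reign = {}
    for cfg, reign in path:
        best_reign[cfg] = max(best_reign.get(cfg, 0), reign)
    top = sorted(best_reign, key=best_reign.get, reverse=True)[:q]
    score = {cfg: 0 for cfg in top}
    for a, b in combinations(top, 2):
        score[play_match(a, b, rng)] += 1
    return max(top, key=score.get)


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [0.2, 0.6, 1.0, 1.4, 1.8, 2.2]  # hypothetical grid
    path = winner_path(candidates, rng)
    print("final winner:", final_winner(path, q=3, rng=rng))
```

Because every match is noisy, a weak configuration can occasionally take over the path; the final round-robin among the top-q longest winners is what guards against returning such a lucky champion.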
