The UCT algorithm applied to find the best first move in the game of Tic-Tac-Toe

  • Received : 2015.09.09
  • Accepted : 2015.10.16
  • Published : 2015.10.20

Abstract

The game of Go, which originated in ancient China, is regarded as one of the most difficult challenges in the field of AI. Over the past few years, top computer Go programs based on Monte-Carlo Tree Search (MCTS) have, surprisingly, beaten professional players in handicap games. MCTS evaluates a position by simulating random sequences of legal moves until the game ends, and it has replaced the traditional knowledge-based approach. We applied the UCT (Upper Confidence bounds applied to Trees) algorithm, an MCTS variant, to the game of Tic-Tac-Toe to find the best first move, and compared its result with that of pure MCTS. In addition, to aid understanding of the Upper Confidence Bound (UCB), we introduced the epsilon-Greedy and UCB algorithms for solving the Multi-Armed Bandit problem and compared their performance.
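
The comparison of epsilon-Greedy and UCB on the Multi-Armed Bandit problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Bernoulli reward model, the arm probabilities, the epsilon value, the random seeds, and all function names (`run_bandit`, `epsilon_greedy`, `ucb1`) are assumptions chosen for illustration, and the UCB variant shown is the commonly used UCB1 rule, which may differ from the one used in the paper.

```python
import math
import random


def run_bandit(select_arm, probs, steps, rng):
    """Play a Bernoulli bandit for `steps` pulls.

    Returns (total reward, pull count per arm). `probs` holds the
    (hypothetical) win probability of each arm.
    """
    counts = [0] * len(probs)      # pulls per arm
    values = [0.0] * len(probs)    # running mean reward per arm
    total = 0
    for _ in range(steps):
        arm = select_arm(counts, values, rng)
        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the mean reward of the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return total, counts


def epsilon_greedy(epsilon):
    """Build a selector that explores uniformly with probability `epsilon`."""
    def select(counts, values, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))                     # explore
        return max(range(len(counts)), key=lambda a: values[a])   # exploit
    return select


def ucb1(counts, values, rng):
    """UCB1: try each arm once, then maximise mean + sqrt(2 ln n / n_a)."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    n = sum(counts)  # total pulls made so far
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(n) / counts[a]),
    )


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.8]   # illustrative arm probabilities (assumed)
    steps = 5000
    for name, selector in [("epsilon-greedy", epsilon_greedy(0.1)),
                           ("UCB1", ucb1)]:
        total, counts = run_bandit(selector, probs, steps, random.Random(0))
        print(f"{name}: reward={total}, pulls per arm={counts}")
```

UCT applies the same selection rule at every node of the search tree, treating each child move as an arm, so a selector such as `ucb1` is the piece that would be reused when moving from the bandit setting to Tic-Tac-Toe.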
