Search | Korea Science

Min Kyong Kim; Beom Seuk Hwang
The Korean Journal of Applied Statistics / v.37 no.5 / pp.663-673 / 2024
The multi-armed bandit (MAB) problem involves selecting actions to maximize rewards within dynamic environments. This study explores the application of Thompson sampling, a robust MAB algorithm, within the context of big data analytics and statistical learning theory. Using large-scale banner click data from recommendation systems, we evaluate Thompson sampling's performance across various simulated scenarios, employing advanced approximation techniques. Our findings demonstrate that Thompson sampling, particularly with Langevin Monte Carlo approximation, maintains robust performance and scalability in big data environments. This underscores its practical significance and adaptability to contemporary challenges in statistical learning.
https://doi.org/10.5351/KJAS.2024.37.5.663
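
As a rough illustration of the Thompson sampling idea named in the abstract above, here is a minimal sketch that assumes Bernoulli (click / no click) rewards with conjugate Beta posteriors. It is not the paper's method: the abstract reports approximation techniques such as Langevin Monte Carlo, whereas this sketch samples the posterior exactly. The function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the click rates are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling on simulated click data.

    true_rates: hypothetical click probability of each arm (banner).
    Returns (pulls per arm, observed clicks per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms  # observed clicks per arm
    failures = [0] * n_arms   # observed non-clicks per arm

    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta(1 + s, 1 + f) posterior
        # (uniform Beta(1, 1) prior assumed).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm whose sampled click rate is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward from the hidden true rate.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    pulls = [successes[a] + failures[a] for a in range(n_arms)]
    return pulls, successes


if __name__ == "__main__":
    pulls, clicks = thompson_sampling([0.1, 0.5, 0.9], n_rounds=2000)
    print("pulls per arm:", pulls)
    print("clicks per arm:", clicks)
```

Because each arm is chosen in proportion to the posterior probability that it is the best, the pull counts should concentrate on the highest-rate arm as evidence accumulates.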

Lee, Byung-Doo; Park, Dong-Soo; Choi, Young-Wook
Journal of Korea Game Society / v.15 no.5 / pp.109-118 / 2015
The game of Go, which originated in ancient China, is regarded as one of the most difficult challenges in the field of AI. Over the past few years, the top computer Go programs based on Monte Carlo tree search (MCTS) have surprisingly beaten professional players in handicap games. MCTS simulates random sequences of legal moves until the game ends, and it has replaced the traditional knowledge-based approach. We applied the UCT algorithm, an MCTS variant, to the game of Tic-Tac-Toe to find the best first move, and compared its result with that of pure MCTS. Furthermore, to understand UCB, we introduced the epsilon-greedy and UCB algorithms for solving the multi-armed bandit problem and compared their performance.
https://doi.org/10.7583/JKGS.2015.15.5.109
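
To make the epsilon-greedy versus UCB comparison mentioned in the abstract above more concrete, the following is a minimal sketch on a simulated Bernoulli bandit. It assumes the standard UCB1 index (empirical mean plus `sqrt(2 ln t / n)`); the abstract does not say which UCB variant or which parameters the authors used, so the names `run_bandit`, `epsilon_greedy`, and `ucb1`, the value of epsilon, and the reward rates are hypothetical.

```python
import math
import random


def run_bandit(select, true_rates, n_rounds, seed=0):
    """Play a simulated Bernoulli bandit with the given arm-selection rule."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms    # number of pulls per arm
    means = [0.0] * n_arms   # running mean reward per arm
    total = 0.0
    for t in range(1, n_rounds + 1):
        arm = select(counts, means, t, rng)
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return counts, total


def epsilon_greedy(epsilon):
    """Explore uniformly with probability epsilon, otherwise exploit."""
    def select(counts, means, t, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))
        return max(range(len(counts)), key=lambda a: means[a])
    return select


def ucb1(counts, means, t, rng):
    """UCB1: pull each arm once, then maximize mean + exploration bonus."""
    for a, n in enumerate(counts):
        if n == 0:
            return a
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


if __name__ == "__main__":
    rates = [0.1, 0.5, 0.9]  # hypothetical reward probabilities
    print("epsilon-greedy:", run_bandit(epsilon_greedy(0.1), rates, 2000))
    print("UCB1:          ", run_bandit(ucb1, rates, 2000))
```

Epsilon-greedy keeps exploring at a fixed rate forever, while the UCB1 bonus shrinks for frequently pulled arms, so exploration tapers off as the estimates become reliable. UCT applies the same kind of index at each node of the search tree.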