Automatic Adaptive Space Segmentation for Reinforcement Learning

  • Komori, Yuki (Department of Computer Sciences and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University) ;
  • Notsu, Akira (Department of Computer Sciences and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University) ;
  • Honda, Katsuhiro (Department of Computer Sciences and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University) ;
  • Ichihashi, Hidetomo (Department of Computer Sciences and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University)
  • Received : 2012.02.28
  • Accepted : 2012.03.12
  • Published : 2012.03.25

Abstract

To develop a new adaptive, automatic method of situation-space segmentation, we ran single-pendulum simulations and observed how several types of situation-space segmentation influence the reinforcement learning process. Segmentation is performed by the Contraction Algorithm and the Cell Division Approach, and it is automated by an "entropy" measure defined on the distributions of action values. Simulation results demonstrate the influence of the segmentation types and the adaptability of the proposed method.
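
Although the abstract does not reproduce the paper's formulas, the idea of automating segmentation with an entropy defined on action-value distributions can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the authors' implementation: the softmax conversion of a cell's Q-values into a probability distribution, the temperature, and the thresholds divide_above and contract_below are all introduced here for the example.

    import numpy as np

    def action_value_entropy(q_values, temperature=1.0):
        """Shannon entropy of the softmax distribution over one cell's action values."""
        prefs = np.asarray(q_values, dtype=float) / temperature
        prefs -= prefs.max()              # stabilize the exponentials
        probs = np.exp(prefs)
        probs /= probs.sum()
        nz = probs > 0                    # treat 0 * log(0) as 0
        return float(-np.sum(probs[nz] * np.log(probs[nz])))

    def segmentation_decision(q_values, divide_above=1.2, contract_below=0.3):
        """Toy rule: divide ambiguous cells, merge decided ones, keep the rest.

        The thresholds are hypothetical values chosen for this example, not
        parameters reported in the paper.
        """
        h = action_value_entropy(q_values)
        if h > divide_above:
            return "divide"      # near-uniform action values: cell is too coarse
        if h < contract_below:
            return "contract"    # one action clearly dominates: cell can be merged
        return "keep"

    # Example: a cell with no clear best action vs. one with a dominant action.
    print(segmentation_decision([0.1, 0.1, 0.1, 0.1]))   # -> "divide"
    print(segmentation_decision([5.0, 0.0, 0.0, 0.0]))   # -> "contract"

Under this reading, a near-uniform action-value distribution (high entropy) marks a cell that is too coarse and is a candidate for the Cell Division Approach, while a sharply peaked one (low entropy) marks a region the Contraction Algorithm could merge.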

References

  1. R. S. Sutton, "Learning to Predict by the Methods of Temporal Differences," Machine Learning, vol. 3, no. 1, pp. 9-44, 1988.
  2. T. Jaakkola, M. I. Jordan, and S. P. Singh, "On the Convergence of Stochastic Iterative Dynamic Programming Algorithms," Neural Computation, vol. 6, no. 6, pp. 1185-1201, 1994.
  3. C. J. C. H. Watkins and P. Dayan, "Technical Note: Q-Learning," Machine Learning, vol. 8, pp. 279-292, 1992.
  4. Y. Kashimura, A. Ueno, and S. Tatsumi, "A Continuous Action Space Representation by Particle Filter for Reinforcement Learning," Proc. JSAI2008, pp. 118-121, 2008.
  5. A. Notsu, K. Honda, H. Ichihashi, and H. Wada, "Contraction Algorithm in State and Action Space for Q-learning," Proc. of SCIS&ISIS, pp. 93-96, 2009.
  6. A. Notsu, H. Wada, K. Honda, and H. Ichihashi, "Cell Division Approach for Search Space in Reinforcement Learning," International Journal of Computer Science and Network Security, vol. 8, no. 6, 2008.
  7. A. Ito and M. Kanabuchi, "Speeding up Multi-Agent Reinforcement Learning by Coarse-Graining of Perception: Hunter Game as an Example," IEICE Trans. D-I, vol. J84-D-I, no. 3, pp. 285-293, 2001.
  8. M. Nagayoshi, H. Murao, and H. Tamaki, "Switching Reinforcement Learning to Mimic an Infant's Motor Development: Application to Two-dimensional Continuous Action Space," Proc. SICE Annual Conference 2010 (SICE 2010), pp. 243-246, 2010.