Robot locomotion via IRPO based Actor-Critic Learning Method

Robot Locomotion Using an IRPO-Based Actor-Critic Learning Technique (Korean title: IRPO 기반 Actor-Critic 학습 기법을 이용한 로봇이동)

  • Kim, Jong-Ho (Dept. of Control & Instrumentation Engineering, Korea University) ;
  • Kang, Dae-Sung (Dept. of Control & Instrumentation Engineering, Korea University) ;
  • Park, Joo-Young (Dept. of Control & Instrumentation Engineering, Korea University)
  • 김종호 (고려대학교 제어계측공학과) ;
  • 강대성 (고려대학교 제어계측공학과) ;
  • 박주영 (고려대학교 제어계측공학과)
  • Published : 2005.07.18

Abstract

The IRPO (Intensive Randomized Policy Optimizer) algorithm is a recently developed tool in the area of reinforcement learning, and it has been shown to be very successful in several application problems. Compared with general RL methods, IRPO differs in that its policy utilizes the entire history of agent-environment interaction. The policy is derived directly from this history, not through any kind of model of the environment. In this paper, we consider a robot-control problem using an IRPO algorithm. We also developed a MATLAB-based animation program, with which the effectiveness of the training algorithm was observed.
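
As a rough illustration of what "deriving the policy directly from the stored interaction history" can look like, the following Python sketch trains a tiny actor-critic on a toy one-step problem while replaying past samples. It is not the authors' IRPO implementation and does not model a robot. The replay scheme with capped importance weights is an assumed, generic technique, and every name and parameter in it (`train`, `log_gauss`, `cap`, the quadratic reward, the learning rates) is hypothetical.

```python
import math
import random


def log_gauss(a, mu, sigma):
    """Log-density of a Gaussian policy N(mu, sigma^2) at action a."""
    return -0.5 * ((a - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def train(steps=2000, replays=4, target=2.0, sigma=0.5,
          lr_actor=0.01, lr_critic=0.05, cap=2.0, seed=0):
    """Toy actor-critic with replay of the full interaction history.

    Assumptions (not from the paper): a one-step task with reward
    -(a - target)^2, a Gaussian policy with fixed sigma, a scalar
    critic v used as a baseline, and importance weights capped at `cap`.
    """
    rng = random.Random(seed)
    mu, v = 0.0, 0.0   # actor parameter (policy mean) and critic estimate
    history = []       # every (action, reward, policy mean at sampling time)

    def update(a, r, w):
        nonlocal mu, v
        advantage = r - v
        # Actor: weighted policy-gradient step for a Gaussian mean.
        mu += lr_actor * w * advantage * (a - mu) / sigma ** 2
        # Critic: move the baseline toward the observed reward.
        v += lr_critic * w * advantage

    for _ in range(steps):
        # Interact with the (model-free) environment once.
        a = rng.gauss(mu, sigma)
        r = -(a - target) ** 2
        history.append((a, r, mu))
        update(a, r, 1.0)

        # Reuse past experience: reweight old samples for the current policy.
        for _ in range(replays):
            a_old, r_old, mu_old = rng.choice(history)
            log_ratio = log_gauss(a_old, mu, sigma) - log_gauss(a_old, mu_old, sigma)
            w = math.exp(min(log_ratio, math.log(cap)))  # capped importance weight
            update(a_old, r_old, w)

    return mu, v, history
```

Calling `train()` returns the learned policy mean, the critic's baseline, and the full history. In this toy setting the mean should drift from 0 toward the assumed target of 2.0 without any environment model being built. Capping the weights trades some bias for lower variance when old samples are reused.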

Keywords