The Korean Institute of Electrical Engineers: Proceedings of the KIEE Conference
- Proceedings of the 36th KIEE Summer Conference, 2005, Volume D
- Pages 2933-2935
- 2005
Robot locomotion via IRPO based Actor-Critic Learning Method
- Kim, Jong-Ho (Dept. of Control & Instrumentation Engineering, Korea University)
- Kang, Dae-Sung (Dept. of Control & Instrumentation Engineering, Korea University)
- Park, Joo-Young (Dept. of Control & Instrumentation Engineering, Korea University)
- Published: 2005.07.18
Abstract
The IRPO (Intensive Randomized Policy Optimizer) algorithm is a recently developed tool in the area of reinforcement learning, and it has been shown to be very successful in several application problems. Compared with general RL methods, IRPO differs in that its policy utilizes the entire history of agent-environment interaction. The policy is derived directly from this history, not through any kind of model of the environment. In this paper, we consider a robot-control problem using the IRPO algorithm. We also developed a MATLAB-based animation program, with which the effectiveness of the training algorithms was observed.
Keywords