Institute of Control, Robotics and Systems: Conference Proceedings
- Proceedings of the 15th Annual Conference of the Institute of Control, Robotics and Systems, 2000
- Pages 489-489
- 2000
Labeling Q-Learning for Maze Problems with Partially Observable States
- Lee, Hae-Yeon (Dept. of Electrical and Communication Engineering, Graduate School of Engineering, Tohoku Univ.) ;
- Kamaya, Hiroyuki (Dept. of Electrical Engineering, Hachinohe National College of Technology) ;
- Abe, Kenichi (Dept. of Electrical and Communication Engineering, Graduate School of Engineering, Tohoku Univ.)
- Published: 2000.10.01
Abstract
Recently, Reinforcement Learning (RL) methods have been used for learning problems in Partially Observable Markov Decision Process (POMDP) environments. Conventional RL methods, however, have limited applicability to POMDPs. To overcome the partial observability, several algorithms have been proposed [5], [7]. The aim of this paper is to extend our previous algorithm for POMDPs, called Labeling Q-learning (LQ-learning), which compensates for incomplete perceptual information by labeling. Namely, in LQ-learning, the agent perceives the current state as a pair of an observation and its label, so the agent can distinguish more exactly between states that yield the same observation. Labeling is carried out by a hash-like function, which we call the Labeling Function (LF). Numerous labeling functions can be considered, but in this paper we introduce several labeling functions based on only the 2 or 3 immediately preceding observations. We briefly introduce the basic idea of LQ-learning, apply it to maze problems, which are simple POMDP environments, and demonstrate its effectiveness with empirical results that compare favorably with conventional RL algorithms.
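To make the mechanism concrete, the following is a minimal Python sketch of the idea stated in the abstract: the agent's perceptual state is the pair (observation, label), the label is produced by a hash-like labeling function over a short window of immediately past observations, and ordinary Q-learning runs on the augmented states. The class name LQAgent, the parameter values, and the particular hash-based labeling function are illustrative assumptions, not the authors' implementation.

    import random
    from collections import defaultdict, deque

    class LQAgent:
        """Q-learning over augmented states (observation, label).

        A hypothetical sketch of the LQ-learning idea, not the paper's code:
        the label comes from a hash-like Labeling Function (LF) applied to
        the most recent observations, so aliased observations reached
        through different recent histories can be told apart.
        """

        def __init__(self, n_actions, history_len=2, n_labels=4,
                     alpha=0.1, gamma=0.95, epsilon=0.1):
            self.n_actions = n_actions
            self.n_labels = n_labels
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
            # Window of the 2 (or 3) most recent observations fed to the LF.
            self.history = deque(maxlen=history_len)
            # Q-table indexed by the augmented state (observation, label).
            self.Q = defaultdict(lambda: [0.0] * n_actions)

        def label(self, obs):
            # Hash-like labeling function: map the recent observation
            # sequence plus the current observation to one of n_labels.
            return hash(tuple(self.history) + (obs,)) % self.n_labels

        def perceive(self, obs):
            # The agent perceives the pair (observation, label).
            state = (obs, self.label(obs))
            self.history.append(obs)
            return state

        def act(self, state):
            # Epsilon-greedy action selection on the augmented state.
            if random.random() < self.epsilon:
                return random.randrange(self.n_actions)
            q = self.Q[state]
            return q.index(max(q))

        def update(self, state, action, reward, next_state):
            # Standard one-step Q-learning backup on (obs, label) states.
            target = reward + self.gamma * max(self.Q[next_state])
            self.Q[state][action] += self.alpha * (target - self.Q[state][action])

In a maze, obs could be the local wall configuration around the agent; perceptually aliased cells that share the same obs but are reached through different recent observation sequences then map to different (obs, label) rows of the Q-table.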
Keywords