Learning Relational Instance-Based Policies from User Demonstrations


  • Chan-Young Park (Dept. of Computer Science, Kyonggi University)
  • Hyun-Sik Kim (Dept. of Computer Science, Kyonggi University)
  • In-Cheol Kim (Dept. of Computer Science, Kyonggi University)
  • Received : 2010.02.04
  • Accepted : 2010.02.23
  • Published : 2010.05.15

Abstract

Demonstration-based learning has the advantage that a user can easily teach a robot new task knowledge simply by demonstrating how to perform the task. However, many previous demonstration-based learning techniques used an attribute-value vector model to represent their state spaces and policies. Due to the limitations of this model, they suffered from both low efficiency of the learning process and low reusability of the learned policy. In this paper, we present a new demonstration-based learning method in which a relational model is adopted in place of the attribute-value model. By applying relational instance-based learning to training examples extracted from records of user demonstrations, the method derives a relational instance-based policy that can easily be reused for other similar tasks in the same domain. A relational policy maps a context, represented as a (state, goal) pair, to the corresponding action to be executed. We give a detailed explanation of our demonstration-based relational policy learning method and then analyze its effectiveness through experiments using a robot simulator.
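The core idea described above, a policy that maps a relational (state, goal) context to the action demonstrated in the most similar stored example, can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes contexts are represented as sets of ground relational facts and uses Jaccard similarity as a stand-in for whatever relational distance the paper defines; all class, predicate, and action names are hypothetical.

```python
# Minimal sketch of a relational instance-based policy (illustrative only).
# Each stored instance pairs a (state, goal) context, given as sets of ground
# relational facts, with the action the user demonstrated in that context.
# At query time the policy returns the action of the most similar instance.
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    state: frozenset   # e.g. {"on(a, table)", "clear(a)"}
    goal: frozenset    # e.g. {"holding(a)"}
    action: str        # e.g. "pickup(a)"


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity between two fact sets (1.0 = identical), a simple stand-in
    for a relational distance measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class RelationalInstancePolicy:
    """Maps a (state, goal) context to the action of its nearest stored instance."""

    def __init__(self) -> None:
        self.instances: list[Instance] = []

    def add_demonstration(self, state, goal, action) -> None:
        """Store one training example extracted from a demonstration record."""
        self.instances.append(Instance(frozenset(state), frozenset(goal), action))

    def act(self, state, goal) -> str:
        """Return the demonstrated action of the most similar (state, goal) context."""
        state, goal = frozenset(state), frozenset(goal)
        best = max(
            self.instances,
            key=lambda ins: jaccard(state, ins.state) + jaccard(goal, ins.goal),
        )
        return best.action


# Usage: two demonstrated contexts, then a query in a slightly different state.
policy = RelationalInstancePolicy()
policy.add_demonstration(
    {"on(a, table)", "clear(a)", "handempty()"}, {"holding(a)"}, "pickup(a)")
policy.add_demonstration(
    {"holding(a)", "clear(b)"}, {"on(a, b)"}, "stack(a, b)")
print(policy.act({"on(a, table)", "clear(a)", "clear(b)", "handempty()"},
                 {"holding(a)"}))   # -> "pickup(a)"
```

Because the stored facts are relational rather than fixed-length attribute vectors, the same stored instances remain usable when the query involves different or additional objects, which is the reusability property the abstract emphasizes.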
