
Design and Implementation of a Behavior-Based Control and Learning Architecture for Mobile Robots

  • Published : 2003.07.01

Abstract

A behavior-based control and learning architecture is proposed in which reinforcement learning is applied to learn proper associations between stimulus and response, using two types of memory called Short-Term Memory and Long-Term Memory. In particular, to solve the delayed-reward problem, a knowledge-propagation (KP) method is proposed, in which well-designed or well-trained stimulus-response (S-R) associations for low-level sensors are used to learn new S-R associations for high-level sensors, provided that both sets of associations serve the same objective, such as obstacle avoidance. To show the validity of the proposed KP method, comparative experiments are performed for four cases: (ⅰ) only a delayed reward is used, (ⅱ) some S-R pairs are preprogrammed, (ⅲ) an immediate reward is available, and (ⅳ) the proposed KP method is applied.
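
The abstract does not give the KP update rule, so the following is only a minimal sketch of the general idea under stated assumptions: a trained low-level S-R table acts as a teacher, and whenever a low-level stimulus and a high-level stimulus are observed together, the high-level association for the teacher's preferred response is reinforced immediately instead of waiting for a delayed reward. The class `SRTable`, the function `propagate_knowledge`, the stimulus and response names, and the learning rate are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical response set; the paper's actual behaviors are not listed in the abstract.
RESPONSES = ["forward", "turn_left", "turn_right"]


class SRTable:
    """Association strengths between stimuli and responses (assumed structure)."""

    def __init__(self):
        # Every unseen stimulus starts with zero strength for each response.
        self.weights = defaultdict(lambda: {r: 0.0 for r in RESPONSES})

    def best_response(self, stimulus):
        """Return the response with the highest strength (first one wins ties)."""
        row = self.weights[stimulus]
        return max(RESPONSES, key=lambda r: row[r])

    def reinforce(self, stimulus, response, reward, rate=0.5):
        """Move the association strength a fraction `rate` toward `reward`."""
        row = self.weights[stimulus]
        row[response] += rate * (reward - row[response])


def propagate_knowledge(low, high, paired_observations, rate=0.5):
    """Assumed KP step: for each co-occurring (low, high) stimulus pair,
    reinforce the high-level association for the response that the
    low-level table already prefers."""
    for low_stimulus, high_stimulus in paired_observations:
        teacher_response = low.best_response(low_stimulus)
        high.reinforce(high_stimulus, teacher_response, reward=1.0, rate=rate)


# Low-level (e.g. sonar) associations, treated here as already well trained.
low = SRTable()
low.reinforce("sonar_left_near", "turn_right", 1.0, rate=1.0)
low.reinforce("sonar_right_near", "turn_left", 1.0, rate=1.0)

# High-level (e.g. camera) associations start untrained.
high = SRTable()
pairs = [
    ("sonar_left_near", "camera_obstacle_left"),
    ("sonar_right_near", "camera_obstacle_right"),
] * 3
propagate_knowledge(low, high, pairs)

print(high.best_response("camera_obstacle_left"))   # turn_right
print(high.best_response("camera_obstacle_right"))  # turn_left
```

In this sketch both tables serve the same objective (obstacle avoidance), which matches the condition the abstract places on when low-level associations can guide high-level ones.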
