http://dx.doi.org/10.5302/J.ICROS.2003.9.7.527

Design and Implementation of a Behavior-Based Control and Learning Architecture for Mobile Robots  

서일홍 (Hanyang University)
이상훈 (Hanyang University)
김봉오 (Hanyang University)
Publication Information
Journal of Institute of Control, Robotics and Systems / v.9, no.7, 2003, pp. 527-535
Abstract
A behavior-based control and learning architecture is proposed in which reinforcement learning is applied to learn proper associations between stimulus and response, using two types of memory called Short-Term Memory and Long-Term Memory. In particular, to solve the delayed-reward problem, a knowledge-propagation (KP) method is proposed, in which well-designed or well-trained S-R (stimulus-response) associations for low-level sensors are used to learn new S-R associations for high-level sensors, provided that those S-R associations share the same objective, such as obstacle avoidance. To show the validity of the proposed KP method, comparative experiments are performed for the cases in which (i) only a delayed reward is used, (ii) some of the S-R pairs are preprogrammed, (iii) an immediate reward is available, and (iv) the proposed KP method is applied.
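The abstract does not give the update rule behind the KP method, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical, already trusted S-R table for a low-level sensor (`LOW_LEVEL_SR`) and uses agreement with that table as an immediate surrogate reward while learning S-R weights for a high-level sensor, instead of waiting for a delayed reward. All stimulus names, action names, the learning rate `alpha`, and the function names are illustrative assumptions.

```python
# Hypothetical response set for an obstacle-avoidance objective.
ACTIONS = ["forward", "turn_left", "turn_right"]

# Hypothetical well-trained S-R associations for a low-level sensor.
LOW_LEVEL_SR = {
    "clear": "forward",
    "obstacle_left": "turn_right",
    "obstacle_right": "turn_left",
}


def propagate_knowledge(samples, alpha=0.5, epochs=20):
    """Learn S-R weights for a high-level sensor from a low-level S-R table.

    samples: list of (high_level_stimulus, low_level_stimulus) pairs that
    were observed at the same moment.
    Returns a dict mapping each high-level stimulus to per-action weights.
    """
    weights = {}
    for _ in range(epochs):
        for high, low in samples:
            teacher_action = LOW_LEVEL_SR[low]
            row = weights.setdefault(high, {a: 0.0 for a in ACTIONS})
            # Try each candidate response in turn and reinforce it at once:
            # +1 if it agrees with the low-level association, -1 otherwise.
            for action in ACTIONS:
                reward = 1.0 if action == teacher_action else -1.0
                row[action] += alpha * (reward - row[action])
    return weights


def select_response(weights, stimulus):
    """Return the response with the largest learned weight for a stimulus."""
    row = weights[stimulus]
    return max(row, key=row.get)


# Assumed co-occurrences of high-level (e.g. camera) and low-level stimuli.
samples = [
    ("camera_near_left", "obstacle_left"),
    ("camera_near_right", "obstacle_right"),
    ("camera_free", "clear"),
]
learned = propagate_knowledge(samples)
```

After training, `select_response(learned, "camera_near_left")` picks the response that the low-level table prescribes for the co-occurring stimulus. The sketch tries every response deterministically to stay short; a real learner would explore stochastically and would still need the delayed reward to correct cases where the low-level table is wrong.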
Keywords
behavior-based; reinforcement learning; delayed reward; knowledge propagation