http://dx.doi.org/10.5391/JKIIS.2005.15.4.406

Area-Based Q-learning Algorithm to Search Target Object of Multiple Robots  

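Yoon, Han-Ul (School of Electrical and Electronics Engineering, Chung-Ang University)
Sim, Kwee-Bo (School of Electrical and Electronics Engineering, Chung-Ang University)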
Publication Information
Journal of the Korean Institute of Intelligent Systems / v.15, no.4, 2005, pp. 406-411
Abstract
In this paper, we present area-based Q-learning for searching for a target object with multiple robots. To search for the target in a Markovian space, each robot must recognize its surroundings at its current location and generate its own rules for acting. Under area-based Q-learning, a robot first obtains six distances from itself to the environment using infrared sensors arranged hexagonally around its body. Second, it calculates six areas from those distances and then takes an action, i.e., it turns and moves toward the direction that guarantees the widest space. After the action is taken, the Q-value is updated by the relevant formula at that state. We set up an experimental environment with five small mobile robots, obstacles, and a target object, and had the robots search for the target object while navigating an unknown hallway in which some obstacles were placed. At the end of this paper, we present the results of three algorithms: random search, area-based action making (ABAM), and hexagonal area-based Q-learning.
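The sense, compute, act, and update cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the area formula or the Q update rule, so the sketch assumes that each area is the triangle spanned by two adjacent sensor rays 60 degrees apart, and it substitutes the standard tabular Q-learning update. The names `sector_areas`, `widest_sector`, and `q_update`, and the parameters `alpha` and `gamma`, are hypothetical.

```python
import math

# Assumption: six infrared sensors spaced evenly, so adjacent rays are 60 degrees apart.
SECTOR_ANGLE = math.radians(60)


def sector_areas(distances):
    """Return six triangle areas, one per pair of adjacent sensor rays.

    Assumption: the area of sector i is the triangle with sides
    distances[i] and distances[i + 1] and an included angle of 60 degrees.
    """
    if len(distances) != 6:
        raise ValueError("expected exactly six distance readings")
    return [
        0.5 * distances[i] * distances[(i + 1) % 6] * math.sin(SECTOR_ANGLE)
        for i in range(6)
    ]


def widest_sector(distances):
    """Return the index (0..5) of the sector with the largest area."""
    areas = sector_areas(distances)
    return max(range(6), key=areas.__getitem__)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply the standard tabular Q-learning update (an assumption here).

    q is a dict mapping (state, action) to a value; missing entries count as 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in range(6))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical readings: open space lies between sensors 4 and 5.
readings = [1.0, 1.0, 1.0, 1.0, 4.0, 4.0]
action = widest_sector(readings)  # sector to turn and move toward
q_table = {}
q_update(q_table, state=0, action=action, reward=1.0, next_state=1)
```

In this sketch the action is always the widest sector, which corresponds most closely to area-based action making; a learning variant would presumably also consult the Q-table when choosing the action, but the abstract does not specify how.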
Keywords
area-based action making; Q-learning; hexagonal area-based Q-learning