Path Planning for a Robot Manipulator based on Probabilistic Roadmap and Reinforcement Learning

  • Park, Jung-Jun (Department of Mechanical Engineering, Korea University)
  • Kim, Ji-Hun (Department of Mechanical Engineering, Korea University)
  • Song, Jae-Bok (Department of Mechanical Engineering, Korea University)
  • Published: 2007.12.31

Abstract

The probabilistic roadmap (PRM) method, a popular path planning scheme for manipulators, finds a collision-free path by connecting the start and goal poses through a roadmap constructed from random nodes drawn in the free configuration space. PRM exhibits robust performance in static environments, but performs poorly in dynamic ones. Reinforcement learning, a behavior-based control technique, can in contrast deal with uncertainties in the environment: through iterative interactions with the environment, a reinforcement learning agent establishes a policy that maximizes the sum of rewards by selecting the optimal action in each state. In this paper, we propose an efficient real-time path planning scheme that combines PRM and reinforcement learning to handle uncertain dynamic environments as well as environments similar to previously learned ones. A series of experiments demonstrates that the proposed hybrid path planner can generate a collision-free path even in dynamic environments where objects block the pre-planned global path. It is also shown that the hybrid planner adapts to similar, previously learned environments without significant additional learning.
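To make the roadmap construction and query concrete, the sketch below is a minimal 2D PRM in Python. It is an illustration under stated assumptions, not the paper's implementation: the circular obstacles, node count, neighbor count k, and all helper names (collision_free, edge_free, build_roadmap, shortest_path) are our own, and a real manipulator planner samples the joint configuration space rather than the plane.

```python
import heapq
import math
import random

# Hypothetical 2D workspace: obstacles modeled as (center, radius) circles.
OBSTACLES = [((0.5, 0.5), 0.2), ((0.2, 0.8), 0.1)]

def collision_free(p):
    """A configuration is free if it lies outside every obstacle."""
    return all(math.dist(p, c) > r for c, r in OBSTACLES)

def edge_free(p, q, step=0.02):
    """Validate a straight edge by sampling intermediate points."""
    n = max(1, int(math.dist(p, q) / step))
    return all(
        collision_free(((1 - t / n) * p[0] + (t / n) * q[0],
                        (1 - t / n) * p[1] + (t / n) * q[1]))
        for t in range(n + 1)
    )

def build_roadmap(n_nodes=200, k=8):
    """Learning phase: draw random free nodes, link k nearest neighbors."""
    nodes = []
    while len(nodes) < n_nodes:
        p = (random.random(), random.random())
        if collision_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(n_nodes)}
    for i, p in enumerate(nodes):
        nbrs = sorted(range(n_nodes), key=lambda j: math.dist(p, nodes[j]))[1:k + 1]
        for j in nbrs:
            if edge_free(p, nodes[j]):
                d = math.dist(p, nodes[j])
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges

def shortest_path(edges, start, goal):
    """Query phase: Dijkstra over the roadmap graph."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if goal != start and goal not in prev:
        return None                      # start and goal are not connected
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

nodes, edges = build_roadmap()
# Connect the start and goal poses to their nearest roadmap nodes
# (a full planner would also collision-check these connections).
start_pose, goal_pose = (0.05, 0.05), (0.95, 0.95)
s = min(range(len(nodes)), key=lambda i: math.dist(start_pose, nodes[i]))
g = min(range(len(nodes)), key=lambda i: math.dist(goal_pose, nodes[i]))
print(shortest_path(edges, s, g))
```

The two phases mirror the abstract: build_roadmap draws random nodes in the free space and connects them off-line, while shortest_path answers a start-to-goal query over the fixed graph. Because the graph is fixed, a moving obstacle that invalidates roadmap edges defeats the scheme, which is what motivates the reinforcement-learning component.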
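The reinforcement-learning half can be sketched in the same hedged spirit: the snippet below runs tabular Q-learning on a toy grid. The grid, rewards, and hyperparameters ALPHA, GAMMA, and EPS are illustrative assumptions; they serve only to show how an agent, through iterative interaction, converges on a policy that maximizes the cumulative reward, as the abstract describes.

```python
import random

# Toy 5x5 grid standing in for the local planner's state space; the
# blocked cells, rewards, and hyperparameters are illustrative assumptions.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
GOAL, BLOCKED = (4, 4), {(2, 2), (2, 3), (3, 2)}
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1        # learning rate, discount, exploration

Q = {((x, y), a): 0.0 for x in range(5) for y in range(5) for a in ACTIONS}

def step(s, a):
    """One interaction: collisions are penalized, reaching the goal rewarded."""
    nxt = (min(4, max(0, s[0] + a[0])), min(4, max(0, s[1] + a[1])))
    if nxt in BLOCKED:
        return s, -10.0                  # blocked: stay put, negative reward
    return nxt, (100.0 if nxt == GOAL else -1.0)

for _ in range(2000):                    # episodes of trial and error
    s = (0, 0)
    for _ in range(200):                 # cap episode length
        if s == GOAL:
            break
        a = (random.choice(ACTIONS) if random.random() < EPS
             else max(ACTIONS, key=lambda act: Q[(s, act)]))
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# Greedy policy after learning: the best action in each state.
policy = {(x, y): max(ACTIONS, key=lambda act: Q[((x, y), act)])
          for x in range(5) for y in range(5)}
print(policy[(0, 0)])
```

After training, the greedy policy steers around the blocked cells; in the paper's hybrid planner, a policy learned this way would provide local collision avoidance when dynamic obstacles block the pre-planned PRM path.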
