Real-Time Path Planning for Mobile Robots Using Q-Learning

  • Kim, Ho-Won (Dept. of Smart Robot Convergence and Application Engineering, Pukyong National University)
  • Lee, Won-Chang (Dept. of Electronic Engineering, Pukyong National University)
  • Received : 2020.11.26
  • Accepted : 2020.12.18
  • Published : 2020.12.31

Abstract

Reinforcement learning has been applied mainly to sequential decision-making problems. In recent years, in particular, reinforcement learning combined with neural networks has produced successful results in areas that previously could not be solved. However, reinforcement learning with deep neural networks has the drawback of being too complex for immediate use in the field. In this paper, we implemented a path planning algorithm for mobile robots using Q-learning, one of the easier-to-train reinforcement learning algorithms. Because Q-learning with a Q-table built entirely in advance has clear limitations, we used real-time Q-learning, which updates the Q-table in real time. By adjusting the exploration strategy, we obtained the learning speed required for real-time Q-learning. Finally, we compared the performance of real-time Q-learning with that of DQN.
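
The page itself carries no source code, so the following is a minimal Python sketch of the kind of tabular Q-learning with epsilon-greedy exploration summarized above. The grid size, obstacle layout, reward values, and hyperparameters (ALPHA, GAMMA, EPSILON) are illustrative assumptions, not the authors' settings; the update is applied online at every step, in the spirit of the real-time Q-learning the abstract describes.

# Minimal sketch of tabular Q-learning with epsilon-greedy exploration on a grid map.
# All values below (grid size, obstacles, rewards, ALPHA, GAMMA, EPSILON) are
# illustrative assumptions, not the settings used in the paper.
import random

GRID_W, GRID_H = 10, 10                       # assumed map size
GOAL = (9, 9)                                 # assumed goal cell
OBSTACLES = {(3, 3), (3, 4), (4, 3)}          # assumed obstacle cells
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # move up, down, right, left
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # assumed learning rate, discount, exploration rate

# Q-table: one list of action values per grid cell, initialized to zero.
Q = {(x, y): [0.0] * len(ACTIONS) for x in range(GRID_W) for y in range(GRID_H)}

def step(state, action_idx):
    """Apply an action and return (next_state, reward, done)."""
    dx, dy = ACTIONS[action_idx]
    nx, ny = state[0] + dx, state[1] + dy
    if not (0 <= nx < GRID_W and 0 <= ny < GRID_H) or (nx, ny) in OBSTACLES:
        return state, -1.0, False             # blocked move: stay put, small penalty
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0, True           # goal reached: positive reward, episode ends
    return (nx, ny), -0.1, False              # ordinary move: small step cost

def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[state][a])

def run_episode(start=(0, 0), max_steps=500):
    """One episode; the Q-table is updated online at every step (real-time update)."""
    state = start
    for _ in range(max_steps):
        a = choose_action(state)
        next_state, reward, done = step(state, a)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        td_target = reward if done else reward + GAMMA * max(Q[next_state])
        Q[state][a] += ALPHA * (td_target - Q[state][a])
        state = next_state
        if done:
            break

for _ in range(200):                          # assumed number of training episodes
    run_episode()

As the abstract notes, the exploration strategy is the main lever on learning speed; one common adjustment, for example, is to decay EPSILON as training progresses so the robot exploits its Q-table sooner. The schedule actually used by the authors is not given on this page.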

Cited by

  1. Deep Reinforcement Learning-Based Network Routing Technology for Data Recovery in Exa-Scale Cloud Distributed Clustering Systems, vol.11, no.18, 2021, https://doi.org/10.3390/app11188727