Mapless Navigation with Distributional Reinforcement Learning

  • Van Manh Tran (Intelligent Systems and Robotics, Chungbuk National University)
  • Gon-Woo Kim (Intelligent Systems and Robotics, Chungbuk National University)
  • Received : 2023.10.31
  • Accepted : 2023.11.24
  • Published : 2024.02.29

Abstract

This paper studies the distributional perspective on reinforcement learning for mobile robot navigation. Mapless navigation algorithms based on deep reinforcement learning have shown promising performance and high applicability. Because real-world interaction is expensive, trial-and-error training in simulated environments is the preferred way to develop autonomous navigation. Nevertheless, applying a deep reinforcement learning model to real tasks is challenging because the data collected in simulation differ from those collected in the physical world, which leads to risky behavior and a high collision rate. In this paper, we present a distributional reinforcement learning architecture for mapless navigation of a mobile robot that adapts to the uncertainty of environmental change. The experimental results indicate that distributional soft actor-critic outperforms conventional methods.
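To make the distributional perspective concrete, the following is a minimal sketch and not the architecture used in this paper. It assumes the return distribution is represented by a fixed number of quantile estimates trained with a quantile Huber loss, which is one common choice in distributional reinforcement learning. The function names `quantile_huber_loss` and `cvar`, and the parameters `kappa` and `alpha`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss between N predicted quantiles and M target samples.

    Assumption: quantile fractions are the midpoints tau_i = (2i + 1) / (2N).
    """
    pred = np.asarray(pred_quantiles, dtype=float)
    target = np.asarray(target_samples, dtype=float)
    n = pred.shape[0]
    tau = (np.arange(n) + 0.5) / n
    # Pairwise errors u[i, j] = target_j - pred_i.
    u = target[None, :] - pred[:, None]
    abs_u = np.abs(u)
    # Huber penalty: quadratic near zero, linear beyond kappa.
    huber = np.where(abs_u <= kappa, 0.5 * u**2, kappa * (abs_u - 0.5 * kappa))
    # Asymmetric weight that pushes each estimate toward its quantile.
    weight = np.abs(tau[:, None] - (u < 0).astype(float))
    # Average over target samples, sum over predicted quantiles.
    return float((weight * huber / kappa).mean(axis=1).sum())


def cvar(quantiles, alpha=0.25):
    """Mean of the lowest alpha fraction of quantile estimates.

    A risk-averse summary of the return distribution: small alpha focuses
    on the worst outcomes, alpha = 1.0 recovers the ordinary mean.
    """
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * q.shape[0])))
    return float(q[:k].mean())
```

Under these assumptions, a risk-averse policy could rank actions by `cvar` of the predicted quantiles rather than by their mean, so that actions with a heavy tail of low returns (for example, likely collisions) are penalized.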

Acknowledgement

This work was supported by the Technology Innovation Program (or Industrial Strategic Technology Development Program-ATC+) (20009546, Development of service robot core technology that can provide advanced service in real life) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
