An Auto Obstacle Collision Avoidance System using Reinforcement Learning and Motion VAE

  • Zheng Si (Department of Computer Science, Hanyang University) ;
  • Taehong Gu (Department of Computer Science, Hanyang University) ;
  • Taesoo Kwon (Department of Computer Science, Hanyang University)
  • Received : 2024.04.08
  • Accepted : 2024.06.17
  • Published : 2024.09.01

Abstract

In the fields of computer animation and robotics, reaching a destination while avoiding obstacles has long been a difficult task, and generating appropriate motions while planning a route is more challenging still. Recently, researchers have been actively studying how to generate character motions by adapting the VAE (Variational Auto-Encoder), a data-driven generative model. Building on this, this study trains a control policy over the latent space of the MVAE model using reinforcement learning [1]. With the learned policy, the character can reach its destination with natural motions while avoiding both static and dynamic obstacles. The character robustly avoids obstacles moving in random directions, and experiments show that, compared to an existing approach, performance improves and training time is greatly reduced.

Acknowledgement

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (ETRI) in 2021 (No. 2021-0-00320, Development of XR generation and transformation/augmentation technologies for real spaces).

References

  1. Hung Yu Ling, Fabio Zinno, George Cheng, and Michiel van de Panne. 2020. Character Controllers Using Motion VAEs. ACM Trans. Graph. 39, 4, Article 40 (July 2020), 12 pages. https://doi.org/10.1145/3386569.3392422 
  2. Xue Bin Peng, Glen Berseth, KangKang Yin, and Michiel van de Panne. 2017. DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning. ACM Trans. Graph. 36, 4, Article 41 (July 2017), 16 pages. https://doi.org/10.1145/3072959.3073602
  3. Jaedong Lee, Jungdam Won, and Jehee Lee. 2018. Crowd Simulation by Deep Reinforcement Learning. In Proceedings of Motion, Interaction and Games (MIG '18), Limassol, Cyprus, November 8-10, 2018, 7 pages. https://doi.org/10.1145/3230744.3230782
  4. V. Lumelsky and T. Skewis. 1990. Incorporating range sensing in the robot navigation function. IEEE Transactions on Systems, Man, and Cybernetics 20 (1990), 1058-1068.
  5. V. Lumelsky and A. Stepanov. 1990. Path-planning strategies for a point mobile automaton moving amidst unknown obstacles of arbitrary shape. In Autonomous Robot Vehicles, I. J. Cox and G. T. Wilfong (Eds.). Springer, New York.
  6. O. Khatib. 1986. Real-time obstacle avoidance for manipulators and mobile robots. International Journal of Robotics Research 5, 1 (1986), 90-98.
  7. J. Borenstein and Y. Koren. 1991. The vector field histogram - fast obstacle avoidance for mobile robots. IEEE Transactions on Robotics and Automation 7, 3 (1991), 278-288.
  8. Volodymyr Mnih, Koray Kavukcuoglu, et al. 2015. Human-level control through deep reinforcement learning. Nature 518 (2015), 529-533.
  9. David Silver, Aja Huang, et al. 2016. Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature 529, 7587 (2016), 484-489.
  10. Yongjoon Lee, Kevin Wampler, Gilbert Bernstein, Jovan Popović, and Zoran Popović. 2010. Motion fields for interactive character locomotion. In ACM SIGGRAPH Asia 2010 Papers, 1-8.
  11. Sergey Levine, Jack M. Wang, Alexis Haraux, Zoran Popović, and Vladlen Koltun. 2012. Continuous character control with low-dimensional embeddings. ACM Transactions on Graphics (TOG) 31, 4 (2012), 1-10.
  12. Stelian Coros, Philippe Beaudoin, and Michiel van de Panne. 2009. Robust task-based control policies for physics-based characters. In ACM SIGGRAPH Asia 2009 Papers, 1-9.
  13. Xue Bin Peng, Glen Berseth, and Michiel van de Panne. 2015. Dynamic terrain traversal skills using reinforcement learning. ACM Transactions on Graphics (TOG) 34, 4 (2015), 1-11.
  14. Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. arXiv preprint arXiv:1606.01540 (2016).
  15. Yan Duan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. 2016. Benchmarking deep reinforcement learning for continuous control. In International Conference on Machine Learning. PMLR, 1329-1338.
  16. Seunghwan Lee, Moonseok Park, Kyoungmin Lee, and Jehee Lee. 2019. Scalable muscle-actuated human simulation and control. ACM Transactions on Graphics (TOG) 38, 4 (2019), 1-13.
  17. Ilya Kostrikov. 2018. PyTorch Implementations of Reinforcement Learning Algorithms. https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail. 
  18. Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel van de Panne. 2018. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills. ACM Trans. Graph. 37, 4, Article 143 (August 2018), 18 pages. https://doi.org/10.1145/3197517.3201311
  19. He Zhang, Sebastian Starke, Taku Komura, and Jun Saito. 2018. Mode Adaptive Neural Networks for Quadruped Motion Control. ACM Trans. Graph. 37, 4, Article 145 (August 2018), 11 pages. https://doi.org/10.1145/3197517.3201366 
  20. Taesoo Kwon, Taehong Gu, Jaewon Ahn, and Yoonsang Lee. 2023. Adaptive Tracking of a Single-Rigid-Body Character in Various Environments. In SIGGRAPH Asia 2023 Conference Papers (SA Conference Papers '23), December 12-15, 2023, Sydney, NSW, Australia. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3610548.3618187