ROV Manipulation from Observation and Exploration using Deep Reinforcement Learning

  • Received: 2017.07.17
  • Reviewed: 2017.09.28
  • Published: 2017.09.30

Abstract

This paper presents dual-arm ROV manipulation using deep reinforcement learning. The purpose of this underwater manipulator is to investigate and excavate natural resources in the ocean, find lost aircraft black boxes, and perform other extremely dangerous tasks without endangering humans. This research emphasizes a self-learning approach based on Deep Reinforcement Learning (DRL). The DRL technique allows the ROV to learn a manipulation policy directly from raw image data. Our proposed architecture maps visual inputs (images) to control actions (outputs) and receives a reward after each action, which allows the agent to acquire manipulation skill through trial and error. We trained our network in simulation, with raw images and rewards provided directly by our simple Lua simulator, which models underwater dynamic environmental conditions for accuracy. The major goal of this research is to provide a smart, self-learning way to achieve manipulation in a highly dynamic underwater environment. The results show that a dual robotic arm trained for 3-DOF movement successfully achieved a target-reaching task in 2D space while accounting for real environmental factors.
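The trial-and-error loop described above (observe, act, receive a reward, update the policy) can be sketched as follows. The paper trains a deep network on raw simulator images; in this simplified illustration a tabular Q-function on a small 2D grid stands in for that network, and the reaching task is reduced to moving an end-effector cell toward a fixed target. All names, sizes, and hyperparameters here are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

GRID = 5                                       # 5x5 workspace (hypothetical)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # four planar moves
TARGET = (4, 4)                                # goal cell (hypothetical)

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Epsilon-greedy Q-learning on the toy reaching task."""
    rng = np.random.default_rng(seed)
    q = np.zeros((GRID, GRID, len(ACTIONS)))
    for _ in range(episodes):
        x, y = 0, 0                             # start pose each episode
        for _ in range(50):                     # step budget per episode
            if rng.random() < eps:              # explore
                a = int(rng.integers(len(ACTIONS)))
            else:                               # exploit learned policy
                a = int(np.argmax(q[x, y]))
            dx, dy = ACTIONS[a]
            nx = min(max(x + dx, 0), GRID - 1)  # clamp to workspace
            ny = min(max(y + dy, 0), GRID - 1)
            # reward after each action: +1 at target, small step penalty
            r = 1.0 if (nx, ny) == TARGET else -0.01
            # one-step temporal-difference update of the Q-table
            q[x, y, a] += alpha * (r + gamma * np.max(q[nx, ny]) - q[x, y, a])
            x, y = nx, ny
            if (x, y) == TARGET:
                break
    return q

def greedy_steps(q):
    """Roll out the greedy policy from the start; steps to target or None."""
    x, y = 0, 0
    for step in range(1, 51):
        dx, dy = ACTIONS[int(np.argmax(q[x, y]))]
        x = min(max(x + dx, 0), GRID - 1)
        y = min(max(y + dy, 0), GRID - 1)
        if (x, y) == TARGET:
            return step
    return None
```

In the paper's setting, the Q-table would be replaced by a convolutional network consuming the simulator's raw images, and the discrete grid moves by the manipulator's joint actions; the learning loop itself is structurally the same.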

Keywords
