Development of an Actor-Critic Deep Reinforcement Learning Platform for Robotic Grasping in Real World

Kim, Taewon;Park, Yeseong;Kim, Jong Bok;Park, Youngbin;Suh, Il Hong;

doi:10.7746/jkros.2020.15.2.197

The Journal of Korea Robotics Society (로봇학회논문지)

Volume 15, Issue 2 / Pages 197-204 / 2020 / 1975-6291 (pISSN) / 2287-3961 (eISSN)

Korea Robotics Society (한국로봇학회)

현실 세계에서의 로봇 파지 작업을 위한 정책/가치 심층 강화학습 플랫폼 개발

Kim, Taewon (Department of Electronics and Computer Engineering, Hanyang University) ;
Park, Yeseong (Department of Electronics and Computer Engineering, Hanyang University) ;
Kim, Jong Bok (Department of Electronics and Computer Engineering, Hanyang University) ;
Park, Youngbin (Department of Electronics and Computer Engineering, Hanyang University) ;
Suh, Il Hong (Department of Electronics and Computer Engineering, Hanyang University)

Received : 2020.03.04
Accepted : 2020.04.21
Published : 2020.05.31

Abstract

In this paper, we present a learning platform for robotic grasping in the real world, in which actor-critic deep reinforcement learning is employed to learn the grasping skill directly from raw image pixels and sparse (rarely observed) rewards. This is a challenging task because existing deep reinforcement learning algorithms require an extensive amount of training data or massive computation, which makes them unaffordable in real-world settings. To address these problems, the proposed learning platform consists of two training phases: a learning phase in a simulator and a subsequent learning phase in the real world. The main processing blocks in the platform are the extraction of a latent vector based on state representation learning and disentanglement of a raw image, the generation of adapted synthetic images using generative adversarial networks, and object detection and arm segmentation for the disentanglement. We demonstrate the effectiveness of this approach in a real environment.
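
The two-phase schedule described above (train in a simulator, then continue training in the real world with the same actor and critic) can be illustrated with a deliberately small toy. The sketch below is not the platform from the paper: it uses a tabular one-step actor-critic on a hypothetical five-position grasp task instead of deep networks on images, and it omits state representation learning, the generative adversarial network, and object detection entirely. Every name and constant in it (`N_POS`, `TARGET`, `step`, `train`, the 0.8 grasp success probability standing in for the real world) is an assumption made for illustration.

```python
import math
import random

N_POS = 5          # hypothetical gripper positions along one axis
TARGET = 2         # hypothetical position of the object
ACTIONS = ("left", "right", "grasp")


def step(pos, action, grasp_success_prob, rng):
    """Toy environment: reward is 1 only for a successful grasp at TARGET."""
    if action == "grasp":
        success = pos == TARGET and rng.random() < grasp_success_prob
        return pos, (1.0 if success else 0.0), True  # a grasp attempt ends the episode
    delta = -1 if action == "left" else 1
    new_pos = min(max(pos + delta, 0), N_POS - 1)
    return new_pos, 0.0, False


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(actor, critic, episodes, grasp_success_prob, rng,
          alpha_actor=0.2, alpha_critic=0.2, gamma=0.95, max_steps=20):
    """One-step actor-critic; updates actor and critic in place."""
    returns = []
    for _ in range(episodes):
        pos, total = rng.randrange(N_POS), 0.0
        for _ in range(max_steps):
            probs = softmax(actor[pos])
            a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            new_pos, reward, done = step(pos, ACTIONS[a], grasp_success_prob, rng)
            # TD target bootstraps from the critic unless the episode ended.
            target = reward if done else reward + gamma * critic[new_pos]
            td_error = target - critic[pos]
            critic[pos] += alpha_critic * td_error
            # Policy-gradient step for a softmax policy, weighted by the TD error.
            for i in range(len(ACTIONS)):
                grad = (1.0 if i == a else 0.0) - probs[i]
                actor[pos][i] += alpha_actor * td_error * grad
            total += reward
            pos = new_pos
            if done:
                break
        returns.append(total)
    return returns


rng = random.Random(0)
actor = [[0.0] * len(ACTIONS) for _ in range(N_POS)]   # action preferences per state
critic = [0.0] * N_POS                                 # state-value estimates

# Phase 1: many cheap episodes in the "simulator" (grasps never slip here).
sim_returns = train(actor, critic, episodes=500, grasp_success_prob=1.0, rng=rng)
# Phase 2: a few costly "real-world" episodes, continuing from the same parameters.
real_returns = train(actor, critic, episodes=50, grasp_success_prob=0.8, rng=rng)
```

The only point the toy is meant to convey is the division of labor: most of the sparse-reward exploration happens where episodes are cheap, and the second phase reuses the same actor and critic rather than starting from scratch.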
