[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7746/jkros.2020.15.2.197

Development of an Actor-Critic Deep Reinforcement Learning Platform for Robotic Grasping in Real World

Kim, Taewon (Department of Electronics and Computer Engineering, Hanyang University)
Park, Yeseong (Department of Electronics and Computer Engineering, Hanyang University)
Kim, Jong Bok (Department of Electronics and Computer Engineering, Hanyang University)
Park, Youngbin (Department of Electronics and Computer Engineering, Hanyang University)
Suh, Il Hong (Department of Electronics and Computer Engineering, Hanyang University)

Publication Information

The Journal of Korea Robotics Society / v.15, no.2, 2020, pp. 197-204

Abstract

In this paper, we present a learning platform for robotic grasping in the real world, in which actor-critic deep reinforcement learning is employed to learn the grasping skill directly from raw image pixels and rarely observed (sparse) rewards. This is a challenging task because existing deep reinforcement learning algorithms require a very large amount of training data or massive computation, which makes them impractical in real-world settings. To address these problems, the proposed platform consists of two training phases: a learning phase in a simulator and subsequent learning in the real world. The main processing blocks of the platform are extraction of a latent vector based on state representation learning and disentanglement of a raw image, generation of adapted synthetic images using generative adversarial networks, and object detection and arm segmentation for the disentanglement. We demonstrate the effectiveness of this approach in a real environment.
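The two-phase schedule described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the environment (`ToyGraspEnv`), the agent (`ActorCritic`), the `train` helper, the Gaussian policy, and all hyperparameters below are hypothetical. A scalar stands in for the latent vector, a `bias` term stands in for the simulator-to-real gap, and the reward is sparse (1 on a successful grasp, 0 otherwise). The only point shown is that the same actor and critic parameters are trained first in the simulator and then further in the "real" environment.

```python
import random


class ToyGraspEnv:
    """Hypothetical one-step grasping task with a sparse reward."""

    def __init__(self, bias, tolerance=0.2, seed=0):
        self.bias = bias            # assumed stand-in for the sim-to-real gap
        self.tolerance = tolerance  # how close the action must be to succeed
        self.rng = random.Random(seed)
        self.offset = 0.0

    def reset(self):
        # The scalar state plays the role of the latent vector.
        self.offset = self.rng.uniform(-1.0, 1.0)
        return self.offset

    def step(self, action):
        target = self.offset + self.bias
        return 1.0 if abs(target - action) < self.tolerance else 0.0


class ActorCritic:
    """Linear Gaussian actor and linear critic (illustrative assumption)."""

    def __init__(self, sigma=0.3, actor_lr=0.01, critic_lr=0.1, seed=0):
        self.actor = [0.0, 0.0]   # policy mean = actor[0] * state + actor[1]
        self.critic = [0.0, 0.0]  # value = critic[0] * state + critic[1]
        self.sigma = sigma
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.rng = random.Random(seed)

    def act(self, state):
        mean = self.actor[0] * state + self.actor[1]
        return self.rng.gauss(mean, self.sigma)

    def update(self, state, action, reward):
        features = (state, 1.0)
        value = self.critic[0] * state + self.critic[1]
        # Episodes last one step, so the advantage needs no bootstrapping.
        advantage = reward - value
        mean = self.actor[0] * state + self.actor[1]
        score = (action - mean) / self.sigma ** 2  # d log pi / d mean
        for i, feature in enumerate(features):
            self.actor[i] += self.actor_lr * advantage * score * feature
            self.critic[i] += self.critic_lr * advantage * feature


def train(agent, env, episodes):
    """Run one training phase and return the observed success rate."""
    successes = 0.0
    for _ in range(episodes):
        state = env.reset()
        action = agent.act(state)
        reward = env.step(action)
        agent.update(state, action, reward)
        successes += reward
    return successes / episodes


# Phase 1: many cheap episodes in the simulator.
agent = ActorCritic()
sim_rate = train(agent, ToyGraspEnv(bias=0.0, seed=1), episodes=5000)
# Phase 2: fewer episodes in the shifted "real" environment, same parameters.
real_rate = train(agent, ToyGraspEnv(bias=0.1, seed=2), episodes=500)
print(f"simulator success rate: {sim_rate:.2f}")
print(f"real-world success rate: {real_rate:.2f}")
```

The printed rates depend on the seeds and hyperparameters and say nothing about the results reported in the paper. The sketch also omits the image-processing blocks the abstract lists (state representation learning, GAN-based image adaptation, object detection, and arm segmentation).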

Keywords

Actor-Critic Deep Reinforcement Learning; Robotic Grasping

