Evaluation of Human Demonstration Augmented Deep Reinforcement Learning Policies via Object Manipulation with an Anthropomorphic Robot Hand |
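Park, Na Hyeon
(Department of Electronics and Information Convergence Engineering, Kyung Hee University)
Oh, Ji Heon (Department of Electronics and Information Convergence Engineering, Kyung Hee University) Ryu, Ga Hyun (Department of Electronics and Information Convergence Engineering, Kyung Hee University) Lopez, Patricio Rivera (Department of Electronics and Information Convergence Engineering, Kyung Hee University) Anazco, Edwin Valarezo (Department of Electronics and Information Convergence Engineering, Kyung Hee University) Kim, Tae Seong (Departments of Biomedical Engineering and of Electronics and Information Convergence Engineering, Kyung Hee University) |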
1 | Zhou, Jianshu, et al., "A soft-robotic approach to anthropomorphic robotic hand dexterity," IEEE Access, Vol.7, pp.101483-101495, 2019. |
2 | C. Piazza, et al., "The SoftHand Pro-H: a hybrid body-controlled, electrically powered hand prosthesis for daily living and working," IEEE Robotics & Automation Magazine, Vol.24, No.4, pp.87-101, 2017. |
3 | A. Gupta, C. Eppner, S. Levine, and P. Abbeel, "Learning dexterous manipulation for a soft robotic hand from human demonstrations," 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, pp.3786-3793, 2016. |
4 | A. Firouzeh and J. Paik, "Grasp mode and compliance control of an underactuated origami gripper using adjustable stiffness joints," IEEE/ASME Transactions on Mechatronics, Vol.22, No.5, pp.2165-2173, 2017. doi: 10.1109/TMECH.2017.2732827. |
5 | Billard, Aude, and Danica Kragic, "Trends and challenges in robot manipulation," Science, 364.6446, 2019. |
6 | Kontoudis GP, Liarokapis MV, Zisimatos AG, Mavrogiannis CI, Kyriakopoulos KJ. "Open-source, anthropomorphic, underactuated robot hands with a selectively lockable differential mechanism: towards affordable prostheses," In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.5857-5862. New York: IEEE. 2015. |
7 | Vecerik, Mel, et al., "Leveraging demonstrations for deep reinforcement learning on robotics problems with sparse rewards," arXiv preprint arXiv:1707.08817. 2017. |
8 | A. Kargov, et al., "Development of an anthropomorphic hand for a mobile assistive robot," In IEEE 9th International Conference on Rehabilitation Robotics, New York, 2005, pp.182-186. |
9 | Jacobsen S, Iversen E, Knutti D, Johnson R, Biggers K. 1986. "Design of the Utah/M.I.T. dextrous hand." In 1986 IEEE International Conference on Robotics and Automation, Vol.3, pp.1520-1532, New York: IEEE. |
10 | Shadow Robot Co. 2018. Shadow Dexterous Hand. Shadow Robot Company. https://www.shadowrobot.com/products/dexterous-hand |
11 | Chao, Ya, Xingchen Chen, and Nanfeng Xiao, "Deep learning-based grasp-detection method for a five-fingered industrial robot hand," IET Computer Vision, Vol.13, No.1, pp.61-70, 2018. |
12 | N. Kohl and P. Stone, "Policy gradient reinforcement learning for fast quadrupedal locomotion," IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004, New Orleans, LA, USA, 2004, pp. 2619-2624 Vol.3, doi: 10.1109/ROBOT.2004.1307456. |
13 | A.Y. Ng, et al., "Autonomous Inverted Helicopter Flight via Reinforcement Learning," In: Ang M.H., Khatib O. (eds) Experimental Robotics IX. Springer Tracts in Advanced Robotics, Vol.21. Springer, Berlin, Heidelberg. 2006. https://doi.org/10.1007/11552246_35. |
14 | V. Mnih, et al., "Human-level control through deep reinforcement learning," Nature, Vol.518, No.7540, pp.529-533, 2015. https://doi.org/10.1038/nature14236. |
15 | N. Correll, et al., "Analysis and observations from the first Amazon Picking Challenge," IEEE Transactions on Automation Science and Engineering, Vol.15, No.1, pp.172-188, 2018. |
16 | A. Nair, B. McGrew, M. Andrychowicz, W. Zaremba, and P. Abbeel, "Overcoming Exploration in Reinforcement Learning with Demonstrations," 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, 2018, pp.6292-6299, doi: 10.1109/ICRA.2018.8463162. |
17 | D. Silver, et al., "Mastering the game of Go with deep neural networks and tree search," Nature, Vol.529, No.7587, pp.484-489, 2016. https://doi.org/10.1038/nature16961 |
18 | E. Valarezo Anazco, et al., "Natural object manipulation using anthropomorphic robotic hand through deep reinforcement learning and deep grasping probability network," Applied Intelligence, Vol.51, No.2, pp.1041-1055, 2021. https://doi.org/10.1007/s10489-020-01870-6 |
19 | Edwin Valarezo Anazco, Patricio Rivera Lopez, Hyemin Park, Nahyeon Park, Jiheon Oh, Sangmin Lee, Kyungmin Byun, and Tae-Seong Kim. "Human-like Object Grasping and Relocation for an Anthropomorphic Robotic Hand with Natural Hand Pose Priors in Deep Reinforcement Learning," In Proceedings of the 2019 2nd International Conference on Robot Systems and Applications (ICRSA 2019). Association for Computing Machinery, New York, NY, USA, 46-50. https://doi.org/10.1145/3378891.3378900 |
20 | Gao, Yang, et al., "Reinforcement learning from imperfect demonstrations," arXiv preprint arXiv:1802.05313. 2018. |
21 | Hester, Todd, et al., "Learning from demonstrations for real world reinforcement learning," 2017. |
22 | Osa, Takayuki, Jan Peters, and Gerhard Neumann, "Hierarchical reinforcement learning of multiple grasping strategies with human instructions," Advanced Robotics, Vol.32, No.18, pp.955-968, 2018. |
23 | Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller, and David Silver, "Emergence of Locomotion Behaviours in Rich Environments," 2017. CoRR abs/1707.02286. arXiv:1707.02286 |
24 | J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal Policy Optimization Algorithms," arXiv:1707.06347v2 [cs.LG]. |
25 | Leap Motion [Internet], https://www.ultraleap.com/ |
26 | E. Todorov, T. Erez, and Y. Tassa, "MuJoCo: A physics engine for model-based control," 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, 2012, pp.5026-5033. |
27 | V. Kumar, Z. Xu, and E. Todorov, "Fast, strong and compliant pneumatic actuation for dexterous tendon-driven hands," 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, pp.1512-1519, 2013. |
28 | Ahmed Hussein, Mohamed Medhat Gaber, Eyad Elyan, and Chrisina Jayne, "Imitation Learning: A Survey of Learning Methods," ACM Computing Surveys, Vol.50, No.2, Article 21, pp.1-35, 2017. |
29 | S. Kakade, "Natural Policy Gradient," Neural Information Processing systems (NIPS), 14:1531-1538. 2001. |
30 | J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz. "Trust Region Policy Optimization," Proceedings of the 32nd International Conference on Machine Learning, PMLR 2015, 37: 1889-1897. |
31 | S. Gu, E. Holly, T. Lillicrap, S. Levine, "Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates," 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, pp.3389-3396, 2017. |
32 | A. Rajeswaran, V. Kumar, A. Gupta, G. Vezzani, J. Schulman, E. Todorov, and S. Levine. "Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations," arXiv:1709.10087v2 [cs.LG]. 2018. |
33 | C. Piazza, G. Grioli, M. G. Catalano, and A. Bicchi, "A century of robotic hands," Annual Review of Control, Robotics, and Autonomous Systems, Vol.2, pp.1-32, 2019. |