http://dx.doi.org/10.7746/jkros.2020.15.4.404

Cooperative Robot for Table Balancing Using Q-learning  

Kim, Yewon (Mechanical Engineering, Kyungpook National University)
Kang, Bo-Yeong (Mechanical Engineering, Kyungpook National University)
Publication Information
The Journal of Korea Robotics Society / v.15, no.4, 2020, pp. 404-412
Abstract
Everyday tasks such as moving tables and beds typically involve at least two people, and the balance of the object changes with each person's action. However, many previous studies performed such tasks with robots alone, without accounting for human cooperation. In this paper, we therefore propose a cooperative table-balancing robot based on Q-learning that enables a human and a robot to work together. The proposed robot recognizes the human's action from camera images of the table's state and performs the corresponding table-balancing action without high-performance equipment. Human actions are classified with a deep learning model, AlexNet, which achieves 96.9% accuracy under 10-fold cross-validation. The Q-learning experiment was carried out over 2,000 episodes with 200 trials, and the results show that the Q function converged stably within this number of episodes; this stable convergence determined the Q-learning policies for the robot's actions. A video of the robot cooperating with a human on the table-balancing task using the proposed Q-learning is available at http://ibot.knu.ac.kr/videocooperation.html.
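As a concrete illustration of the learning scheme described in the abstract, the following Python sketch shows a minimal tabular Q-learning loop. The discretized table-tilt states, the robot actions, the reward logic, and the assumption of 200 trials per episode are hypothetical placeholders, not the authors' implementation; in the paper, the state comes from AlexNet-based recognition of the human's action and the camera image of the table, not from the toy step() function below.

import random
from collections import defaultdict

# Hypothetical discretization of the table state and of the robot's
# balancing actions; the paper derives the state from camera images.
STATES = ["balanced", "tilt_up", "tilt_down"]
ACTIONS = ["hold", "raise_arms", "lower_arms"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate
Q = defaultdict(float)                  # Q[(state, action)] -> estimated return

def choose_action(state):
    # Epsilon-greedy selection over the Q-table.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def step(state, action):
    # Stand-in environment: reward the action that restores balance,
    # penalize anything else. Real feedback comes from the table image.
    if (state, action) in {("tilt_up", "lower_arms"),
                           ("tilt_down", "raise_arms"),
                           ("balanced", "hold")}:
        return "balanced", 1.0
    return random.choice(["tilt_up", "tilt_down"]), -1.0

for episode in range(2000):             # 2,000 episodes, as in the abstract
    state = random.choice(STATES)
    for _ in range(200):                # assumed: 200 trials per episode
        action = choose_action(state)
        next_state, reward = step(state, action)
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

With a positive reward for restoring balance, the greedy policy over the learned Q-table settles on the correct balancing action for each tilt state, which is the kind of stable convergence of the Q function the abstract reports.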
Keywords
Cooperative Robot; Reinforcement Learning; Q-learning; Image Processing; Classification; AI (Artificial Intelligence); NAO Robot;
References
1 O. Asik, B. Gorer, and H. L. Akin, "End-to-End Deep Imitation Learning: Robot Soccer Case Study," arXiv preprint arXiv:1807.09205, 2018, [Online], https://arxiv.org/abs/1807.09205.
2 A. Thobbi, Y. Gu, and W. Sheng, "Using human motion estimation for human-robot cooperative manipulation," 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, pp. 2873-2878, 2011, DOI: 10.1109/IROS.2011.6094904.
3 Y. Gu, A. Thobbi, and W. Sheng, "Human-robot collaborative manipulation through imitation and reinforcement learning," 2011 IEEE International Conference on Information and Automation, Shenzhen, China, pp. 151-156, 2011, DOI: 10.1109/ICINFA.2011.5948979.
4 Vicon, [Online], https://www.vicon.com, Accessed: Sep. 3, 2020.
5 SoftBank Robotics, [Online], https://www.softbankrobotics.com, Accessed: Sep. 3, 2020.
6 D. M. Powers, "Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation," Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37-63, Dec., 2011, [Online], http://hdl.handle.net/2328/27165.
7 K. Lobos-Tsunekawa, F. Leiva, and J. Ruiz-Del-Solar, "Visual navigation for biped humanoid robots using deep reinforcement learning," IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3247-3254, Oct., 2018, DOI: 10.1109/LRA.2018.2851148.
8 S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, "Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection," The International Journal of Robotics Research, vol. 37, no. 4-5, pp. 421-436, 2018, DOI: 10.1177/0278364917710318.
9 P.-C. Yang, K. Sasaki, K. Suzuki, K. Kase, S. Sugano, and T. Ogata, "Repeatable Folding Task by Humanoid Robot Worker Using Deep Learning," IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 397-403, Apr., 2017, DOI: 10.1109/LRA.2016.2633383.
10 C. Wang, K. V. Hindriks, and R. Babuska, "Active learning of affordances for robot use of household objects," 2014 IEEE-RAS International Conference on Humanoid Robots, Madrid, Spain, pp. 566-572, 2014, DOI: 10.1109/HUMANOIDS.2014.7041419.
11 X. B. Peng, G. Berseth, and M. Van De Panne, "Terrain-adaptive locomotion skills using deep reinforcement learning," ACM Transactions on Graphics, vol. 35, no. 4, pp. 1-12, 2016, DOI: 10.1145/2897824.2925881.
12 H. B. Suay and S. Chernova, "Effect of human guidance and state space size on Interactive Reinforcement Learning," 2011 IEEE International Symposium on Robot and Human Interactive Communication, Atlanta, GA, USA, pp. 1-6, 2011, DOI: 10.1109/ROMAN.2011.6005223.
13 A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems 25 (NIPS 2012), pp. 1097-1105, 2012, [Online], http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.
14 J. Merel, Y. Tassa, D. TB, S. Srinivasan, J. Lemmon, Z. Wang, G. Wayne, and N. Heess, "Learning human behaviors from motion capture by adversarial imitation," arXiv preprint arXiv:1707.02201, 2017, [Online], https://arxiv.org/abs/1707.02201.
15 V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with Deep Reinforcement Learning," arXiv preprint arXiv:1312.5602, Dec., 2013, [Online], https://arxiv.org/abs/1312.5602.
16 T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015, [Online], https://arxiv.org/abs/1509.02971.
17 J. Schulman, S. Levine, P. Abbeel, M. Jordan, and P. Moritz, "Trust Region Policy Optimization," 32nd International Conference on Machine Learning, pp. 1889-1897, 2015, [Online], http://proceedings.mlr.press/v37/schulman15.html.
18 J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, "Proximal Policy Optimization Algorithms," arXiv preprint arXiv: 1707.06347, 2017, [Online], https://arxiv.org/abs/1707.06347.
19 S. Wen, X. Chen, C. Ma, H. K. Lam, and S. Hua, "The Q-learning obstacle avoidance algorithm based on EKF-SLAM for NAO autonomous walking under unknown environments," Robotics and Autonomous Systems, vol. 72, pp. 29-36, Oct., 2015, DOI: 10.1016/j.robot.2015.04.003.
20 What's new, Atlas?, [Online], https://www.youtube.com/watch?v=fRj34o4hN4I, Accessed: Mar. 20, 2020.
21 M. Danel, "Reinforcement learning for humanoid robot control," POSTER 2017, Prague, Czech Republic, 2017, [Online], http://poseidon2.feld.cvut.cz/conf/poster/proceedings/Poster_2017/Section_IC/IC_021_Danel.pdf.
22 F. Stulp, J. Buchli, E. Theodorou, and S. Schaal, "Reinforcement learning of full-body humanoid motor skills," 2010 10th IEEE-RAS International Conference on Humanoid Robots, Nashville, TN, USA, pp. 405-410, 2010, DOI: 10.1109/ICHR.2010.5686320.
23 D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7587, pp. 484-489, 2016, DOI: 10.1038/nature16961.
24 K. Mulling, J. Kober, O. Kroemer, and J. Peters, "Learning to select and generalize striking movements in robot table tennis," The International Journal of Robotics Research, vol. 32, no. 3, pp. 263-279, 2013, DOI: 10.1177/0278364912472380.
25 S. Debnath and J. Nassour, "Extending cortical-basal inspired reinforcement learning model with success-failure experience," 4th IEEE International Conference on Development and Learning and on Epigenetic Robotics, Genoa, Italy, pp. 293-298, 2014, DOI: 10.1109/DEVLRN.2014.6982996.