Learning Optimal Trajectory Generation for Low-Cost Redundant Manipulator using Deep Deterministic Policy Gradient (DDPG)


  • Received : 2021.12.16
  • Accepted : 2022.02.16
  • Published : 2022.02.28

Abstract

In this paper, we propose an approach that resolves the workspace inaccuracy of low-cost redundant manipulators built with low-cost encoders and low-stiffness links. When manipulators are manufactured with low-cost encoders and links, the robots can run into workspace inaccuracy issues. Furthermore, trajectory generation based on conventional forward/inverse kinematics that does not take these inaccuracies into account introduces the risk of end-effector fluctuations. Hence, we propose an optimized trajectory generation method, based on the DDPG (Deep Deterministic Policy Gradient) algorithm, for low-cost redundant manipulators reaching a target position in Euclidean space. We design the DDPG algorithm to minimize the distance to the target along with the Jacobian condition number. For training, joint-space errors are randomly generated at a chosen error rate in a simulator that implements real-world physics; for testing, experiments on a real robot demonstrate our approach.
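
To make the optimization target concrete, the sketch below illustrates one way a reward combining the two quantities named in the abstract could look. It is a minimal illustration, not the authors' implementation: `forward_kinematics`, `jacobian`, the weights `w_dist` and `w_cond`, and the encoder-noise model in `noisy_joint_state` are hypothetical stand-ins, since the paper's actual robot model and constants are not given here.

```python
import numpy as np

def condition_number(J: np.ndarray) -> float:
    """Ratio of largest to smallest singular value of the Jacobian;
    large values indicate a near-singular, ill-conditioned pose."""
    sigma = np.linalg.svd(J, compute_uv=False)      # singular values, descending
    return float(sigma[0] / max(sigma[-1], 1e-9))   # guard against exact singularity

def reward(q, target, forward_kinematics, jacobian,
           w_dist: float = 1.0, w_cond: float = 0.01) -> float:
    """Negative weighted sum of distance-to-target and Jacobian conditioning,
    so maximizing the return both reaches the goal and avoids singularities.
    The weights are assumed values, not taken from the paper."""
    dist = np.linalg.norm(forward_kinematics(q) - target)  # Euclidean distance term
    cond = condition_number(jacobian(q))                   # conditioning term
    return -(w_dist * dist + w_cond * cond)

def noisy_joint_state(q: np.ndarray, error_rate: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Training-time perturbation mimicking low-cost encoder error:
    each joint reading is offset by up to +/- error_rate of its value."""
    return q * (1.0 + rng.uniform(-error_rate, error_rate, size=q.shape))
```

Keeping the condition-number weight small (here 0.01) leaves the distance objective dominant; the actual trade-off would have to be tuned against the paper's reported setup.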

Acknowledgement

This work was supported by BK21FOUR, Creative Human Resource Education and Research Programs for ICT Convergence in the 4th Industrial Revolution, and by a Pusan National University Research Grant, 2020.

References

  1. M. M. Fateh and H. Farhangfard, "On the Transforming of Control Space by Manipulator Jacobian," Institute of Control, Robotics and Systems, vol. 6, no. 1, pp. 101-108, 2008, [Online], https://www.koreascience.or.kr/article/JAKO200809906440883.pdf.
  2. D. L. Pieper, "The Kinematics of Manipulators Under Computer Control," Ph.D. thesis, Stanford University, 1968, [Online], https://www.proquest.com.
  3. M. A. Ali, H. A. Park, and C. S. G. Lee, "Closed-form inverse kinematic joint solution for humanoid robots," 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 2010, DOI: 10.1109/IROS.2010.5649842.
  4. D. Jang and S. Yoo, "Integrated System of Mobile Manipulator with Speech Recognition and Deep Learning-based Object Detection," The Journal of Korea Robotics Society, vol. 16, no. 3, pp. 270-275, Sept., 2021, DOI: 10.7746/jkros.2021.16.3.270.
  5. D. I. Park, C. H. Park, D. H. Kim, and J. H. Kyung, "Analysis and Design of the Dual Arm Manipulator for Rescue Robot," The Journal of Korea Robotics Society, vol. 11, no. 4, pp. 235-241, Dec., 2016, DOI: 10.7746/jkros.2016.11.4.235.
  6. Z. Bingul, H. M. Ertunc, and C. Oysu, "Applying Neural Network to Inverse Kinematic Problem for 6R Robot Manipulator with Offset Wrist," Adaptive and Natural Computing Algorithms, pp. 112-115, 2005, DOI: 10.1007/3-211-27389-1_27.
  7. Y.-H. Kim, H. Kang, and H.-T. Jeon, "Planning a minimum time path for robot manipulator using Genetic Algorithm," The Institute of Electronics and Information Engineers, pp. 698-702, 1992, [Online], https://www.koreascience.or.kr/article/CFKO199211919700824.pdf.
  8. E. Prianto, M. S. Kim, J.-H. Park, J.-H. Bae, and J.-S. Kim, "Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay," Sensors, vol. 20, no. 20, 2020, DOI: 10.3390/s20205911.
  9. J.-Y. Moon, J.-H. Moon, and S.-H. Bae, "Control for Manipulator of an Underwater Robot Using Meta Reinforcement Learning," The Journal of the Korea Institute of Electronic Communication Sciences, vol. 16, no. 1, pp. 95-100, 2021, DOI: 10.13067/JKIECS.2021.16.1.95.
  10. A. Iriondo, E. Lazkano, L. Susperregi, J. Urain, A. Fernandez, and J. Molina, "Pick and Place Operations in Logistics Using a Mobile Manipulator Controlled with Deep Reinforcement Learning," Applied Sciences, vol. 9, no. 2, 2019, DOI: 10.3390/app9020348.
  11. M. Duguleana, F. G. Barbuceanu, A. Teirelbar, and G. Mogan, "Obstacle avoidance of redundant manipulators using neural networks based reinforcement learning," Robotics and Computer-Integrated Manufacturing, vol. 28, no. 2, pp. 132-146, Apr., 2012, DOI: 10.1016/j.rcim.2011.07.004.
  12. J. D. Norwood, "A neural network approach to the redundant robot inverse kinematic problem in the presence of obstacles," Ph.D. thesis, Rice University, 1991, [Online], https://www.proquest.com.
  13. T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv:1509.02971 [cs.LG], 2015, [Online], https://arxiv.org/abs/1509.02971.
  14. J. Lee, K. Kim, Y. Kim, and J. Lee, "Singularity Avoidance Path Planning on Cooperative Task of Dual Manipulator Using DDPG Algorithm," The Journal of Korea Robotics Society, vol. 16, no. 2, pp. 137-146, Jun., 2021, DOI: 10.7746/jkros.2021.16.2.137.
  15. J. P. Merlet, "Jacobian, Manipulability, Condition Number, and Accuracy of Parallel Robots," Journal of Mechanical Design, vol. 128, no. 1, pp. 199-206, 2005, DOI: 10.1115/1.2121740.
  16. M. Tokic, "Adaptive ε-Greedy Exploration in Reinforcement Learning Based on Value Differences," Annual Conference on Artificial Intelligence, pp. 203-210, 2010, DOI: 10.1007/978-3-642-16111-7_23.
  17. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with Deep Reinforcement Learning," arXiv:1312.5602 [cs.LG], 2013, [Online], https://arxiv.org/abs/1312.5602.
  18. D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, "Deterministic Policy Gradient Algorithms," 31st International Conference on Machine Learning, vol. 32, no. 1, pp. 387-395, 2014, [Online], http://proceedings.mlr.press/v32/silver14.html.
  19. C. C. White III and D. J. White, "Markov decision processes," European Journal of Operational Research, vol. 39, no. 1, pp. 1-16, 1989, DOI: 10.1016/0377-2217(89)90348-2.
  20. G. Farnebäck, "Two-frame motion estimation based on polynomial expansion," Scandinavian Conference on Image Analysis, pp. 363-370, 2003, DOI: 10.1007/3-540-45103-X_50.
  21. F. Dellaert, D. Fox, W. Burgard, and S. Thrun, "Monte Carlo localization for mobile robots," 1999 IEEE International Conference on Robotics and Automation, vol. 2, pp. 1322-1328, 1999, DOI: 10.1109/ROBOT.1999.772544.
  22. A. Poroykov, P. Kalugin, S. Shitov, and I. Lapitskaya, "Modeling ArUco Markers Images for Accuracy Analysis of Their 3D Pose Estimation," 30th International Conference on Computer Graphics and Machine Vision, vol. 2, 2020, DOI: 10.51130/graphicon-2020-2-4-14.