http://dx.doi.org/10.7746/jkros.2022.17.2.221

Gain Tuning for SMCSPO of Robot Arm with Q-Learning  

Lee, JinHyeok (School of Mechanical Engineering, Pusan National University)
Kim, JaeHyung (School of Mechanical Engineering, Pusan National University)
Lee, MinCheol (School of Mechanical Engineering, Pusan National University)
Publication Information
The Journal of Korea Robotics Society, vol. 17, no. 2, 2022, pp. 221-229
Abstract
Sliding mode control (SMC) is a robust control method for robot arms with nonlinear dynamics. Without an exact robot model containing the nonlinear and uncertainty terms, SMC can still achieve adequate control performance by using a high switching gain; a high switching gain, however, causes chattering. To solve this problem, SMC with a sliding perturbation observer (SMCSPO) has been studied: the observer estimates the perturbation, and compensating for it allows a lower switching gain and therefore less chattering. Nevertheless, the control gains must still be tuned to obtain better tracking performance and further reduce chattering. This paper proposes a method in which Q-learning automatically tunes the control gains of SMCSPO through iterative operation. In this tuning method, the reward of the reinforcement learning (RL) agent is the negative tracking error of the states, and the action is a change of control gain chosen to maximize the reward as the number of iterated movements increases. A simple motion test of a 7-DOF robot arm was simulated in MATLAB to verify this RL tuning algorithm. The simulation showed that the method automatically tunes the control gains of SMCSPO.
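To make the tuning loop described in the abstract concrete, the following is a minimal Python sketch of tabular Q-learning over a discretized set of switching gains, where the reward of each movement is its negative tracking error and each action nudges the gain up or down. It is illustrative only: run_episode, the toy 1-DOF plant standing in for the 7-DOF arm, the gain grid, and all hyperparameters (including the epsilon-greedy exploration) are assumptions, not the paper's MATLAB implementation.

import numpy as np

# Hypothetical stand-in for one SMCSPO-controlled movement: returns the
# accumulated absolute tracking error for a given switching gain. A toy
# 1-DOF plant with a smoothed switching controller is used here purely
# for illustration; the paper simulates a 7-DOF arm with SMCSPO.
def run_episode(gain, dt=0.01, steps=500):
    x, v = 0.0, 0.0                          # state: position, velocity
    err_sum = 0.0
    for k in range(steps):
        ref = np.sin(0.02 * k)               # reference trajectory
        e = ref - x
        u = gain * np.tanh(10.0 * e) - v     # bounded switching term + damping
        v += dt * (u + 0.5 * np.sin(x))      # plant with nonlinear disturbance
        x += dt * v
        err_sum += abs(e)
    return err_sum

# Tabular Q-learning: state = index of the current gain value,
# action = decrease / keep / increase the gain,
# reward = minus the tracking error of the resulting movement.
gains = np.linspace(1.0, 20.0, 20)
actions = [-1, 0, +1]
Q = np.zeros((len(gains), len(actions)))
alpha, gamma, eps = 0.5, 0.9, 0.2

s = len(gains) // 2                          # start from a mid-range gain
for it in range(300):                        # iterated movements
    if np.random.rand() < eps:               # epsilon-greedy exploration
        a = np.random.randint(len(actions))
    else:
        a = int(np.argmax(Q[s]))
    s_next = int(np.clip(s + actions[a], 0, len(gains) - 1))
    r = -run_episode(gains[s_next])          # reward = minus tracking error
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

print("tuned switching gain:", gains[int(np.argmax(np.max(Q, axis=1)))])

In this sketch the learned Q-table converges toward the gain that minimizes accumulated tracking error, mirroring the abstract's iterative tuning of the SMCSPO switching gain.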
Keywords
Robust Control; Reinforcement Learning; Q-Learning; Sliding Mode Control; Auto-Tuning
21 F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers," IEEE Control Systems Magazine, vol. 32, no. 6, pp. 76-105, Dec. 2012, DOI: 10.1109/MCS.2012.2214134.   DOI