
Control of Crawling Robot using Actor-Critic Fuzzy Reinforcement Learning

  • 문영준 (Future Research Institute, Daewoo Shipbuilding & Marine Engineering) ;
  • 이재훈 (Department of Control and Instrumentation Engineering, Korea University) ;
  • 박주영 (Department of Control and Instrumentation Engineering, Korea University)
  • Received : 2009.04.06
  • Accepted : 2009.06.26
  • Published : 2009.08.25

Abstract


Recently, reinforcement learning methods have drawn much interest in the area of machine learning. The dominant approaches in reinforcement learning research include the value-function approach, the policy-search approach, and the actor-critic approach; of these, this paper is concerned with algorithms developed within the actor-critic framework for problems with continuous states and continuous actions. In particular, the paper focuses on a method combining ACFRL (actor-critic fuzzy reinforcement learning), an actor-critic-type reinforcement learning algorithm based on fuzzy theory, with RLS-NAC, which is based on RLS filters and the natural actor-critic method. The presented method is applied to a control problem for crawling robots, and some results obtained from a comparison of learning performance are reported.
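The actor-critic scheme the abstract refers to can be illustrated with a minimal sketch: an actor (a parameterized Gaussian policy) is updated by a likelihood-ratio policy gradient scaled by the TD error, while a critic (a linear value function) is updated by plain TD learning. Everything here is illustrative: the toy 1-D tracking task, the polynomial features, and the learning rates are assumptions, standing in for the paper's fuzzy-rule basis functions and RLS-NAC updates.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(s):
    # Simple polynomial features standing in for fuzzy basis functions.
    return np.array([1.0, s, s * s])

theta = np.zeros(3)   # actor weights (policy mean)
w = np.zeros(3)       # critic weights (state value)
sigma = 0.5           # fixed exploration noise
alpha_a, alpha_c, gamma = 0.01, 0.05, 0.95

def step(s, a):
    # Hypothetical dynamics/reward: reward peaks when the action tracks the state.
    r = -(a - s) ** 2
    s_next = float(np.clip(s + 0.1 * a, -1.0, 1.0))
    return r, s_next

s = 0.5
for _ in range(5000):
    phi = features(s)
    mu = phi @ theta                         # policy mean at current state
    a = mu + sigma * rng.standard_normal()   # sample exploratory action
    r, s_next = step(s, a)
    # TD error from the linear critic.
    delta = r + gamma * (features(s_next) @ w) - phi @ w
    w += alpha_c * delta * phi               # critic: TD(0) update
    # Actor: likelihood-ratio gradient of the Gaussian policy, scaled by delta.
    theta += alpha_a * delta * (a - mu) / sigma**2 * phi
    s = s_next
```

In the paper's actual setting, the actor is a fuzzy controller and the critic employs RLS-filter-based natural-gradient estimates (RLS-NAC); this sketch replaces both with plain stochastic-gradient updates for brevity.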

Keywords

References

  1. Q. Yang, J. B. Vance, and S. Jagannathan, 'Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks,' IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, vol. 38, no. 4, pp. 994-1001, 2008 https://doi.org/10.1109/TSMCB.2008.926607
  2. J. Valasek, J. Doebbler, M. D. Tandale, and A. J. Meade, 'Improved adaptive-reinforcement learning control for morphing unmanned air vehicles,' IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 1014-1020, 2008 https://doi.org/10.1109/TSMCB.2008.922018
  3. K.-H. Park, Y.-J. Kim, and J.-H. Kim, 'Modular Q-learning based multi-agent cooperation for robot soccer,' Robotics and Autonomous Systems, vol. 35, no. 2, pp. 109-122, 2001
  4. J. Moody and M. Saffell, 'Learning to trade via direct reinforcement,' IEEE Transactions on Neural Networks, vol. 12, no. 4, pp. 875-889, 2001 https://doi.org/10.1109/72.935097
  5. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998
  6. H. R. Berenji and D. Vengerov, 'A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters,' IEEE Transactions on Fuzzy Systems, vol. 11, no. 4, August 2003
  7. X. Xu, H. He, and D. Hu, 'Efficient reinforcement learning using recursive least-squares methods,' Journal of Artificial Intelligence Research, vol. 16, pp. 259-292, 2002
  8. R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, 'Policy gradient methods for reinforcement learning with function approximation', Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063, 2000
  9. V. Konda and J. N. Tsitsiklis, 'Actor-critic algorithms,' SIAM Journal on Control and Optimization, vol. 42, no. 4, pp. 1143-1166, 2003 https://doi.org/10.1137/S0363012901385691
  10. J. Peters, S. Vijayakumar, and S. Schaal, 'Reinforcement learning for humanoid robotics', In Proceedings of the Third IEEE-RAS International Conference on Humanoid Robots, 2003
  11. J. Park, J. Kim, and D. Kang, 'An RLS-based natural actor-critic algorithm for locomotion of a two-linked robot arm,' Lecture Notes in Artificial Intelligence, vol. 3801, pp. 65-72, December 2005
  12. H. Kimura, K. Miyazaki, and S. Kobayashi, 'Reinforcement learning in POMDPs with function approximation,' In Proceedings of the 14th International Conference on Machine Learning (ICML 1997), pp. 152-160, 1997
  13. 김종호, A Study on System Control Using Reinforcement Learning Algorithms, Master's thesis, Department of Control and Instrumentation Engineering, Korea University, 2005
  14. L. X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis, Prentice-Hall, 1994
  15. 박종진, 최규석, Fuzzy Control Systems, Kyowoosa, 2001
  16. T. Takagi and M. Sugeno, 'Fuzzy identification of systems and its applications to modeling and control,' IEEE Transactions on Systems, Man, and Cybernetics, vol. 15, pp. 116-132, 1985
  17. 박주영, 정규백, 문영준, 'Performance comparison of crawling robots trained by reinforcement learning,' Journal of the Korean Institute of Fuzzy and Intelligent Systems, vol. 17, no. 1, pp. 33-36, 2007