Comparison of value-based Reinforcement Learning Algorithms in Cart-Pole Environment

Byeong-Chan Han; Ho-Chan Kim; Min-Jae Kang

doi:10.7236/IJIBC.2023.15.3.166

International Journal of Internet, Broadcasting and Communication

Volume 15, Issue 3 / Pages 166-175 / 2023 / 2288-4920 (pISSN) / 2288-4939 (eISSN)

The Institute of Internet, Broadcasting and Communication

Byeong-Chan Han (Dept. of Electronic Engineering, Jeju National University);
Ho-Chan Kim (Dept. of Electrical Engineering, Jeju National University);
Min-Jae Kang (Dept. of Electronic Engineering, Jeju National University)

Received: 2023.07.02
Reviewed: 2023.07.11
Published: 2023.08.31

https://doi.org/10.7236/IJIBC.2023.15.3.166

Abstract

Reinforcement learning can be applied to a wide variety of problems. Its fundamental limitation, however, is that real-world problems are often too complex for an answer to be derived within a given time. With the development of neural network technology, deep reinforcement learning, which combines deep learning with reinforcement learning, has been receiving considerable attention. In this paper, two types of neural networks, FNN and CNN, are combined with reinforcement learning, and their characteristics are compared and analyzed against two existing value-based reinforcement learning algorithms, SARSA and Q-learning.
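As a point of reference for the two existing algorithms named in the abstract, the sketch below contrasts the standard textbook temporal-difference targets of SARSA and Q-learning. It is a minimal illustration, not the authors' implementation: the function names, the two-action value list, and all numeric values are assumptions chosen only to show the difference between the two update rules.

```python
# Minimal sketch of the textbook SARSA and Q-learning targets.
# All names and numbers here are illustrative assumptions, not from the paper.

def q_learning_target(q_next, reward, gamma, done):
    """Off-policy target: bootstrap from the greedy (maximum) next action value."""
    return reward if done else reward + gamma * max(q_next)

def sarsa_target(q_next, next_action, reward, gamma, done):
    """On-policy target: bootstrap from the value of the action actually taken next."""
    return reward if done else reward + gamma * q_next[next_action]

def td_update(q_sa, target, alpha):
    """Move the current estimate Q(s, a) a fraction alpha toward the target."""
    return q_sa + alpha * (target - q_sa)

# Hypothetical transition with two actions (e.g. push left / push right).
q_next = [0.5, 2.0]                 # assumed Q(s', a') for both actions
reward, gamma, alpha = 1.0, 0.9, 0.1

# Q-learning uses the best next action (index 1) regardless of what is taken.
q_target = q_learning_target(q_next, reward, gamma, done=False)

# SARSA uses the action the behaviour policy chose; assume it explored action 0.
s_target = sarsa_target(q_next, 0, reward, gamma, done=False)

print(q_target, s_target)
print(td_update(1.0, q_target, alpha), td_update(1.0, s_target, alpha))
```

The two targets coincide whenever the next action taken is the greedy one; they differ only on exploratory steps, which is the usual on-policy versus off-policy distinction between the two algorithms.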

Acknowledgement

This research was supported by the 2023 scientific promotion program funded by Jeju National University.

