Search | Korea Science

Comparison of value-based Reinforcement Learning Algorithms in Cart-Pole Environment

Byeong-Chan Han;Ho-Chan Kim;Min-Jae Kang
- International Journal of Internet, Broadcasting and Communication / v.15 no.3 / pp.166-175 / 2023
Reinforcement learning can be applied to a wide variety of problems. Its fundamental limitation, however, is that real-world problems are often too complex for an answer to be derived within a given time. With the development of neural network technology, research on deep reinforcement learning, which combines deep learning with reinforcement learning, is receiving a great deal of attention. In this paper, two types of neural networks are combined with reinforcement learning, and their characteristics are compared and analyzed against existing value-based reinforcement learning algorithms. The two types of neural networks are FNN and CNN, and the existing reinforcement learning algorithms are SARSA and Q-learning.
https://doi.org/10.7236/IJIBC.2023.15.3.166
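
The two existing algorithms named in this abstract, SARSA and Q-learning, differ only in the value they bootstrap from. The following is a minimal tabular sketch of the textbook update rules, not the authors' implementation; the function names, the NumPy table `Q`, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the greedy (max-value) action in s_next."""
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

if __name__ == "__main__":
    # Hypothetical 2-state, 2-action table; state 1 has action values 1.0 and 3.0.
    Q_sarsa = np.zeros((2, 2))
    Q_sarsa[1] = [1.0, 3.0]
    Q_qlearn = Q_sarsa.copy()

    # Same transition (s=0, a=0, r=0, s_next=1); SARSA's next action is the worse one.
    sarsa_update(Q_sarsa, 0, 0, 0.0, 1, 0, alpha=0.5, gamma=1.0)
    q_learning_update(Q_qlearn, 0, 0, 0.0, 1, alpha=0.5, gamma=1.0)
    print(Q_sarsa[0, 0], Q_qlearn[0, 0])
```

In the deep variants the abstract refers to, the table `Q` would be replaced by a neural network (for example an FNN or CNN) trained toward the same targets.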

Reinforcement Learning using Propagation of Goal-State-Value (목표상태 값 전파를 이용한 강화 학습)

Kim, Byeong-Cheon;Yun, Byeong-Ju
- The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1303-1311 / 1999
To learn in dynamic environments, reinforcement learning algorithms such as Q-learning, TD(0)-learning, and TD(λ)-learning have been proposed. However, most of them learn very slowly because the reinforcement value is given only when they reach the goal state. In this thesis, we propose a reinforcement learning method that can approach the goal state quickly in maze environments. The proposed method separates learning into global learning and local learning. Global learning uses the replacing eligibility trace method to search for the goal state. Local learning propagates the goal-state value found through global learning to neighboring states and then searches for the goal state among those neighboring states. Experiments show that the proposed method finds an optimal solution faster than other reinforcement learning methods such as Q-learning, TD(0)-learning, and TD(λ)-learning.
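
The replacing eligibility trace mentioned in this abstract can be illustrated with a generic tabular TD(λ) sketch. This is the standard textbook form, not the global/local method proposed in the paper; the function name, the episode format `(state, reward, next_state, done)`, and the parameter defaults are assumptions made for illustration.

```python
import numpy as np

def td_lambda_replacing(episode, n_states, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) state-value estimation with replacing traces.

    episode: list of (state, reward, next_state, done) transitions.
    """
    V = np.zeros(n_states)   # state-value estimates
    e = np.zeros(n_states)   # eligibility traces
    for s, r, s_next, done in episode:
        # One-step TD error; terminal transitions do not bootstrap.
        target = r if done else r + gamma * V[s_next]
        delta = target - V[s]
        e *= gamma * lam      # decay every trace
        e[s] = 1.0            # replacing trace: reset to 1 instead of accumulating
        V += alpha * delta * e  # credit all recently visited states at once
    return V

if __name__ == "__main__":
    # Hypothetical 3-state chain 0 -> 1 -> 2 (goal), reward 1 only on reaching the goal.
    episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
    print(td_lambda_replacing(episode, 3, alpha=0.5, gamma=1.0, lam=1.0))
```

With traces, the goal reward reaches state 0 within a single episode, whereas a one-step method would update only state 1. This is the slow-propagation problem the abstract describes.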

Online Reinforcement Learning to Search the Shortest Path in Maze Environments (미로 환경에서 최단 경로 탐색을 위한 실시간 강화 학습)

Kim, Byeong-Cheon;Kim, Sam-Geun;Yun, Byeong-Ju
- The KIPS Transactions:PartB / v.9B no.2 / pp.155-162 / 2002
Reinforcement learning is a learning method that learns by trial and error through interaction with dynamic environments. It is classified into online reinforcement learning and delayed reinforcement learning. In this paper, we propose an online reinforcement learning system (ONRELS: ONline REinforcement Learning System). ONRELS updates the estimated values of all selectable (state, action) pairs before making a state transition from the current state. ONRELS first compresses the state space of the maze environment and then learns by trial-and-error interaction with the compressed environment. Experiments show that ONRELS can find the shortest path faster than Q-learning using TD-error and $Q(\lambda)$-learning using $TD(\lambda)$ in maze environments.
https://doi.org/10.3745/KIPSTB.2002.9B.2.155

Reinforcement Learning Using State Space Compression (상태 공간 압축을 이용한 강화학습)

Kim, Byeong-Cheon;Yun, Byeong-Ju
- The Transactions of the Korea Information Processing Society / v.6 no.3 / pp.633-640 / 1999
Reinforcement learning learns through trial-and-error interaction with a dynamic environment. Therefore, in dynamic environments, reinforcement learning methods such as Q-learning and TD (Temporal Difference) learning learn faster than conventional stochastic learning methods. However, because many of the proposed reinforcement learning algorithms receive the reinforcement value only when the learning agent has reached its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the COMREL (COMpressed REinforcement Learning) algorithm for quickly finding the shortest path in a maze environment. COMREL selects the candidate states that can guide the shortest path in the compressed maze environment and learns only those candidate states to find the shortest path. Compared with the existing Q-learning and Prioritized Sweeping algorithms, COMREL shortened the learning time considerably.

A Study on Performance Improvement of Evolutionary Algorithms Using Reinforcement Learning (강화학습을 이용한 진화 알고리즘의 성능개선에 대한 연구)

이상환;심귀보
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.10a / pp.420-426 / 1998
Evolutionary algorithms are probabilistic optimization algorithms based on the model of natural evolution. Recently, extensive efforts have been made to improve their performance. In this paper, we introduce research on improving the convergence rate and search capability of evolutionary algorithms by using reinforcement learning. After providing an introduction to evolutionary algorithms and reinforcement learning, we present adaptive genetic algorithms, reinforcement genetic programming, and reinforcement evolution strategies, each of which is combined with reinforcement learning. Adaptive genetic algorithms generate the mutation probability of each locus by interacting with the environment according to reinforcement learning. Reinforcement genetic programming executes crossover and mutation operations based on the reinforcement and inhibition mechanism of reinforcement learning. Reinforcement evolution strategies use the variance in fitness caused by mutation to produce reinforcement signals that estimate and control the step length.

Goal-Directed Reinforcement Learning System (목표지향적 강화학습 시스템)

Lee, Chang-Hoon
- The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.265-270 / 2010
Reinforcement learning learns through trial-and-error interaction with a dynamic environment. Therefore, in dynamic environments, reinforcement learning methods such as TD-learning and TD($\lambda$)-learning learn faster than conventional stochastic learning methods. However, because many of the proposed reinforcement learning algorithms receive the reinforcement value only when the learning agent has reached its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the GDRLS algorithm for finding the shortest path faster in a maze environment. GDRLS selects the candidate states that can guide the shortest path in the maze environment and learns only those candidate states to find the shortest path. Experiments show that GDRLS can find the shortest path faster than TD-learning and TD($\lambda$)-learning in a maze environment.

Path Planning of Unmanned Aerial Vehicle based Reinforcement Learning using Deep Q Network under Simulated Environment (시뮬레이션 환경에서의 DQN을 이용한 강화 학습 기반의 무인항공기 경로 계획)

Lee, Keun Hyoung;Kim, Shin Dug
- Journal of the Semiconductor & Display Technology / v.16 no.3 / pp.127-130 / 2017
In this research, we present a path planning method for the autonomous flight of unmanned aerial vehicles (UAVs) through reinforcement learning in a simulated environment. We design a simulator for reinforcement learning of UAVs and implement an interface that makes the Deep Q-Network (DQN) compatible with the simulator. We perform reinforcement learning through the simulator and DQN, using the Q-learning algorithm, which is one kind of reinforcement learning algorithm. Through experiments, we verify the performance of the DQN-simulator. Finally, we evaluate the learning results and suggest a path planning strategy using reinforcement learning.

A Function Approximation Method for Q-learning of Reinforcement Learning (강화학습의 Q-learning을 위한 함수근사 방법)

이영아;정태충
- Journal of KIISE:Software and Applications / v.31 no.11 / pp.1431-1438 / 2004
Reinforcement learning learns policies for accomplishing a task's goal from experience gained through interaction between an agent and its environment. Q-learning, a basic reinforcement learning algorithm, suffers from the curse of dimensionality and from slow learning in the initial stage of learning. To solve these problems, new function approximation methods suitable for reinforcement learning should be studied. In this paper, we suggest the Fuzzy Q-Map algorithm, which is based on online fuzzy clustering. Fuzzy Q-Map is a function approximation method suited to reinforcement learning that can learn online and express the uncertainty of the environment. We tested Fuzzy Q-Map on the mountain car problem, and the results show that learning is accelerated in the initial stage of learning.

Adaptive Modular Q-Learning for Agents' Dynamic Positioning in Robot Soccer Simulation

Kwon, Ki-Duk;Kim, In-Cheol
- Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.149.5-149 / 2001
The robot soccer simulation game is a dynamic multi-agent environment. In this paper, we suggest a new reinforcement learning approach to each agent's dynamic positioning in such a dynamic environment. Reinforcement learning is a form of machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. Reinforcement learning therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms like Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless ...

Predicting bond strength of corroded reinforcement by deep learning

Tanyildizi, Harun
- Computers and Concrete / v.29 no.3 / pp.145-159 / 2022
In this study, extreme learning machine and deep learning models were devised to estimate the bond strength of corroded reinforcement in concrete. Six inputs and one output were used. The compressive strength, concrete cover, bond length, steel type, diameter of the steel bar, and corrosion level were selected as the input variables, and the bond strength results were used as the output variable. Moreover, analysis of variance (ANOVA) was used to determine the effect of the input variables on the bond strength of corroded reinforcement in concrete. The prediction results were compared with the experimental results and with each other. The extreme learning machine and the deep learning models estimated the bond strength with 99.81% and 99.99% accuracy, respectively. This study found that the deep learning model can estimate the bond strength of corroded reinforcement with higher accuracy than the extreme learning machine model. The ANOVA results showed that the corrosion level is the input variable that most affects the bond strength of corroded reinforcement in concrete.
https://doi.org/10.12989/cac.2022.29.3.145
