Search | Korea Science

Kernel-based actor-critic approach with applications

Chu, Baek-Suk;Jung, Keun-Woo;Park, Joo-Young
- International Journal of Fuzzy Logic and Intelligent Systems / v.11 no.4 / pp.267-274 / 2011
Recently, actor-critic methods have drawn significant interest in the area of reinforcement learning, and several algorithms have been studied along the lines of the actor-critic strategy. In this paper, we consider a new type of actor-critic algorithm employing kernel methods, which have recently been shown to be very effective tools in various fields of machine learning, and we investigate combining the actor-critic strategy with kernel methods. More specifically, this paper studies actor-critic algorithms that use kernel-based least-squares estimation and policy gradients; in the critic part, a sliding-window-based kernel least-squares method is used, which leads to fast and efficient value-function estimation in a nonparametric setting. The applicability of the considered algorithms is illustrated via a robot locomotion problem and a tunnel ventilation control problem.
https://doi.org/10.5391/IJFIS.2011.11.4.267

Control of Crawling Robot using Actor-Critic Fuzzy Reinforcement Learning (액터-크리틱 퍼지 강화학습을 이용한 기는 로봇의 제어)

Moon, Young-Joon;Lee, Jae-Hoon;Park, Joo-Young
- Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.519-524 / 2009
Recently, reinforcement learning methods have drawn much interest in the area of machine learning. Dominant approaches in reinforcement learning research include the value-function approach, the policy search approach, and the actor-critic approach; pertinent to this paper are algorithms for problems with continuous states and continuous actions that follow the actor-critic strategy. In particular, this paper presents a method combining the so-called ACFRL (actor-critic fuzzy reinforcement learning), an actor-critic type of reinforcement learning based on fuzzy theory, with RLS-NAC, which is based on RLS filters and natural actor-critic methods. The presented method is applied to a control problem for crawling robots, and results comparing learning performance are reported.
https://doi.org/10.5391/JKIIS.2009.19.4.519

Actor-Critic Algorithm with Transition Cost Estimation

Sergey, Denisov;Lee, Jee-Hyong
- International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.4 / pp.270-275 / 2016
We present an approach for accelerating the actor-critic algorithm for reinforcement learning with a continuous action space. The actor-critic algorithm has already proved its robustness to infinitely large action spaces in various high-dimensional environments. Despite that success, the main problem of the actor-critic algorithm remains the same: the speed of convergence to the optimal policy. In high-dimensional state and action spaces, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we suggest a search-accelerating function that speeds up convergence of the algorithm so that it reaches the optimal policy faster. In our method, we assume that actions may have their own distribution of preference that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it would be more efficient if actions were taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using an Educational Process Mining dataset with records of students' course learning processes and their grades.
https://doi.org/10.5391/IJFIS.2016.16.4.270

A new method for automatic areal feature matching based on shape similarity using CRITIC method (CRITIC 방법을 이용한 형상유사도 기반의 면 객체 자동매칭 방법)

Kim, Ji-Young;Huh, Yong;Kim, Doe-Sung;Yu, Ki-Yun
- Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.29 no.2 / pp.113-121 / 2011
In this paper, we propose a method for automatically matching areal features based on similarity computed from spatial information. To this end, we extract candidate matching pairs that intersect between two different spatial datasets and then measure a shape similarity, calculated as a weighted sum of the matching criteria, with the weights derived automatically by the CRITIC method. Matching pairs are selected when their similarity exceeds a threshold determined by outlier detection with an adjusted boxplot on training data. After applying this method to two distinct spatial datasets, a digital topographic map and a street-name address base map, we confirmed in visual evaluation that the matched buildings were similar in shape and overlapped over a large area, and in statistical evaluation the F-measure was high, at 0.932.
https://doi.org/10.7848/ksgpc.2011.29.2.113
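
The weighting step described in the abstract above can be illustrated with a minimal sketch of the standard CRITIC procedure (min-max normalization, per-criterion standard deviation, and inter-criterion correlation). This is not the authors' code: the function names `critic_weights` and `weighted_similarity`, the use of the population standard deviation, and the sample numbers are assumptions made for illustration, and every criterion is assumed to be benefit-type with at least two distinct values.

```python
import math


def critic_weights(matrix):
    """Return CRITIC weights for a matrix of rows=candidates, columns=criteria.

    Assumes benefit-type criteria and that no column is constant.
    """
    n, m = len(matrix), len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(m)]

    # Min-max normalize each criterion to [0, 1].
    norm = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm.append([(v - lo) / (hi - lo) for v in col])

    means = [sum(col) / n for col in norm]
    # Population standard deviation (an assumption; sample std is also used).
    stds = [
        math.sqrt(sum((v - mu) ** 2 for v in col) / n)
        for col, mu in zip(norm, means)
    ]

    def corr(j, k):
        # Pearson correlation between normalized criteria j and k.
        cov = sum(
            (a - means[j]) * (b - means[k]) for a, b in zip(norm[j], norm[k])
        ) / n
        return cov / (stds[j] * stds[k])

    # Information content: contrast (std) times conflict (1 - correlation).
    info = [stds[j] * sum(1.0 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    return [c / total for c in info]


def weighted_similarity(criteria_values, weights):
    """Weighted sum of per-criterion similarity scores for one candidate pair."""
    return sum(w * v for w, v in zip(weights, criteria_values))


# Hypothetical scores: three candidate pairs, three matching criteria.
scores = [[1.0, 2.0, 9.0], [2.0, 4.0, 1.0], [3.0, 6.0, 5.0]]
weights = critic_weights(scores)
```

In this toy input the first two criteria are perfectly correlated, so they share their weight, while the third criterion, which conflicts with both, receives the largest weight.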

Improved Deep Q-Network Algorithm Using Self-Imitation Learning (Self-Imitation Learning을 이용한 개선된 Deep Q-Network 알고리즘)

Sunwoo, Yung-Min;Lee, Won-Chang
- Journal of IKEEE / v.25 no.4 / pp.644-649 / 2021
Self-Imitation Learning is a simple off-policy actor-critic algorithm that helps an agent find an optimal policy by exploiting past good experiences. When Self-Imitation Learning is combined with reinforcement learning algorithms that have an actor-critic architecture, it improves performance in various game environments. However, its applications have been limited to reinforcement learning algorithms with an actor-critic architecture. In this paper, we propose a method of applying Self-Imitation Learning to Deep Q-Network, a value-based deep reinforcement learning algorithm, and train it in various game environments. By comparing the training results of the proposed algorithm with those of the ordinary Deep Q-Network, we also show that Self-Imitation Learning can improve the performance of Deep Q-Network.
https://doi.org/10.7471/ikeee.2021.25.4.644

Adaptive Actor-Critic Learning of Mobile Robots Using Actual and Simulated Experiences

Rafiuddin Syam;Keigo Watanabe;Kiyotaka Izumi;Kazuo Kiguchi;Jin, Sang-Ho
- Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.43.6-43 / 2001
In this paper, we describe an actor-critic method as a kind of temporal difference (TD) algorithm. The value function is regarded as a current estimator, in which two value functions have different inputs: one is an actual experience; the other is a simulated experience obtained through a predictive model. Thus, the parameter updates for the actor and critic parts are based on actual and simulated experiences, where the critic is constructed from a radial-basis function neural network (RBFNN) and the actor is composed of a kinematic-based controller. As an example application of the present method, a tracking control problem for the position coordinates and azimuth of a nonholonomic mobile robot is considered. The effectiveness is illustrated by a simulation.

Robot Locomotion via RLS-based Actor-Critic Learning (RLS 기반 Actor-Critic 학습을 이용한 로봇이동)

Kim, Jong-Ho;Kang, Dae-Sung;Park, Joo-Young
- Journal of the Korean Institute of Intelligent Systems / v.15 no.7 / pp.893-898 / 2005
Because it requires only a small amount of computation and can handle stochastic policies explicitly, the actor-critic algorithm, a class of reinforcement learning methods, has recently attracted a lot of interest in the area of artificial intelligence. The actor-critic network is composed of the actor network for selecting control inputs and the critic network for estimating value functions. In the training stage, the actor and critic networks adapt their parameters in order to select good control inputs and to approximate the value functions accurately as fast as possible. In this paper, we consider a new actor-critic algorithm employing an RLS (Recursive Least Squares) method for critic learning and policy gradients for actor learning. The applicability of the considered algorithm is illustrated with experiments on a two-link robot arm.
https://doi.org/10.5391/JKIIS.2005.15.7.893
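
To make the idea of an RLS-trained critic concrete, here is a minimal sketch of a recursive least-squares TD(0) update for a linear value function. It is a generic textbook-style formulation and is only assumed to resemble the critic in the paper above; the function name `rls_td_update`, the one-hot features, the initial scale of `P`, and the two-state toy chain are all illustrative choices.

```python
def rls_td_update(theta, P, phi, phi_next, reward, gamma=0.9):
    """One RLS-TD(0) step for V(s) = theta . phi(s).

    theta: weight list, P: running inverse of sum(phi * (phi - gamma*phi')^T).
    Returns the updated (theta, P) without modifying the inputs.
    """
    n = len(theta)
    # d = phi(s) - gamma * phi(s')
    d = [phi[i] - gamma * phi_next[i] for i in range(n)]
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    d_P = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(d[i] * P_phi[i] for i in range(n))
    # r - d.theta equals the TD error r + gamma*V(s') - V(s).
    td_error = reward - sum(d[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + P_phi[i] * td_error / denom for i in range(n)]
    # Sherman-Morrison rank-one update of the inverse matrix.
    new_P = [
        [P[i][j] - P_phi[i] * d_P[j] / denom for j in range(n)]
        for i in range(n)
    ]
    return new_theta, new_P


def evaluate_toy_chain(num_steps=200, gamma=0.9):
    """Hypothetical two-state loop: 0 -> 1 (reward 0), 1 -> 0 (reward 1)."""
    features = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features per state
    theta = [0.0, 0.0]
    P = [[1000.0, 0.0], [0.0, 1000.0]]   # large initial P = weak prior
    state = 0
    for _ in range(num_steps):
        next_state = 1 - state
        reward = 1.0 if state == 1 else 0.0
        theta, P = rls_td_update(
            theta, P, features[state], features[next_state], reward, gamma
        )
        state = next_state
    return theta


values = evaluate_toy_chain()
```

For this toy chain the exact values satisfy V(0) = 0.9 V(1) and V(1) = 1 + 0.9 V(0), so the estimates should approach roughly 4.74 and 5.26. In an actor-critic setting, the TD error computed here would also drive the policy-gradient update of the actor.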

Analysis of Reinforcement Learning Methods for BS Switching Operation (기지국 상태 조정을 위한 강화 학습 기법 분석)

Park, Hyebin;Lim, Yujin
- Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.8 no.2 / pp.351-358 / 2018
Reinforcement learning is a machine learning method that aims to determine a policy yielding optimal actions in dynamic and stochastic environments. However, reinforcement learning has high computational complexity and needs a long time to reach a solution, so it is not easily applicable to uncertain and continuous environments. To tackle the complexity problem, the AC (actor-critic) method is used, which separates an action-value function into a value function and an action decision policy. In addition, in the transfer learning method, knowledge constructed in one environment is adapted to another environment, which reduces the learning time of a reinforcement learning method. In this paper, we present the AC method and the transfer learning method as ways to address these problems of reinforcement learning. Finally, we analyze a case study in which a transfer learning method is used to solve the BS (base station) switching problem in wireless access networks.
https://doi.org/10.21742/AJMAHS.2018.02.32
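
The separation into a value function (critic) and an action decision policy (actor) mentioned in the abstract above can be sketched on a deliberately tiny problem. The following is an illustrative one-step actor-critic with a softmax policy on a single-state, two-action task; it is not the BS switching model from the paper, and the function name `train_actor_critic`, the step sizes, and the reward values are assumptions.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(rewards, steps=2000, alpha=0.1, beta=0.1, seed=0):
    """Single-state actor-critic; rewards[a] is the fixed reward of action a."""
    rng = random.Random(seed)
    value = 0.0                    # critic: estimated value of the only state
    prefs = [0.0] * len(rewards)   # actor: softmax action preferences
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        # Episodes last one step, so the TD target is just the reward.
        td_error = rewards[action] - value
        value += alpha * td_error  # critic update
        for a in range(len(prefs)):
            # Gradient of log softmax: indicator(a == action) - probs[a].
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += beta * td_error * grad  # actor update
    return value, softmax(prefs)


# Hypothetical task: action 1 pays 1.0, action 0 pays nothing.
value, policy = train_actor_critic([0.0, 1.0])
```

The critic only has to track one scalar here, and the actor never needs an explicit action-value table; the TD error supplied by the critic is the only learning signal the actor uses.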

A Study on Evaluation of the Priority Orders for the Establishment of Maritime Courts Using Maritime Casualties Counts Based on Integrated ELECTRE-CRITIC-ISM (통합 ELECTRE-CRITIC-ISM법 기반 해양사고 발생건수를 이용한 해사법원 설치 우선순위 평가에 관한 연구)

Jang, Woon-Jae
- Journal of the Korean Society of Marine Environment & Safety / v.26 no.6 / pp.624-633 / 2020
Currently, the Incheon and Busan local governments are arguing over the establishment of a maritime court. This study aims to develop a model that evaluates the priority order for establishing maritime courts using maritime casualty counts, based on the integrated ELECTRE-CRITIC-ISM technique, and to verify its usefulness for the establishment of maritime courts in Korea. First, a total of 22 ports, excluding nine ports where maritime accident data were integrated and managed among the 31 international trade ports, were matched with the jurisdictions of six alternative high courts. Second, the CRITIC method was used to calculate the weights of the evaluation factors, namely the numbers of maritime casualties over a 5-year period, and these weights were combined with the ELECTRE method. Finally, the ELECTRE and ISM methods were used to analyze the concordance and discordance between high courts and to evaluate the priority order considering the fluctuations in maritime casualty counts. In the final evaluation, which considers the mean values of the fluctuations in maritime casualty counts, the Busan High Court ranked first, the Gwangju High Court second, the Seoul High Court third, the Daejeon and Daegu High Courts fourth (tied), and the Suwon High Court sixth. Therefore, it is necessary to preferentially establish a maritime court in the jurisdiction of the Busan High Court.
https://doi.org/10.7837/kosomes.2020.26.6.624

Suspension Control using Reinforcement Learning (강화학습에 의한 현가장치의 제어)

Jeong, Gyu-Baek;Mun, Yeong-Jun;Park, Ju-Yeong
- Proceedings of the Korean Institute of Intelligent Systems Conference / 2007.11a / pp.163-166 / 2007
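Recently, research on reinforcement learning has been actively pursued in the field of artificial intelligence both in Korea and abroad. In this paper, we apply a reinforcement learning technique using RLS-based NAC (natural actor-critic) to the control of an active suspension and verify its performance through simulation.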
