Integrated Search | Korea Science

Variational Autoencoder-based Assembly Feature Extraction Network for Rapid Learning of Reinforcement Learning

윤준완;나민우;송재복
- The Journal of Korea Robotics Society / Vol. 18, No. 3 / pp. 352-357 / 2023
Because robotic assembly in an unstructured environment is very difficult with existing control methods, approaches based on artificial intelligence, such as reinforcement learning, have been studied. However, operating a robot for long periods to learn in a real environment adversely affects the robot, so a method to shorten the learning time is needed. To this end, this study proposes a method based on a pre-trained neural network. The method learned about 3 times faster than existing methods, and the stability of the reward during learning also increased. Furthermore, it generated a better policy than training without a pre-trained neural network. Using the proposed reinforcement learning-based assembly trajectory generator, 100 attempts were made to assemble a power connector within a random error of 4.53 mm in width and 3.13 mm in length, and all 100 succeeded.
https://doi.org/10.7746/jkros.2023.18.3.352

Design of Multiobjective Satisfactory Fuzzy Logic Controller using Reinforcement Learning

Kang, Dong-Oh;Zeungnam Bien
- Proceedings of the Institute of Electronics Engineers of Korea Conference / Institute of Electronics Engineers of Korea, 2000 ITC-CSCC -2 / pp. 677-680 / 2000
The reinforcement learning technique is extended to solve the multiobjective control problem for uncertain dynamic systems. A multiobjective adaptive critic structure is proposed to realize a max-min method in the reinforcement learning process. The proposed technique is also applied to the design of a multiobjective satisfactory fuzzy logic controller in which the fuzzy logic subcontrollers are assumed to be derived from human experts. Simulation results are given to show the effectiveness of the proposed method.

Performance Improvement of Evolution Strategies using Reinforcement Learning

Sim, Kwee-Bo;Chun, Ho-Byung
- International Journal of Fuzzy Logic and Intelligent Systems / Vol. 1, No. 1 / pp. 125-130 / 2001
In this paper, we propose a new type of evolution strategy combined with reinforcement learning. We use the variance of fitness caused by mutation to form reinforcement signals that estimate and control the step length of mutation. With the proposed method, the convergence rate is improved. We also use Cauchy-distributed mutation to improve global convergence, since Cauchy-distributed mutation is more likely to escape from a local minimum or move away from a plateau. After an outline of the history of evolution strategies, we explain how evolution strategies can be combined with reinforcement learning, a combination named reinforcement evolution strategies. The performance of the proposed method is estimated by comparison with conventional evolution strategies on several test problems.
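The role of Cauchy-distributed mutation and an adaptive step length can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the reinforcement signal, so the sketch substitutes a simple success-based rule (lengthen the step after an improvement, shorten it otherwise). The function names, the multipliers 1.5 and 0.9, and the sphere test function are all assumptions made for illustration.

```python
import math
import random


def cauchy_sample(rng, scale):
    """Draw from a zero-centred Cauchy distribution by inverse-CDF sampling."""
    u = rng.random()
    return scale * math.tan(math.pi * (u - 0.5))


def sphere(x):
    """Hypothetical test problem: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(fitness, x0, step=1.0, iters=200, seed=0):
    """(1+1) evolution strategy with Cauchy mutation (illustrative only)."""
    rng = random.Random(seed)
    parent, parent_fit = list(x0), fitness(x0)
    history = [parent_fit]
    for _ in range(iters):
        # Heavy-tailed mutation: occasional long jumps help leave plateaus.
        child = [v + cauchy_sample(rng, step) for v in parent]
        child_fit = fitness(child)
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
            step *= 1.5  # assumed rule: lengthen the step after a success
        else:
            step *= 0.9  # assumed rule: shorten the step after a failure
        history.append(parent_fit)
    return parent, parent_fit, history


best, best_fit, history = one_plus_one_es(sphere, [3.0, -2.0], seed=1)
print(best_fit)
```

Because the strategy is elitist, the recorded fitness never gets worse from one iteration to the next; the heavy tails of the Cauchy distribution are what allow the occasional long jump that a Gaussian mutation would rarely produce.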

Comparison of Reinforcement Learning Activation Functions to Improve the Performance of the Racing Game Learning Agent

Lee, Dongcheul
- Journal of Information Processing Systems / Vol. 16, No. 5 / pp. 1074-1082 / 2020
Recently, research has been actively conducted to create artificial intelligence agents that learn games through reinforcement learning. Several factors determine performance when an agent learns a game, and the choice of activation function is one of the important ones. This paper compares and evaluates which activation function gives the best results when an agent learns a 2D racing game through reinforcement learning. We built the agent using a reinforcement learning algorithm and a neural network, and evaluated the activation functions by switching between them in the network. We measured the reward, the output of the advantage function, and the output of the loss function during training and testing. From this performance evaluation, we identified the best activation function for the agent to learn the game. The difference between the best and the worst was 35.4%.
https://doi.org/10.3745/JIPS.02.0141

A Reinforcement Learning with CMAC

Kwon, Sung-Gyu
- International Journal of Fuzzy Logic and Intelligent Systems / Vol. 6, No. 4 / pp. 271-276 / 2006
To implement generalization of value functions in Adaptive Search Element (ASE)-reinforcement learning, CMAC (Cerebellar Model Articulation Controller) is integrated into the ASE controller. The ASE-reinforcement learning scheme is briefly reviewed to discuss how CMAC is integrated into the ASE controller. Neighbourhood Sequential Training for CMAC is used to establish the look-up table and to produce discrete control outputs. In computer simulation, an ASE controller and a couple of ASE-CMAC neural networks are trained to balance an inverted pendulum on a cart. The number of trials until the controllers are established and the learning performance of the controllers are evaluated, showing that the generalization ability of CMAC improves the speed of ASE-reinforcement learning enough to realize the cart-pole control system.
https://doi.org/10.5391/IJFIS.2006.6.4.271

Controller Learning Method of Self-driving Bicycle Using State-of-the-art Deep Reinforcement Learning Algorithms

Choi, Seung-Yoon;Le, Tuyen Pham;Chung, Tae-Choong
- Journal of the Korea Society of Computer and Information / Vol. 23, No. 10 / pp. 23-31 / 2018
Recently, there have been many studies on machine learning, and reinforcement learning in particular is being actively studied. In this study, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, a state-of-the-art deep reinforcement learning method. We redefine the reward function for the bicycle dynamics and the neural network used to train the agent. Unlike the existing method, the proposed method keeps the bicycle from falling over and lets it reach a more distant given destination. For performance evaluation, we tested the proposed algorithm in various environments such as fixed speed, random, target point, and not determined. The results confirm that the proposed algorithm performs better than the conventional neural network algorithms NAF and PPO.
https://doi.org/10.9708/jksci.2018.23.10.023

Reinforcement Learning Approach to Agents Dynamic Positioning in Robot Soccer Simulation Games

Kwon, Ki-Duk;Kim, In-Cheol
- Proceedings of the Korea Society for Simulation Conference / Korea Society for Simulation, 2001 The Seoul International Simulation Conference / pp. 321-324 / 2001
The robot soccer simulation game is a dynamic multi-agent environment. In this paper we suggest a new reinforcement learning approach to each agent's dynamic positioning in such a dynamic environment. Reinforcement learning is the machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. Reinforcement learning therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms like Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless, such an algorithm can learn the optimal policy if the agent can visit every state-action pair infinitely often. However, the biggest problem of monolithic reinforcement learning is that its straightforward applications do not scale up to more complex environments because the state space becomes intractably large. To address this problem, we suggest Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement of the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly by assigning a different weight to each module according to its contribution to rewards. Therefore, in addition to resolving the problem of the large state space effectively, AMMQL can show higher adaptability to environmental changes than pure MQL. This paper introduces the concept of AMMQL and presents details of its application to dynamic positioning of robot soccer agents.
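The difference between fixed and weighted combination of modules can be sketched as follows. This is a minimal illustration, not the paper's AMMQL: the abstract does not say how the weights are derived from each module's contribution to rewards, so the sketch simply takes the weights as an input. The class `QModule`, the function `combine`, and the toy states and actions are hypothetical names chosen for this example.

```python
from collections import defaultdict


class QModule:
    """One tabular Q-learning module that sees only part of the state."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.q = defaultdict(float)   # (sub_state, action) -> value

    def update(self, s, a, r, s_next):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        td_target = r + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (td_target - self.q[(s, a)])


def combine(modules, sub_states, weights):
    """Return the action with the largest weighted sum of module Q-values."""
    actions = modules[0].actions
    totals = {
        a: sum(w * m.q[(s, a)] for m, s, w in zip(modules, sub_states, weights))
        for a in actions
    }
    return max(actions, key=lambda a: totals[a])


# Two modules that disagree about the best action in their sub-states.
ball = QModule(["left", "right"])
mate = QModule(["left", "right"])
ball.q[("near", "left")] = 1.0
mate.q[("open", "right")] = 0.8

# Equal weights reproduce a fixed combination (plain sum of Q-values).
print(combine([ball, mate], ["near", "open"], [0.5, 0.5]))
# Unequal weights (assumed here, not learned) shift the decision.
print(combine([ball, mate], ["near", "open"], [0.2, 0.8]))
```

With equal weights the first module's preference wins; once the second module is weighted more heavily, the combined choice follows it instead. An adaptive scheme would update those weights during learning rather than fix them by hand.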

Transfer Learning Technique for Accelerating Learning of Reinforcement Learning-Based Horizontal Pod Autoscaling Policy

장용현;유헌창;김성석
- KIPS Transactions on Computer and Communication Systems / Vol. 11, No. 4 / pp. 105-112 / 2022
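Recently, many studies have used reinforcement learning-based autoscaling to build autoscaling policies that adapt to environmental changes and serve specific goals. However, learning the policy of a reinforcement learning-based Horizontal Pod Autoscaler (HPA) in a real environment requires considerable cost and time, and relearning the policy from scratch in the real environment every time a service is deployed is impractical. In this paper, we implement a reinforcement learning-based HPA on Kubernetes and propose a transfer learning technique that uses queueing model-based simulation to accelerate learning of the HPA policy. Pre-training in simulation allows the policy to be learned from simulated experience without spending time and resources on learning in the real environment, and the transfer learning technique reduced cost by about 42.6% compared with not using it.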
https://doi.org/10.3745/KTCCS.2022.11.4.105
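As a rough illustration of how a queueing model can stand in for a real cluster during pre-training, the sketch below computes the mean response time of an M/M/c queue with the Erlang C formula, treating each pod as one server. The abstract does not say which queueing model the authors use, so the M/M/c choice, the function names, and the example rates are all assumptions.

```python
import math


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving request has to wait in an M/M/c queue."""
    offered_load = arrival_rate / service_rate
    utilisation = offered_load / servers
    if utilisation >= 1.0:
        return 1.0  # overloaded: every request waits
    head = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    tail = offered_load ** servers / (
        math.factorial(servers) * (1.0 - utilisation)
    )
    return tail / (head + tail)


def mean_response_time(arrival_rate, service_rate, servers):
    """Mean time in system (waiting plus service) for an M/M/c queue."""
    if arrival_rate >= servers * service_rate:
        return math.inf  # the queue grows without bound
    wait = erlang_c(arrival_rate, service_rate, servers) / (
        servers * service_rate - arrival_rate
    )
    return wait + 1.0 / service_rate


# Hypothetical workload: 3 requests/s, each pod serves 1 request/s.
for pods in (4, 5, 6):
    print(pods, mean_response_time(3.0, 1.0, pods))
```

A simulated environment built on such a model could score a scaling action by the predicted response time and the number of pods, which is far cheaper than measuring both on a live cluster; how the paper actually defines its reward is not stated in the abstract.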

Locomotion of Crawling Robots Based on Reinforcement Learning and Meta-Learning

문영준;정규백;박주영
- Proceedings of the Korean Institute of Intelligent Systems Conference / Korean Institute of Intelligent Systems, 2007 Fall Conference Proceedings / pp. 395-398 / 2007
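Recently, interest in reinforcement learning has grown considerably in the field of artificial intelligence, and it is being applied in many related areas. This paper considers using a meta-learning technique to determine key parameters when handling the locomotion of Kimura's crawling robot with the RLS-NAC algorithm, an actor-critic reinforcement learning method.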

A reinforcement learning-based method for the cooperative control of mobile robots

김재희;조재승;권인소
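- Proceedings of the Institute of Control, Robotics and Systems Conference / Institute of Control, Robotics and Systems, 1997 Korea Automatic Control Conference Proceedings; Korea Electric Power Corporation Seoul Training Center; 17-18 Oct. 1997 / pp. 648-651 / 1997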
This paper proposes methods for the cooperative control of multiple mobile robots and constructs a robotic soccer system in which cooperation is implemented as a pass play between two robots. To play a soccer game, elementary actions such as shooting and moving have been designed, and Q-learning, one of the popular methods for reinforcement learning, is used to determine which actions to take. In simulation, learning succeeds for deliberately chosen initial arrangements of the ball and robots, so that cooperative work can be accomplished.

799 search results, processing time 0.036 seconds
