Search | Korea Science

DQN Reinforcement Learning for Acrobot in OpenAI Gym Environment (OpenAI Gym 환경의 Acrobot에 대한 DQN 강화학습)

Myung-Ju Kang
- Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.35-36 / 2023
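In this paper, we train Acrobot-v1, provided by the OpenAI Gym environment, with DQN (Deep Q-Networks) reinforcement learning and compare the performance of the activation functions used. The activation functions applied to DQN reinforcement learning are ReLU, Leaky ReLU, ELU, SELU, and softplus. In the experiments, the Leaky ReLU activation function gave the highest reward on average, while the maximum reward was obtained with the SELU activation function.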
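As a reference for the five activation functions named in the abstract above, here is a minimal scalar sketch in plain Python. The abstract does not state the Leaky ReLU slope, the ELU alpha, or the framework used, so the defaults below (`slope=0.01`, `alpha=1.0`) are assumptions chosen to match common library defaults; the SELU constants are the standard published values. The names `ACTIVATIONS`, `SELU_ALPHA`, and `SELU_SCALE` are illustrative only.

```python
import math

# Standard SELU constants (self-normalizing networks); not taken from the paper.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x: float) -> float:
    """max(0, x)."""
    return max(0.0, x)


def leaky_relu(x: float, slope: float = 0.01) -> float:
    """x for x > 0, otherwise slope * x. The slope value is an assumption."""
    return x if x > 0 else slope * x


def elu(x: float, alpha: float = 1.0) -> float:
    """x for x > 0, otherwise alpha * (exp(x) - 1). The alpha value is an assumption."""
    return x if x > 0 else alpha * math.expm1(x)


def selu(x: float) -> float:
    """Scaled ELU with the fixed SELU constants."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * math.expm1(x))


def softplus(x: float) -> float:
    """log(1 + exp(x)), written in a form that avoids overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Lookup table so a training script could select an activation by name.
ACTIVATIONS = {
    "relu": relu,
    "leaky_relu": leaky_relu,
    "elu": elu,
    "selu": selu,
    "softplus": softplus,
}

if __name__ == "__main__":
    for name, fn in ACTIVATIONS.items():
        print(name, [round(fn(v), 4) for v in (-2.0, 0.0, 2.0)])
```

The functions differ mainly in how they treat negative inputs: ReLU zeroes them, Leaky ReLU keeps a small linear slope, ELU and SELU saturate smoothly, and softplus is a smooth approximation of ReLU everywhere.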

Development of reinforcement learning algorithm with continuous action selection for acrobot (Acrobot 제어를 위한 강화학습에서의 연속적인 행위 선택 알고리즘의 개발)

Seo, Sung-Hwan;Jang, Si-Young;Suh, Il-Hong
- Proceedings of the KIEE Conference / 2003.07d / pp.2387-2389 / 2003
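The acrobot is a representative nonlinear, underactuated system, and its control objectives are swing-up control and balancing control. Much prior research has addressed these two objectives. However, those methods switch between two independent controllers according to the state of the acrobot, which makes it difficult to choose the switching criterion and lengthens the total learning time needed to achieve both objectives. To improve on this, we studied a method that implements a single controller able to handle both control objectives at once, based on our earlier Region-based Q-Learning [11], which can approximate continuous state spaces. The usefulness of the proposed method was verified through experiments on a physical acrobot that we built.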

Adaptive Robust Swing-up and Balancing Control of Acrobot using a Fuzzy Disturbance Observer (퍼지 외란 관측기법을 이용한 아크로봇의 적응형 강인 스윙업 및 밸런싱제어)

Jeong, Seongchan;Lee, Sanghyob;Hong, Young-Dae;Chwa, Dongkyoung
- Journal of Institute of Control, Robotics and Systems / v.22 no.5 / pp.346-352 / 2016
This paper proposes an adaptive robust control method for an acrobot system in the presence of input disturbance. The acrobot system is a typical example of an underactuated system with complex nonlinearity and strong dynamic coupling. Disturbance can also cause a limit cycle to appear in the acrobot system around the desired unstable equilibrium point. To minimize the effect of the disturbance, we apply a fuzzy disturbance estimation method to the swing-up and balancing control of the acrobot system. Both the disturbance observer and the controller for the acrobot system are designed and verified through mathematical proof and simulations.
https://doi.org/10.5302/J.ICROS.2016.16.0025

Swing-up Control and Singular Problem of an Acrobot System

Nam, Taek-Kun;Tsutomu Mita
- Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2001.10a / pp.104.5-104 / 2001
In this paper, we address the swing-up control and the singular problem of an acrobot. We derive a serial system equation from the acceleration constraint that arises because there is no actuator on the first joint. Based on the serial system representation, we propose a swing-up and stabilization control algorithm to move the acrobot from its downward equilibrium to its inverted equilibrium position. Simulation results are also provided to show the effectiveness of the proposed control strategy.

Credit-Assigned-CMAC-based Reinforcement Learning with Application to the Acrobot Swing Up Control Problem (Acrobot Swing Up Control을 위한 Credit-Assigned-CMAC-based 강화학습)

Jang, Si-Young;Shin, Yeon-Yong;Seo, Seung-Hwan;Suh, Il-Hong
- The Transactions of the Korean Institute of Electrical Engineers D / v.53 no.7 / pp.517-524 / 2004
For real-world applications of reinforcement learning techniques, function approximation or generalization will be required to avoid the curse of dimensionality. For this, an improved function approximation-based reinforcement learning method is proposed to speed up convergence by using CA-CMAC (Credit-Assigned Cerebellar Model Articulation Controller). To show that our proposed CACRL (CA-CMAC-based Reinforcement Learning) performs better than CRL (CMAC-based Reinforcement Learning), computer simulation and experiment results are presented for a swing-up control problem of an acrobot.

Credit-Assigned-CMAC-based Reinforcement Learning with application to the Acrobot Swing Up Control Problem (Acrobot Swing Up 제어를 위한 Credit-Assigned-CMAC 기반의 강화학습)

Shin, Yeon-Yong;Jang, Si-Young;Seo, Seung-Hwan;Suh, Il-Hong
- Proceedings of the KIEE Conference / 2003.11c / pp.621-624 / 2003
For real-world applications of reinforcement learning techniques, function approximation or generalization will be required to avoid the curse of dimensionality. For this, an improved function approximation-based reinforcement learning method is proposed to speed up convergence by using CA-CMAC (Credit-Assigned Cerebellar Model Articulation Controller). To show that our proposed CACRL (CA-CMAC-based Reinforcement Learning) performs better than CRL (CMAC-based Reinforcement Learning), computer simulation results are presented for a swing-up control problem of an acrobot.

Design of Integral Sliding Mode Control for Underactuated Mechanical Systems (부족구동 기계시스템을 위한 적분 슬라이딩 모드 제어기 설계)

Yoo, Dong Sang
- Journal of the Korean Institute of Intelligent Systems / v.23 no.3 / pp.208-213 / 2013
The problem of finding control laws for underactuated systems has attracted growing attention because these systems have fewer actuators than degrees of freedom to be controlled. Sliding mode control, based on the theory of variable structure systems, is a robust methodology for controlling nonlinear systems. In this paper, a sliding mode control with an integral sliding function is proposed, and asymptotic stability is proved in the sense of Lyapunov for underactuated systems. To verify the effectiveness of the proposed control, computer simulations are performed for an acrobot, which is a representative underactuated system. The acrobot dynamics and the proposed control are implemented using MathWorks' Simulink/Simscape. Simulations demonstrate the effectiveness and usefulness of the proposed control.
https://doi.org/10.5391/JKIIS.2013.23.3.208

A Sufficient Condition for the Feedback Quasilinearization of Control Mechanical Systems

Chang, Dong Eui;Song, Seong-Ho;Kim, Jeom Keun
- Journal of Electrical Engineering and Technology / v.11 no.3 / pp.741-745 / 2016
We derive a sufficient condition for feedback quasilinearizability of control mechanical systems and apply it to show the feedback quasilinearizability of the Acrobot system.
https://doi.org/10.5370/JEET.2016.11.3.741

Exponential Stabilization of a Class of Underactuated Mechanical Systems using Dynamic Surface Control

Qaiser, Nadeem;Iqbal, Naeem;Hussain, Amir;Qaiser, Naeem
- International Journal of Control, Automation, and Systems / v.5 no.5 / pp.547-558 / 2007
This paper proposes a simpler solution to the stabilization problem of a special class of nonlinear underactuated mechanical systems, which includes widely studied benchmark systems such as the Inertia Wheel Pendulum, TORA, and Acrobot. The complex internal dynamics and lack of exact feedback linearizability of these systems make the design of a control law a challenging task. Stabilization of these systems has previously been achieved using Energy Shaping with damping injection and using the Backstepping technique. The former results in hybrid or switching architectures that complicate stability analysis, whereas backstepping sometimes requires closed-form explicit solutions of the highly nonlinear equations resulting from partial feedback linearization. Backstepping also exhibits the explosion-of-terms phenomenon, resulting in a highly complicated control law. Exploiting the recently introduced Dynamic Surface Control technique and the control Lyapunov function method, a novel nonlinear controller design is presented as a solution to these problems. The stability of the closed-loop system is analyzed by exploiting its two-time-scale nature and applying concepts from Singular Perturbation Theory. The design procedure is shown to be simpler and more intuitive than existing designs. The design has been applied to important benchmark systems belonging to the class, demonstrating the simplicity of the controller design. Advantages over conventional Energy Shaping and Backstepping controllers are analyzed theoretically, and performance is verified using numerical simulations.
