• Title/Summary/Keyword: reinforcement algorithms

Reinforcement Learning using Propagation of Goal-State-Value (목표상태 값 전파를 이용한 강화 학습)

  • Kim, Byeong-Cheon; Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1303-1311 / 1999
  • Reinforcement learning algorithms such as Q-learning, TD(0)-learning, and TD(λ)-learning have been proposed for learning in dynamic environments. However, most of them learn very slowly because the reinforcement value is given only when the agent reaches the goal state. In this paper, we propose a reinforcement learning method that approaches the goal state quickly in maze environments. The proposed method divides learning into global learning and local learning. Global learning uses the replacing eligibility trace method to search for the goal state. Local learning propagates the goal-state value found through global learning to neighboring states and then searches for the goal state from those neighboring states. Experiments show that the proposed method finds an optimal solution faster than other reinforcement learning methods such as Q-learning, TD(0)-learning, and TD(λ)-learning.

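The replacing eligibility trace mentioned in the abstract above can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the state-value form, the function name `td_lambda_step`, the parameter values, and the four-state corridor are all assumptions chosen for illustration.

```python
def td_lambda_step(values, traces, state, reward, next_state, done,
                   alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one TD(lambda) update with replacing eligibility traces, in place."""
    # One-step TD target; a terminal transition has no bootstrap term.
    target = reward + (0.0 if done else gamma * values[next_state])
    td_error = target - values[state]
    # Replacing trace: reset the visited state's trace to 1 instead of adding 1.
    traces[state] = 1.0
    for s in range(len(values)):
        # Every recently visited state shares the TD error via its trace.
        values[s] += alpha * td_error * traces[s]
        # Traces fade by gamma * lambda after each step.
        traces[s] *= gamma * lam
    return td_error


# Hypothetical usage: one pass along a 4-state corridor, reward 1 at the goal.
values = [0.0] * 4
traces = [0.0] * 4
for s in range(3):
    done = (s + 1 == 3)
    td_lambda_step(values, traces, s, 1.0 if done else 0.0, s + 1, done)
```

After this single pass, states nearer the goal hold larger values, which shows how one goal reward reaches earlier states without waiting for further episodes.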

Path Planning of Unmanned Aerial Vehicle based Reinforcement Learning using Deep Q Network under Simulated Environment (시뮬레이션 환경에서의 DQN을 이용한 강화 학습 기반의 무인항공기 경로 계획)

  • Lee, Keun Hyoung; Kim, Shin Dug
    • Journal of the Semiconductor & Display Technology / v.16 no.3 / pp.127-130 / 2017
  • In this research, we present a path planning method for the autonomous flight of unmanned aerial vehicles (UAVs) through reinforcement learning in a simulated environment. We design a simulator for UAV reinforcement learning and implement an interface that makes the Deep Q-Network (DQN) compatible with the simulator. We perform reinforcement learning through the simulator and the DQN, using the Q-learning algorithm, which is one kind of reinforcement learning algorithm. Through experiments, we verify the performance of the combined DQN and simulator. Finally, we evaluate the learning results and suggest a path planning strategy based on reinforcement learning.


Development of Semi-Active Control Algorithm Using Deep Q-Network (Deep Q-Network를 이용한 준능동 제어알고리즘 개발)

  • Kim, Hyun-Su; Kang, Joo-Won
    • Journal of Korean Association for Spatial Structures / v.21 no.1 / pp.79-86 / 2021
  • The control performance of a smart tuned mass damper (TMD) depends mainly on its control algorithm. Many control strategies have been proposed for semi-active control devices, and machine learning has recently begun to be applied to the development of vibration control algorithms. In this study, reinforcement learning, one of several machine learning techniques, was employed to develop a semi-active control algorithm for a smart TMD built with a magnetorheological (MR) damper. An 11-story building structure with a smart TMD was selected to construct the reinforcement learning environment, and a time history analysis of this example structure under earthquake excitation was conducted within the reinforcement learning procedure. Among the various reinforcement learning algorithms, the Deep Q-network (DQN) was used to build the learning agent. The command voltage sent to the MR damper is determined by the action produced by the DQN. Parametric studies on the hyper-parameters of the DQN were performed by numerical simulation. After an appropriate number of training iterations with proper hyper-parameters, a DQN model for controlling the seismic responses of the example structure with the smart TMD was developed. The developed DQN model can effectively control the smart TMD to reduce the seismic responses of the example structure.

Improved Deep Q-Network Algorithm Using Self-Imitation Learning (Self-Imitation Learning을 이용한 개선된 Deep Q-Network 알고리즘)

  • Sunwoo, Yung-Min; Lee, Won-Chang
    • Journal of IKEEE / v.25 no.4 / pp.644-649 / 2021
  • Self-Imitation Learning is a simple off-policy actor-critic algorithm that helps an agent find an optimal policy by exploiting its past good experiences. When Self-Imitation Learning is combined with reinforcement learning algorithms that have an actor-critic architecture, it improves performance in various game environments. However, its applications have been limited to algorithms with an actor-critic architecture. In this paper, we propose a method of applying Self-Imitation Learning to Deep Q-Network, a value-based deep reinforcement learning algorithm, and train it in various game environments. By comparing the training results of the proposed algorithm with those of ordinary Deep Q-Network, we show that Self-Imitation Learning can improve the performance of Deep Q-Network.

Cloud Task Scheduling Based on Proximal Policy Optimization Algorithm for Lowering Energy Consumption of Data Center

  • Yang, Yongquan; He, Cuihua; Yin, Bo; Wei, Zhiqiang; Hong, Bowei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1877-1891 / 2022
  • As a part of cloud computing technology, cloud task scheduling algorithms have an important influence on cloud computing in data centers. In our earlier work, we proposed DeepEnergyJS, which was designed based on the original version of the policy gradient and reinforcement learning algorithm, and verified its effectiveness through simulation experiments. In this study, we use the Proximal Policy Optimization (PPO) algorithm to update DeepEnergyJS to DeepEnergyJSV2.0. First, we verify the convergence of the PPO algorithm on the Alibaba Cluster Data V2018 dataset. Then we compare it with the reinforcement learning algorithm in terms of convergence rate, converged value, and stability. The results indicate that PPO performs better on the training and test data sets than the reinforcement learning algorithm, as well as general heuristic algorithms such as First Fit, Random, and Tetris. DeepEnergyJSV2.0 achieves about 7.814% better energy efficiency than DeepEnergyJS.
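For readers unfamiliar with PPO, the clipped surrogate objective at its core can be sketched as follows. This is a generic illustration and not the DeepEnergyJSV2.0 code: the function name `ppo_clip_objective`, the list-based inputs, and the clipping range of 0.2 are assumptions.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the mean PPO clipped surrogate objective over a batch."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Taking the minimum removes any incentive to move the policy far from the one that collected the data, which is the usual explanation for PPO's more stable training compared with a plain policy gradient update.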

Mapless Navigation with Distributional Reinforcement Learning (분포형 강화학습을 활용한 맵리스 네비게이션)

  • Van Manh Tran; Gon-Woo Kim
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.92-97 / 2024
  • This paper studies the distributional perspective on reinforcement learning for application in mobile robot navigation. Mapless navigation algorithms based on deep reinforcement learning have shown promising performance and high applicability. Because real-life interactions are expensive, trial-and-error simulation in virtual environments is encouraged for implementing autonomous navigation. Nevertheless, applying a deep reinforcement learning model to real tasks is challenging because the data collected in virtual simulation differ from those of the physical world, which leads to high-risk behavior and a high collision rate. In this paper, we present a distributional reinforcement learning architecture for mapless navigation of a mobile robot that adapts to the uncertainty of environmental change. The experimental results indicate the superior performance of the distributional soft actor-critic compared to conventional methods.

Fuzzy Inference-based Reinforcement Learning for Recurrent Neural Network (퍼지 추론에 의한 리커런트 뉴럴 네트워크 강화학습)

  • 전효병;이동욱;김대준;심귀보
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1997.11a / pp.120-123 / 1997
  • In this paper, we propose a fuzzy inference-based reinforcement learning algorithm. By using fuzzy inference in reinforcement learning, we offer a learning scheme that more closely resembles the psychological learning of higher animals, including humans. The proposed method mirrors the way linguistic and conceptual expressions affect human behavior by inferring the reinforcement signal from fuzzy rules. The intervals of the fuzzy membership functions are optimized by genetic algorithms. A recurrent state is also used so that actions can be chosen in a dynamic environment. We show the validity of the proposed learning algorithm by applying it to the inverted pendulum control problem.


Application of reinforcement learning to fire suppression system of an autonomous ship in irregular waves

  • Lee, Eun-Joo; Ruy, Won-Sun; Seo, Jeonghwa
    • International Journal of Naval Architecture and Ocean Engineering / v.12 no.1 / pp.910-917 / 2020
  • In fire suppression, continuous delivery of water or foam to the fire source is essential. The present study concerns fire suppression in a ship under sea conditions and introduces a reinforcement learning technique for aiming a fire-extinguishing nozzle that operates in a ship compartment undergoing six-degrees-of-freedom motion in irregular waves. The physical modeling of the water jet and the compartment motion was provided by the Unity 3D engine. In the reinforcement learning, the change of the nozzle angle during the scenario was set as the action, while the reward was proportional to the ratio of water particles delivered to the fire source area. With this setup, an optimal control of nozzle aiming for continuous delivery of the water jet could be derived. Various reinforcement learning algorithms were tested, and proximal policy optimization was selected as the best performing.
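The reward described in the abstract above, proportional to the share of water particles that reach the fire source area, could be written along the following lines. The function name `delivery_reward`, the `scale` constant, and the handling of a step with no emitted particles are assumptions, not details from the paper.

```python
def delivery_reward(particles_on_target, particles_emitted, scale=1.0):
    """Reward proportional to the fraction of emitted particles that hit the target."""
    if particles_emitted <= 0:
        # Assumption: a step with no emitted particles earns no reward.
        return 0.0
    return scale * particles_on_target / particles_emitted


# Hypothetical usage: 30 of 120 particles land on the fire source area.
example_reward = delivery_reward(30, 120)
```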

Performance Analysis of Deep Reinforcement Learning for Crop Yield Prediction (작물 생산량 예측을 위한 심층강화학습 성능 분석)

  • Ohnmar Khin; Sung-Keun Lee
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.18 no.1 / pp.99-106 / 2023
  • Recently, many studies on crop yield prediction using deep learning technology have been conducted. These algorithms have difficulty constructing a linear map between input data sets and crop prediction results. Furthermore, implementing these algorithms depends strongly on the rate of acquired attributes. Deep reinforcement learning can overcome these limitations. This paper analyzes the performance of DQN, Double DQN, and Dueling DQN for improving crop yield prediction. The DQN algorithm suffers from the overestimation problem, whereas Double DQN reduces the overestimation and yields better results. The proposed models achieve this by reducing prediction error and increasing prediction accuracy.
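The difference between the DQN and Double DQN targets that underlies the overestimation issue can be sketched as follows. This is a generic illustration rather than the paper's code: the function names, the plain-list Q-values, and the discount factor are assumptions.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state from the two networks.
q_online = [1.0, 2.0, 0.5]
q_target = [1.5, 1.0, 3.0]
plain = dqn_target(0.0, q_target, gamma=0.5)
double = double_dqn_target(0.0, q_online, q_target, gamma=0.5)
```

Because the Double DQN target evaluates an action chosen by a different network, it can never exceed the plain DQN target computed from the same target-network values, which is how it curbs overestimation.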

Deep Reinforcement Learning-Based C-V2X Distributed Congestion Control for Real-Time Vehicle Density Response (실시간 차량 밀도에 대응하는 심층강화학습 기반 C-V2X 분산혼잡제어)

  • Byeong Cheol Jeon; Woo Yoel Yang; Han-Shin Jo
    • Journal of IKEEE / v.27 no.4 / pp.379-385 / 2023
  • Distributed congestion control (DCC) is a technology that mitigates channel congestion and improves communication performance in high-density vehicular networks. Traditional DCC techniques reduce channel congestion without considering quality of service (QoS) requirements. Such a design can lead to excessive DCC actions, potentially degrading other aspects of QoS. To address this issue, we propose a deep reinforcement learning-based QoS-adaptive DCC algorithm. The simulation was conducted with a quasi-real environment simulator that generates dynamic vehicular densities for evaluation. The simulation results indicate that the proposed DCC algorithm achieves results closer to the targeted QoS than existing DCC algorithms.