• Title/Summary/Keyword: DQnA

Power Trading System through the Prediction of Demand and Supply in Distributed Power System Based on Deep Reinforcement Learning (심층강화학습 기반 분산형 전력 시스템에서의 수요와 공급 예측을 통한 전력 거래시스템)

  • Lee, Seongwoo; Seon, Joonho; Kim, Soo-Hyun; Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.163-171 / 2021
  • In this paper, the energy transaction system is optimized by applying a resource-allocation algorithm and deep reinforcement learning to a distributed power system. The power demand and supply environment is predicted with deep reinforcement learning. We propose a system that pursues common interests in power trading and increases the efficiency of long-term power transactions amid the paradigm shift from conventional centralized power systems to distributed ones. For a realistic energy simulation model and environment, we construct the energy market by learning weather and monthly patterns with added Gaussian noise. In the simulation results, we confirm that the proposed power trading systems cooperate with each other, seek common interests, and increase profits over prolonged energy transactions.
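
As a rough illustration of the kind of simulation environment the abstract describes, the sketch below generates noisy demand and supply observations from a monthly pattern perturbed with Gaussian noise. The pattern values, noise scale, and class/field names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

class EnergyMarketEnv:
    """Minimal synthetic energy-market sketch: a monthly demand pattern
    perturbed with Gaussian noise, as described in the abstract.
    All numbers below are illustrative assumptions, not the paper's data."""

    # Hypothetical normalized monthly demand pattern (Jan..Dec).
    MONTHLY_DEMAND = np.array([1.2, 1.1, 1.0, 0.9, 0.8, 1.0,
                               1.3, 1.4, 1.1, 0.9, 1.0, 1.2])

    def __init__(self, noise_std=0.05, seed=0):
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)
        self.month = 0

    def step(self):
        """Return a noisy (demand, supply) observation for the current month."""
        base = self.MONTHLY_DEMAND[self.month]
        demand = base + self.rng.normal(0.0, self.noise_std)
        # Assume supply tracks demand with its own noise (illustrative).
        supply = base + self.rng.normal(0.0, self.noise_std)
        self.month = (self.month + 1) % 12
        return demand, supply

env = EnergyMarketEnv()
for _ in range(3):
    print(env.step())
```

An RL trading agent would observe these (demand, supply) pairs as part of its state and learn a bidding policy on top of them.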

Max-Mean N-step Temporal-Difference Learning Using Multi-Step Return (멀티-스텝 누적 보상을 활용한 Max-Mean N-Step 시간차 학습)

  • Hwang, Gyu-Young; Kim, Ju-Bong; Heo, Joo-Seong; Han, Youn-Hee
    • KIPS Transactions on Computer and Communication Systems / v.10 no.5 / pp.155-162 / 2021
  • n-step TD learning combines the Monte Carlo method and one-step TD learning. With an appropriately chosen n, n-step TD learning is known to outperform both the Monte Carlo method and one-step TD learning, but selecting the best value of n is difficult. To address this difficulty, and exploiting the facts that overestimation of Q can improve early learning and that all n-step returns take similar values when Q ≈ Q*, we propose a new learning target composed of the maximum and the mean of all k-step returns for 1 ≤ k ≤ n. Finally, in OpenAI Gym's Atari game environment, we compare the proposed algorithm with n-step TD learning and show that it is superior to the n-step TD learning algorithm.
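
The learning target the abstract describes can be made concrete with a short sketch: compute every k-step return for 1 ≤ k ≤ n and combine their maximum and mean. The exact composition of max and mean is not specified in the abstract, so the simple average used below is an assumption, as are the function and parameter names.

```python
import numpy as np

def max_mean_nstep_target(rewards, bootstrap_qs, gamma=0.99):
    """Sketch of a Max-Mean n-step TD target in the spirit of the abstract.

    rewards:      r_t .. r_{t+n-1}                       (length n)
    bootstrap_qs: max_a Q(s_{t+k}, a) for k = 1 .. n     (length n)

    Builds every k-step return
        G_k = sum_{i<k} gamma^i * r_{t+i} + gamma^k * Q_k
    and combines them. Weighting max and mean equally is an assumption;
    the paper defines the exact composition.
    """
    n = len(rewards)
    returns = []
    discounted = 0.0
    for k in range(n):
        discounted += (gamma ** k) * rewards[k]
        returns.append(discounted + (gamma ** (k + 1)) * bootstrap_qs[k])
    returns = np.array(returns)
    return 0.5 * (returns.max() + returns.mean())

# Toy usage with made-up rewards and bootstrap values:
print(max_mean_nstep_target([1.0, 0.0, 1.0], [2.0, 1.5, 1.0]))
```

Because the target averages over all k-step returns while also keeping their maximum, it avoids committing to a single hand-picked n, which is exactly the selection problem the paper sets out to remove.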

Decision Support System of Obstacle Avoidance for Mobile Vehicles (다양한 자율주행 이동체에 적용하기 위한 장애물 회피의사 결정 시스템 연구)

  • Kang, Byung-Jun; Kim, Jongwon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.6 / pp.639-645 / 2018
  • This paper develops a decision model that can be applied to autonomous vehicles and autonomous mobile vehicles. The developed modules have an independent configuration so they can be applied in various driving environments, and they run on a platform that operates them organically. Each module is designed to make lane-change decisions and to secure safety through reinforcement learning using deep learning techniques. For an autonomous mobile body changing its driving state, the next operation can be determined only if the speed-decision model (according to its functions) and the lane-change decision are correctly defined first. Moreover, even if every moving body on a general road were equipped with an autonomous driving function, it would be difficult to account for the interactions that can arise between mobile units under unexpected environmental changes. Considering these factors, we applied the decision model to the platform and studied the lane-change decision system for the platform's implementation. We used a modular learning method to reduce system complexity, shorten learning time, and allow model replacement.
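
To make the modular structure concrete, here is a minimal sketch of independent speed-decision and lane-change modules composed on a shared platform, so that any one module can be retrained or replaced without touching the others. The class names, thresholds, and rule-based decisions are placeholders; in the paper each module's policy is learned with deep reinforcement learning.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """Hypothetical minimal state a decision module would see."""
    ego_speed: float      # m/s
    lead_gap: float       # distance to vehicle ahead, m
    left_lane_free: bool
    right_lane_free: bool

class SpeedModule:
    """Independent speed-decision module (placeholder rule; in the
    paper this decision is learned via deep reinforcement learning)."""
    def decide(self, obs: Observation) -> str:
        return "decelerate" if obs.lead_gap < 20.0 else "keep_speed"

class LaneChangeModule:
    """Independent lane-change decision module (placeholder rule)."""
    def decide(self, obs: Observation) -> str:
        if obs.lead_gap < 20.0 and obs.left_lane_free:
            return "change_left"
        if obs.lead_gap < 20.0 and obs.right_lane_free:
            return "change_right"
        return "keep_lane"

class DecisionPlatform:
    """Platform composing the modules; each module can be swapped
    independently, mirroring the modular learning method described."""
    def __init__(self):
        self.modules = {"speed": SpeedModule(), "lane": LaneChangeModule()}

    def decide(self, obs: Observation) -> dict:
        return {name: m.decide(obs) for name, m in self.modules.items()}

platform = DecisionPlatform()
print(platform.decide(Observation(25.0, 15.0, True, False)))
```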