• Title/Summary/Keyword: Q-learning system

Search Result 142, Processing Time 0.041 seconds

Efficient Reinforcement Learning System in Multi-Agent Environment (다중 에이전트 환경에서 효율적인 강화학습 시스템)

  • Hong, Jung-Hwan;Kang, Jin-Beom;Choi, Joong-Min
    • Proceedings of the Korean Information Science Society Conference / 2006.10b / pp.393-396 / 2006
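  • Reinforcement learning is a method of learning a strategy for achieving a goal through interaction with the environment, and it is widely used as a learning method for agents. In a multi-agent environment where agents can communicate with one another rather than act independently, if agents can search for and share each other's learning information, learning will proceed faster than with conventional reinforcement learning even when the environment is large. However, research on learning methods in multi-agent environments is still insufficient, and a variety of methods for searching and sharing learning information are needed. This paper proposes the ED+Q-Learning system, which improves learning speed by comparing the edit distance between the learning information of a target agent and that of neighboring agents to find a similar agent, and then using that agent's information as prior information for reinforcement learning.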
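
The similarity search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each agent's learning information is summarized as a sequence of greedy actions, and the names `edit_distance`, `most_similar_agent`, and the example agents are hypothetical.

```python
from typing import Dict, Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def most_similar_agent(target: Sequence, neighbors: Dict[str, Sequence]) -> str:
    """Return the name of the neighbor with the smallest edit distance to the target."""
    return min(neighbors, key=lambda name: edit_distance(target, neighbors[name]))


# Hypothetical encoding: one greedy action (U/D/L/R) per visited state.
target_policy = "UURRD"
neighbor_policies = {"agent_a": "LLDDU", "agent_b": "UURRR", "agent_c": "UDRD"}
best = most_similar_agent(target_policy, neighbor_policies)  # "agent_b" (distance 1)
```

Under this assumption, the Q-values of the selected neighbor would then be copied into the target agent as its starting point before ordinary Q-learning continues.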

Optimal Scheduling of Satellite Tracking Antenna of GNSS System (다중위성 추적 안테나의 위성추적 최적 스케쥴링)

  • Ahn, Chae-Ik;Shin, Ho-Hyun;Kim, You-Dan;Jung, Seong-Kyun;Lee, Sang-Uk;Kim, Jae-Hoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.36 no.7 / pp.666-673 / 2008
  • To construct an accurate satellite radio navigation system, efficient communication between each satellite and the ground station is very important. Through this communication, the orbit of each satellite can be corrected, and the information is used by the operator to analyze the satellite status. Since ground station resources are limited, the schedule of the antenna's azimuth and elevation angles should be optimized. In addition, a satellite in medium Earth orbit does not pass over the same point on the Earth's surface because of the Earth's rotation. Therefore, the antenna pass schedule must be updated at the proper moment. In this study, Q-learning, a form of model-free reinforcement learning, and a genetic algorithm are considered to find the optimal antenna schedule. To verify the optimality of the solution, numerical simulations are conducted.
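
For readers unfamiliar with the model-free update this abstract refers to, the standard tabular Q-learning rule can be sketched as below. The state and action encoding (`track_sat_1`, `track_sat_2`) and the reward are hypothetical placeholders; the abstract does not say how the antenna scheduling problem was formulated.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: two actions, all Q-values start at zero.
q = defaultdict(float)
actions = ["track_sat_1", "track_sat_2"]
value = q_update(q, state=0, action="track_sat_1", reward=1.0,
                 next_state=1, actions=actions)
```

No transition model appears in the update, which is what "model-free" means here: only sampled rewards and next states are needed.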

Reinforcement Learning Based Energy Control Method for Smart Energy Buildings Integrated with V2G Station (강화학습 기반 V2G Station 연계형 스마트 에너지 빌딩 전력 제어 기법)

  • Seok-Min Choi;Sun-Yong Kim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.3 / pp.515-522 / 2024
  • Energy consumption is steadily increasing, and buildings in particular account for more than 20% of total energy consumption worldwide. To manage building energy consumption cost-effectively, many research groups have recently focused on smart Building Energy Management Systems (BEMS) and are extending this research by applying artificial intelligence (AI). In this paper, we propose a reinforcement learning-based energy control method for smart energy buildings integrated with a V2G station, which aims to reduce the total energy cost of the building. Performance evaluation results based on energy consumption data measured in a real-world building show that the proposed method can gradually reduce the total energy cost of the building as the learning process progresses.

Actor-Critic Reinforcement Learning System with Time-Varying Parameters

  • Obayashi, Masanao;Umesako, Kosuke;Oda, Tazusa;Kobayashi, Kunikazu;Kuremoto, Takashi
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2003.10a / pp.138-141 / 2003
  • Recently, reinforcement learning has attracted the attention of many researchers because of its simple and flexible ability to learn in any environment. Many reinforcement learning methods have been proposed so far, such as Q-learning, actor-critic, and the stochastic gradient ascent method. A reinforcement learning system can adapt to changes in the environment because it interacts with it. However, when the environment changes periodically, the system does not adapt well. In this paper, we propose a reinforcement learning system that can adapt to periodic changes in the environment by introducing time-varying adjustable parameters. A simulation study of a maze problem with an aisle that opens and closes periodically shows that the proposed method works well, whereas the conventional method with constant adjustable parameters does not work well in such an environment.

Development of Optimal Design Technique of RC Beam using Multi-Agent Reinforcement Learning (다중 에이전트 강화학습을 이용한 RC보 최적설계 기술개발)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures / v.23 no.2 / pp.29-36 / 2023
  • Reinforcement learning (RL) is widely applied in various engineering fields. In particular, RL has performed well on control problems such as vehicles, robotics, and active structural control systems. However, little research on applying RL to optimal structural design has been conducted to date. In this study, the feasibility of applying RL to the structural design of a reinforced concrete (RC) beam was investigated. An RC beam structural design problem introduced in a previous study was used for comparison. The deep Q-network (DQN) is a well-known RL algorithm that performs well in discrete action spaces, and thus it was used in this study. The action of a DQN agent must represent the design variables of the RC beam. However, the RC beam has too many design variables to be represented by the action of a conventional DQN. To solve this problem, a multi-agent DQN was used. For a more effective reinforcement learning process, DDQN (Double DQN, based on double Q-learning), an advanced version of the conventional DQN, was employed. The multi-agent DDQN was trained to produce optimal structural designs of RC beams that satisfy American Concrete Institute (ACI) 318 without any hand-labeled dataset. Five DDQN agents provide actions for beam width, beam depth, main rebar size, number of main rebars, and shear stirrup size, respectively. The five DDQN agents were trained for 10,000 episodes, and the performance of the multi-agent DDQN was evaluated on 100 test design cases. This study shows that the multi-agent DDQN algorithm can successfully provide structural design results for RC beams.
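
The difference between DQN and Double DQN mentioned in this abstract lies in how the training target is computed. A minimal sketch of the generally known Double DQN target follows; the function name and the example Q-values are hypothetical, and the paper's networks, reward, and action encoding are not given in the abstract.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward  # no bootstrapping from a terminal state
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for the next state, one entry per discrete action.
q_online_next = [0.2, 0.8, 0.5]
q_target_next = [0.3, 0.4, 0.9]
target = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
# The online network picks action 1, so the target is 1.0 + 0.9 * 0.4.
# A conventional DQN would instead use max(q_target_next) = 0.9.
```

Separating action selection from action evaluation is what reduces the overestimation of Q-values that the max operator causes in a conventional DQN.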

The Development of an Intelligent Home Energy Management System Integrated with a Vehicle-to-Home Unit using a Reinforcement Learning Approach

  • Ohoud Almughram;Sami Ben Slama;Bassam Zafar
    • International Journal of Computer Science & Network Security / v.24 no.4 / pp.87-106 / 2024
  • Vehicle-to-Home (V2H) and Home Centralized Photovoltaic (HCPV) systems can address various energy storage issues and enhance demand response programs. Renewable energy sources, such as solar panels and wind turbines, help address the energy gap. However, no energy management system is currently available to regulate the uncertainty of renewable energy sources, electric vehicles, and appliance consumption within a smart microgrid. Therefore, this study investigated the impact of solar photovoltaic (PV) panels, electric vehicles, and Micro-Grid (MG) storage on maximum solar radiation hours. Several Deep Learning (DL) algorithms were applied to account for the uncertainty. Moreover, a Reinforcement Learning HCPV (RL-HCPV) algorithm was created for efficient real-time energy scheduling decisions. The proposed algorithm managed the energy demand between PV solar energy generation and vehicle energy storage. RL-HCPV was modeled under several constraints to meet household electricity demand in sunny and cloudy weather. Simulations demonstrated how the proposed RL-HCPV system could efficiently handle demand response and how V2H can help smooth the appliance load profile and reduce power consumption costs with sustainable power generation. The results demonstrated the advantages of using RL and V2H as a potential storage technology for smart buildings.

Opportunistic Spectrum Access with Discrete Feedback in Unknown and Dynamic Environment: A Multi-agent Learning Approach

  • Gao, Zhan;Chen, Junhong;Xu, Yuhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.10 / pp.3867-3886 / 2015
  • This article investigates the problem of opportunistic spectrum access in a dynamic environment in which the signal-to-noise ratio (SNR) is time-varying. Unlike existing work on continuous feedback, we consider more practical scenarios in which the transmitter receives an Acknowledgment (ACK) if the received SNR is larger than the required threshold, and a Non-Acknowledgment (NACK) otherwise. That is, the feedback is discrete. Several applications with different threshold values are also considered in this work. The channel selection problem is formulated as a non-cooperative game, which is then proved to be a potential game with at least one pure-strategy Nash equilibrium. Following this, a multi-agent Q-learning algorithm is proposed to converge to Nash equilibria of the game. Furthermore, opportunistic spectrum access with multiple discrete feedback levels is also investigated. Finally, the simulation results verify that the proposed multi-agent Q-learning algorithm is applicable to both binary feedback and multiple discrete feedback levels.
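
The binary feedback described in this abstract can be illustrated with a generic stateless Q-value update per channel. This sketch is not the algorithm from the paper, whose details the abstract does not give; the reward mapping (1 for ACK, 0 for NACK), the learning rate, and the function names are assumptions.

```python
def ack_feedback(snr_db, threshold_db):
    """Discrete feedback: 1 (ACK) if the received SNR exceeds the threshold, else 0 (NACK)."""
    return 1 if snr_db > threshold_db else 0


def update_channel_value(q_values, channel, feedback, alpha=0.5):
    """Move the chosen channel's value estimate toward the observed feedback."""
    q_values[channel] += alpha * (feedback - q_values[channel])
    return q_values


# Hypothetical run for one transmitter on channel 1 with a 10 dB threshold.
q_values = [0.0, 0.0, 0.0]
for snr_db in (12.0, 15.0):  # both samples exceed the threshold, so both are ACKs
    update_channel_value(q_values, 1, ack_feedback(snr_db, 10.0))
```

With only ACK or NACK available, each estimate tracks the recent success rate of its channel rather than the SNR itself.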

The Effect of Input Variables Clustering on the Characteristics of Ensemble Machine Learning Model for Water Quality Prediction (입력자료 군집화에 따른 앙상블 머신러닝 모형의 수질예측 특성 연구)

  • Park, Jungsu
    • Journal of Korean Society on Water Environment / v.37 no.5 / pp.335-343 / 2021
  • Water quality prediction is essential for the proper management of water supply systems. Increased suspended sediment concentration (SSC) affects water supply systems in various ways, such as raising treatment costs, and consequently there have been various efforts to develop models for predicting SSC. However, SSC is affected by both the natural and the anthropogenic environment, which makes it challenging to predict. Recently, advanced machine learning models have increasingly been used for water quality prediction. This study developed an ensemble machine learning model to predict SSC using the XGBoost (XGB) algorithm. The observed discharge (Q) and SSC at two field monitoring stations were used to develop the model. The input variables were clustered into two groups with low and high ranges of Q using the k-means clustering algorithm. Each group of data was then used separately to optimize XGB (Model 1). The model performance was compared with that of an XGB model trained on the entire data set (Model 2). The models were evaluated by the root mean squared error-observation standard deviation ratio (RSR) and the root mean squared error. The RSR values for Model 2 were 0.51 and 0.57 at the two monitoring stations, respectively, while performance improved to RSR values of 0.46 and 0.55, respectively, for Model 1.
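
The RSR metric used in this abstract is commonly defined as the root mean squared error divided by the standard deviation of the observations, so lower is better and 0 is a perfect fit. A minimal sketch under that common definition follows; the sample values are made up for illustration and are not data from the study.

```python
import math


def rsr(observed, predicted):
    """RMSE-observation standard deviation ratio:
    sqrt(sum((o - p)^2)) / sqrt(sum((o - mean(o))^2))."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_deviation = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(squared_error) / math.sqrt(squared_deviation)


# Made-up SSC values (observed vs. predicted), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 7.0]
score = rsr(observed, predicted)  # sqrt(4) / sqrt(10)
```

Because the error is normalized by the spread of the observations, RSR values from stations with different SSC ranges can be compared directly.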

Thrust and Propellant Mixture Ratio Control of Open Type Liquid Propellant Rocket Engine (개방형 액체추진제로켓엔진의 추력 및 혼합비 제어)

  • Jung, Young-Suk;Lee, Jung-Ho;Oh, Seung-Hyub
    • Proceedings of the KSME Conference / 2007.05a / pp.1143-1148 / 2007
  • The liquid propellant rocket engine (LRE) is one of the important components for controlling the motion of a rocket. To operate the rocket within the error bounds of the planned trajectory, it is necessary to control the thrust of the LRE according to the required thrust profile and to control the mixture ratio of the propellants fed into the combustor so that it stays constant. Controlling the thrust and the mixture ratio is not easy because the LRE components interfere with one another. In this study, a dynamic model of the LRE was constructed, and its dynamic characteristics were analyzed under PID control and under PID+Q-ILC (Iterative Learning Control with Quadratic Criterion) control. The analysis showed that the PID+Q-ILC control logic is more useful than the standard PID control system for LRE control.
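
The baseline controller named in this abstract is a standard PID loop. A minimal discrete-time sketch is shown below; the gains, the time step, and the thrust setpoint are hypothetical, and the Q-ILC extension and the engine model from the paper are not reproduced here.

```python
class PID:
    """Discrete PID controller: u = kp * e + ki * integral(e) + kd * de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                  # rectangular integration
        derivative = (error - self.prev_error) / self.dt  # backward difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and a normalized thrust setpoint of 1.0.
controller = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.1)
command = controller.step(setpoint=1.0, measurement=0.0)
```

A single PID loop like this acts on one error signal, whereas iterative learning control additionally reuses the error recorded over previous runs of the same profile.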

A strategic Q&A system for self-directed study (자기주도적 학습을 위한 전략형 Q&A 시스템)

  • Lee, Hae-Bok;Kim, Kap-Su
    • Journal of The Korean Association of Information Education / v.6 no.1 / pp.13-29 / 2002
  • The mathematics curriculum has been developed based on learners' levels and the difficulty of the content. Success in solving mathematics problems depends on completing the prerequisite learning. Thus, it is important to diagnose students beforehand. It is also important to develop students' problem-solving skills. In this paper, a Q&A system is proposed to help students learn various problem-solving skills in mathematics. Although the system currently applies to mathematics, it can be applied to other subjects as well.
