• Title/Summary/Keyword: Distributed Reinforcement Learning

Search Result 35, Processing Time 0.031 seconds

DYNAMIC ROUTE PLANNING BY Q-LEARNING -Cellular Automation Based Simulator and Control

  • Sano, Masaki;Jung, Si
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2001.10a
    • /
    • pp.24.2-24
    • /
    • 2001
  • In this paper, the authors present a row dynamic route planning by Q-learning. The proposed algorithm is executed in a cellular automation based traffic simulator, which is also newly created. In Vehicle Information and Communication System(VICS), which is an active field of Intelligent Transport System(ITS), information of traffic congestion is sent to each vehicle at real time. However, a centralized navigation system is not realistic to guide millions of vehicles in a megalopolis. Autonomous distributed systems should be more flexible and scalable, and also have a chance to focus on each vehicles demand. In such systems, each vehicle can search an own optimal route. We employ Q-learning of the reinforcement learning method to search an optimal or sub-optimal route, in which route drivers can avoid traffic congestions. We find some applications of the reinforcement learning in the "static" environment, but there are ...

  • PDF

Reinforcement Learning Based Evolution and Learning Algorithm for Cooperative Behavior of Swarm Robot System (군집 로봇의 협조 행동을 위한 강화 학습 기반의 진화 및 학습 알고리즘)

  • Seo, Sang-Wook;Kim, Ho-Duck;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.5
    • /
    • pp.591-597
    • /
    • 2007
  • In swarm robot systems, each robot must behaves by itself according to the its states and environments, and if necessary, must cooperates with other robots in order to carry out a given task. Therefore it is essential that each robot has both learning and evolution ability to adapt the dynamic environments. In this paper, the new polygon based Q-learning algorithm and distributed genetic algorithms are proposed for behavior learning and evolution of collective autonomous mobile robots. And by distributed genetic algorithm exchanging the chromosome acquired under different environments by communication each robot can improve its behavior ability Specially, in order to improve the performance of evolution, selective crossover using the characteristic of reinforcement learning is adopted in this paper. we verify the effectiveness of the proposed method by applying it to cooperative search problem.

A Reinforcement Learning Framework for Autonomous Cell Activation and Customized Energy-Efficient Resource Allocation in C-RANs

  • Sun, Guolin;Boateng, Gordon Owusu;Huang, Hu;Jiang, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.8
    • /
    • pp.3821-3841
    • /
    • 2019
  • Cloud radio access networks (C-RANs) have been regarded in recent times as a promising concept in future 5G technologies where all DSP processors are moved into a central base band unit (BBU) pool in the cloud, and distributed remote radio heads (RRHs) compress and forward received radio signals from mobile users to the BBUs through radio links. In such dynamic environment, automatic decision-making approaches, such as artificial intelligence based deep reinforcement learning (DRL), become imperative in designing new solutions. In this paper, we propose a generic framework of autonomous cell activation and customized physical resource allocation schemes for energy consumption and QoS optimization in wireless networks. We formulate the problem as fractional power control with bandwidth adaptation and full power control and bandwidth allocation models and set up a Q-learning model to satisfy the QoS requirements of users and to achieve low energy consumption with the minimum number of active RRHs under varying traffic demand and network densities. Extensive simulations are conducted to show the effectiveness of our proposed solution compared to existing schemes.

A slide reinforcement learning for the consensus of a multi-agents system (다중 에이전트 시스템의 컨센서스를 위한 슬라이딩 기법 강화학습)

  • Yang, Janghoon
    • Journal of Advanced Navigation Technology
    • /
    • v.26 no.4
    • /
    • pp.226-234
    • /
    • 2022
  • With advances in autonomous vehicles and networked control, there is a growing interest in the consensus control of a multi-agents system to control multi-agents with distributed control beyond the control of a single agent. Since consensus control is a distributed control, it is bound to have delay in a practical system. In addition, it is often difficult to have a very accurate mathematical model for a system. Even though a reinforcement learning (RL) method was developed to deal with these issues, it often experiences slow convergence in the presence of large uncertainties. Thus, we propose a slide RL which combines the sliding mode control with RL to be robust to the uncertainties. The structure of a sliding mode control is introduced to the action in RL while an auxiliary sliding variable is included in the state information. Numerical simulation results show that the slide RL provides comparable performance to the model-based consensus control in the presence of unknown time-varying delay and disturbance while outperforming existing state-of-the-art RL-based consensus algorithms.

Adaptive Success Rate-based Sensor Relocation for IoT Applications

  • Kim, Moonseong;Lee, Woochan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.9
    • /
    • pp.3120-3137
    • /
    • 2021
  • Small-sized IoT wireless sensing devices can be deployed with small aircraft such as drones, and the deployment of mobile IoT devices can be relocated to suit data collection with efficient relocation algorithms. However, the terrain may not be able to predict its shape. Mobile IoT devices suitable for these terrains are hopping devices that can move with jumps. So far, most hopping sensor relocation studies have made the unrealistic assumption that all hopping devices know the overall state of the entire network and each device's current state. Recent work has proposed the most realistic distributed network environment-based relocation algorithms that do not require sharing all information simultaneously. However, since the shortest path-based algorithm performs communication and movement requests with terminals, it is not suitable for an area where the distribution of obstacles is uneven. The proposed scheme applies a simple Monte Carlo method based on relay nodes selection random variables that reflect the obstacle distribution's characteristics to choose the best relay node as reinforcement learning, not specific relay nodes. Using the relay node selection random variable could significantly reduce the generation of additional messages that occur to select the shortest path. This paper's additional contribution is that the world's first distributed environment-based relocation protocol is proposed reflecting real-world physical devices' characteristics through the OMNeT++ simulator. We also reconstruct the three days-long disaster environment, and performance evaluation has been performed by applying the proposed protocol to the simulated real-world environment.

Multi-agent Coordination Strategy Using Reinforcement Learning (강화 학습을 이용한 다중 에이전트 조정 전략)

  • Kim, Su-Hyun;Kim, Byung-Cheon;Yoon, Byung-Joo
    • Annual Conference of KIPS
    • /
    • 2000.10a
    • /
    • pp.285-288
    • /
    • 2000
  • 본 논문에서는 다중 에이전트(multi-agent) 환경에서 에이전트들의 행동을 효율적으로 조정 (coordination)하기 위해 강화 학습(reinforcement learning)을 이용하였다. 제안된 방법은 각 에이전트가 목표(goal)와의 거리 관계(distance relationship)와 인접 에이전트들과의 공간 관계(spatial relationship)를 이용하였다. 그러므로 각 에이전트는 다른 에이전트와 충돌(collision) 현상이 발생하지 않으면서, 최적의 다음 상태를 선택할 수 있다. 또한, 상태 공간으로부터 입력되는 강화 값이 0과 1 사이의 값을 갖기 때문에 각 에이전트가 선택한 (상태, 행동) 쌍이 얼마나 좋은가를 나타낼 수 있다. 제안된 방법을 먹이 포획 문제(prey pursuit problem)에 적용한 결과 지역 제어(local control)나. 분산 제어(distributed control) 전략을 이용한 방법보다 여러 에이전트들의 행동을 효율적으로 조정할 수 있었으며, 매우 빠르게 먹이를 포획할 수 있음을 알 수 있었다.

  • PDF

Card Battle Game Agent Based on Reinforcement Learning with Play Level Control (플레이 수준 조절이 가능한 강화학습 기반 카드형 대전 게임 에이전트)

  • Yong Cheol Lee;Chill woo Lee
    • Smart Media Journal
    • /
    • v.13 no.2
    • /
    • pp.32-43
    • /
    • 2024
  • Game agents which are behavioral agent for game playing are a crucial component of game satisfaction. However it takes a lot of time and effort to create game agents for various game levels, environments, and players. In addition, when the game environment changes such as adding contents or updating characters, new game agents need to be developed and the development difficulty gradually increases. And it is important to have a game agent that can be customized for different levels of players. This is because a game agent that can play games of various levels is more useful and can increase the satisfaction of more players than a high-level game agent. In this paper, we propose a method for learning and controlling the level of play of game agents that can be rapidly developed and fine-tuned for various game environments and changes. At this time, reinforcement learning applies a policy-based distributed reinforcement learning method IMPALA for flexible processing and fast learning of various behavioral structures. Once reinforcement learning is complete, we choose actions by sampling based on Softmax-Temperature method. From this result, we show that the game agent's play level decreases as the Temperature value increases. This shows that it is possible to easily control the play level.

Path selection algorithm for multi-path system based on deep Q learning (Deep Q 학습 기반의 다중경로 시스템 경로 선택 알고리즘)

  • Chung, Byung Chang;Park, Heasook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.1
    • /
    • pp.50-55
    • /
    • 2021
  • Multi-path system is a system in which utilizes various networks simultaneously. It is expected that multi-path system can enhance communication speed, reliability, security of network. In this paper, we focus on path selection in multi-path system. To select optimal path, we propose deep reinforcement learning algorithm which is rewarded by the round-trip-time (RTT) of each networks. Unlike multi-armed bandit model, deep Q learning is applied to consider rapidly changing situations. Due to the delay of RTT data, we also suggest compensation algorithm of the delayed reward. Moreover, we implement testbed learning server to evaluate the performance of proposed algorithm. The learning server contains distributed database and tensorflow module to efficiently operate deep learning algorithm. By means of simulation, we showed that the proposed algorithm has better performance than lowest RTT about 20%.

Application of Multi-agent Reinforcement Learning to CELSS Material Circulation Control

  • Hirosaki, Tomofumi;Yamauchi, Nao;Yoshida, Hiroaki;Ishikawa, Yoshio;Miyajima, Hiroyuki
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2001.01a
    • /
    • pp.145-150
    • /
    • 2001
  • A Controlled Ecological Life Support System(CELSS) is essential for man to live a long time in a closed space such as a lunar base or a mars base. Such a system may be an extremely complex system that has a lot of facilities and circulates multiple substances,. Therefore, it is very difficult task to control the whole CELSS. Thus by regarding facilities constituting the CELSS as agents and regarding the status and action as information, the whole CELSS can be treated as multi-agent system(MAS). If a CELSS can be regarded as MAS the CELSS can have three advantages with the MAS. First the MAS need not have a central computer. Second the expendability of the CELSS increases. Third, its fault tolerance rises. However it is difficult to describe the cooperation protocol among agents for MAS. Therefore in this study we propose to apply reinforcement learning (RL), because RL enables and agent to acquire a control rule automatically. To prove that MAS and RL are effective methods. we have created the system in Java, which easily gives a distributed environment that is the characteristics feature of an agent. In this paper, we report the simulation results for material circulation control of the CELSS by the MAS and RL.

  • PDF

An Automatic Cooperative coordination Model for the Multiagent System using Reinforcement Learning (강화학습을 이용한 멀티 에이전트 시스템의 자동 협력 조정 모델)

  • 정보윤;윤소정;오경환
    • Korean Journal of Cognitive Science
    • /
    • v.10 no.1
    • /
    • pp.1-11
    • /
    • 1999
  • Agent-based systems technology has generated lots of excitement in these years because of its promise as a new paradigm for conceptualizing. designing. and l implementing software systems Especially, there has been many researches for multi agent system because of the characteristics that it fits to the distributed and open Internet environments. In a multiagent system. agents must cooperate with each other through a Coordination procedure. when the conflicts between agents arise. where those are caused b by the point that each action acts for a purpose separately without coordination. But P previous researches for coordination methods in multi agent system have a deficiency that they can not solve correctly the cooperation problem between agents which have different goals in dynamic environment. In this paper. we solve the cooperation problem of multiagent that has multiple goals in a dynamic environment. with an automatic cooperative coordination model using I reinforcement learning. We will show the two pursuit problems that we extend a traditional problem in multi agent systems area for modeling the restriction in the multiple goals in a dynamic environment. and we have verified the validity of the proposed model with an experiment.

  • PDF