Search | Korea Science

A Naive Bayesian-based Model of the Opponent's Policy for Efficient Multiagent Reinforcement Learning (효율적인 멀티 에이전트 강화 학습을 위한 나이브 베이지안 기반 상대 정책 모델)

Kwon, Ki-Duk
- Journal of Internet Computing and Services, v.9 no.6, pp.165-177, 2008
An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy in a dynamic environment where other agents can influence its performance. Most previous work on multiagent reinforcement learning either applies single-agent reinforcement learning techniques without any extension or, even when it uses explicit models of other agents, requires unrealistic assumptions. This paper introduces a Naive Bayesian-based policy model of the opponent agent and then explains a multiagent reinforcement learning method that uses this model. Unlike previous work, the proposed method uses the Naive Bayesian-based policy model rather than a Q-function model of the opponent agent. Moreover, it can improve learning efficiency because this model is simpler than richer but more time-consuming policy models such as finite state machines (FSMs) and Markov chains. The Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed Naive Bayesian-based policy model is analyzed through experiments using this game as a testbed.

Policy Modeling for Efficient Reinforcement Learning in Adversarial Multi-Agent Environments (적대적 멀티 에이전트 환경에서 효율적인 강화 학습을 위한 정책 모델링)

Kwon, Ki-Duk;Kim, In-Cheol
- Journal of KIISE:Software and Applications, v.35 no.3, pp.179-188, 2008
An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy through trial-and-error interactions in a dynamic environment where other agents can influence its performance. Most previous work on multiagent reinforcement learning either applies single-agent reinforcement learning techniques without any extension or, even when it builds and uses explicit models of other agents, rests on unrealistic assumptions. In this paper, the basic concepts that form the common foundation of multiagent reinforcement learning techniques are first formulated, and previous work is then compared on the basis of these concepts in terms of characteristics and limitations. After that, a policy model of the opponent agent and a new multiagent reinforcement learning method using this model are introduced. Unlike previous work, the proposed method utilizes a policy model instead of a Q-function model of the opponent agent. Moreover, it can improve learning efficiency because this model is simpler than richer but more time-consuming policy models such as finite state machines (FSMs) and Markov chains. The Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed multiagent reinforcement learning method is analyzed through experiments using this game as a testbed.

Two tales of platoon intelligence for autonomous mobility control: Enabling deep learning recipes

Soohyun Park;Haemin Lee;Chanyoung Park;Soyi Jung;Minseok Choi;Joongheon Kim
- ETRI Journal, v.45 no.5, pp.735-745, 2023
This paper surveys recent multiagent reinforcement learning and neural Myerson auction deep learning efforts to improve mobility control and resource management in autonomous ground and aerial vehicles. The multiagent reinforcement learning communication network (CommNet) was introduced to enable multiple agents to perform actions in a distributed manner to achieve shared goals by training all agents' states and actions in a single neural network. Additionally, the Myerson auction method guarantees trustworthiness among multiple agents to optimize rewards in highly dynamic systems. Our findings suggest that the integration of MARL CommNet and Myerson techniques is very much needed for improved efficiency and trustworthiness.
https://doi.org/10.4218/etrij.2023-0132

C-COMA: A Continual Reinforcement Learning Model for Dynamic Multiagent Environments (C-COMA: 동적 다중 에이전트 환경을 위한 지속적인 강화 학습 모델)

Jung, Kyueyeol;Kim, Incheol
- KIPS Transactions on Software and Data Engineering, v.10 no.4, pp.143-152, 2021
In many real-world applications, it is very important to learn behavioral policies that allow multiple agents to work together organically toward common goals. In this multi-agent reinforcement learning (MARL) setting, most existing studies have adopted centralized training with decentralized execution (CTDE) methods as the de facto standard framework. However, such methods have difficulty coping with dynamic environments in which changes not experienced during training may occur constantly, as happens in real-life situations. To cope effectively with such dynamic environments, this paper proposes a novel multi-agent reinforcement learning system, C-COMA. C-COMA is a continual learning model that assumes real-world conditions from the beginning and continuously learns the cooperative behavior policies of agents without separating training time from execution time. We demonstrate the effectiveness and superiority of the proposed C-COMA model by implementing a dynamic mini-game based on StarCraft II, a representative real-time strategy game, and conducting various experiments in this environment.
https://doi.org/10.3745/KTSDE.2021.10.4.143

Survey on Recent Advances in Multiagent Reinforcement Learning Focusing on Decentralized Training with Decentralized Execution Framework (멀티에이전트 강화학습 기술 동향: 분산형 훈련-분산형 실행 프레임워크를 중심으로)

Y.H. Shin;S.W. Seo;B.H. Yoo;H.W. Kim;H.J. Song;S. Yi
- Electronics and Telecommunications Trends, v.38 no.4, pp.95-103, 2023
The importance of the decentralized training with decentralized execution (DTDE) framework is well-known in the study of multiagent reinforcement learning. In many real-world environments, agents cannot share information. Hence, they must be trained in a decentralized manner. However, the DTDE framework has been less studied than the centralized training with decentralized execution framework. One of the main reasons is that many problems arise when training agents in a decentralized manner. For example, DTDE algorithms are often computationally demanding or can encounter problems with non-stationarity. Another reason is the lack of simulation environments that can properly handle the DTDE framework. We discuss current research trends in the DTDE framework.
https://doi.org/10.22648/ETRI.2023.J.380409

Survey on Communication Algorithms for Multiagent Reinforcement Learning (멀티에이전트 강화학습을 위한 통신 기술 동향)

S.W. Seo;Y.H. Shin;B.H. Yoo;H.W. Kim;H.J. Song;S. Yi
- Electronics and Telecommunications Trends, v.38 no.4, pp.104-115, 2023
Communication for multiagent reinforcement learning (MARL) has emerged to promote understanding of an entire environment. Through communication for MARL, agents can cooperate by choosing the best action considering not only their surrounding environment but also the entire environment and other agents. Hence, MARL with communication may outperform conventional MARL. Many communication algorithms have been proposed to support MARL, but current analyses remain insufficient. This paper presents existing communication algorithms for MARL according to various criteria such as communication methods, contents, and restrictions. In addition, we consider several experimental environments that are primarily used to demonstrate the MARL performance enhanced by communication.
https://doi.org/10.22648/ETRI.2023.J.380410

An Automatic Cooperative Coordination Model for the Multiagent System using Reinforcement Learning (강화학습을 이용한 멀티 에이전트 시스템의 자동 협력 조정 모델)

정보윤;윤소정;오경환
- Korean Journal of Cognitive Science, v.10 no.1, pp.1-11, 1999
Agent-based systems technology has generated a lot of excitement in recent years because of its promise as a new paradigm for conceptualizing, designing, and implementing software systems. In particular, there has been much research on multiagent systems because they fit distributed and open Internet environments. In a multiagent system, agents must cooperate with each other through a coordination procedure when conflicts between agents arise; such conflicts are caused by each agent acting for its own purpose without coordination. However, previous research on coordination methods in multiagent systems has a deficiency: it cannot correctly solve the cooperation problem between agents that have different goals in a dynamic environment. In this paper, we solve the cooperation problem of multiple agents that have multiple goals in a dynamic environment with an automatic cooperative coordination model using reinforcement learning. We present two pursuit problems, which extend a traditional problem in the multiagent systems area to model the restrictions of multiple goals in a dynamic environment, and we verify the validity of the proposed model experimentally.

Multiagent Control Strategy Using Reinforcement Learning (강화학습을 이용한 다중 에이전트 제어 전략)

Lee, Hyong-Ill;Kim, Byung-Cheon
- The KIPS Transactions:PartB, v.10B no.3, pp.249-256, 2003
The most important problems in a multi-agent system are accomplishing a goal through the efficient coordination of several agents and preventing collisions with other agents. In this paper, we propose a new control strategy for efficiently achieving the goal of the prey pursuit problem. Our control method uses reinforcement learning to control the multi-agent system and considers the distance as well as the spatial relationship between the agents in the state space of the prey pursuit problem.
https://doi.org/10.3745/KIPSTB.2003.10B.3.249

A Survey on Recent Advances in Multi-Agent Reinforcement Learning (멀티 에이전트 강화학습 기술 동향)

Yoo, B.H.;Ningombam, D.D.;Kim, H.W.;Song, H.J.;Park, G.M.;Yi, S.
- Electronics and Telecommunications Trends, v.35 no.6, pp.137-149, 2020
Several multi-agent reinforcement learning (MARL) algorithms have achieved overwhelming results in recent years. They have demonstrated their potential in solving complex problems in the fields of real-time strategy online games, robotics, and autonomous vehicles. However, these algorithms face many challenges when dealing with massive problem spaces in sparse-reward environments. Based on the centralized training and decentralized execution (CTDE) architecture, the MARL algorithms discussed in the literature aim to solve the current challenges by formulating novel concepts of inter-agent modeling, credit assignment, multiagent communication, and the exploration-exploitation dilemma. The fundamental objective of this paper is to deliver a comprehensive survey of existing MARL algorithms based on the problem statements rather than on the technologies. We also discuss several experimental frameworks to provide insight into the use of these algorithms and to motivate some promising directions for future research.
https://doi.org/10.22648/ETRI.2020.J.350614

SOM-Based State Generalization for Multiagent Reinforcement Learning (다중에이전트 강화학습을 위한 SOM기반의 상태 일반화)

임문택;김인철
- Proceedings of the Korea Intelligent Information System Society Conference, 2002.11a, pp.399-408, 2002
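Multiagent learning refers to learning action strategies for coordination among agents in a multiagent environment. In this paper, we consider a multiagent environment in which communication between agents is impossible, and we aim to have the agents learn action strategies that let them cooperate effectively by each independently running Q-learning, a representative reinforcement learning method. However, in a multiagent environment, which has a larger state-action space than the single-agent case, it is difficult to reach an optimal action strategy effectively through reinforcement learning. Existing approaches to this problem fall broadly into modularization methods and generalization methods, but both have their own limitations. In this paper, we introduce the Prey and Hunters Problem as a representative multiagent learning problem, examine these difficulties of reinforcement learning in this problem domain, and propose as a solution the QSOM learning method, a generalization method that uses the SOM neural network. Unlike existing generalization methods, this method uses the SOM neural network, which provides a clustering capability, and so has the advantage that it can effectively predict and use Q-values for previously unexperienced state-action pairs even without a large number of explicit training examples. We also evaluate the generalization effect and performance of the QSOM learning method through experiments.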
