Search | Korea Science

C-COMA: A Continual Reinforcement Learning Model for Dynamic Multiagent Environments (C-COMA: 동적 다중 에이전트 환경을 위한 지속적인 강화 학습 모델)

Jung, Kyueyeol;Kim, Incheol
- KIPS Transactions on Software and Data Engineering / v.10 no.4 / pp.143-152 / 2021
In many real-world applications, it is important to learn behavioral policies that allow multiple agents to work together organically toward common goals. In multi-agent reinforcement learning (MARL), most existing studies have adopted centralized training with decentralized execution (CTDE) as the de facto standard framework. However, CTDE-based methods struggle to cope with dynamic environments, in which changes not experienced during training may occur continually, as they do in real-life situations. To cope effectively with such dynamic environments, this paper proposes a novel multi-agent reinforcement learning system, C-COMA. C-COMA is a continual learning model that assumes real deployment conditions from the beginning and continuously learns the agents' cooperative behavior policies without separating training time from execution time. We demonstrate the effectiveness and superiority of C-COMA by implementing a dynamic mini-game based on StarCraft II, a representative real-time strategy game, and conducting various experiments in this environment.
https://doi.org/10.3745/KTSDE.2021.10.4.143

GENETIC PROGRAMMING OF MULTI-AGENT COOPERATION STRATEGIES FOR TABLE TRANSPORT

Cho, Dong-Yeon;Zhang, Byoung-Tak
- Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.06a / pp.170-175 / 1998
Transporting a large table using multiple robotic agents requires at least two group behaviors, homing and herding, which must be coordinated in a proper sequence. Existing GP methods for multi-agent learning are not practical enough to find an optimal solution in this domain. To evolve this kind of complex cooperative behavior, we use a novel method called fitness switching. This method maintains a pool of basis fitness functions, each of which corresponds to a primitive group behavior. The basis functions are then progressively combined into more complex fitness functions to co-evolve more complex behavior. The performance of the presented method is compared with that of two conventional methods. Experimental results show that coevolutionary fitness switching provides an effective mechanism for evolving complex emergent behavior that simple genetic programming may not be able to produce.

Consensus-based Cooperative Control for multiple leaders and single follower with interaction nonlinearities (상호작용 비선형성이 있는 다중 리더와 단일 추종자를 위한 일치 기반의 협력 제어)

Tack, Han-Ho;Lim, Young-Hun
- Journal of the Korea Institute of Information and Communication Engineering / v.25 no.11 / pp.1663-1669 / 2021
This paper considers the cooperative control problem for multiple leaders and a single follower with interactions. The leaders are controllable, while the follower interacts with all leaders and is controlled through those interactions. We study the cooperative control problem of achieving consensus by controlling the leaders. The leaders and the follower are modeled as single integrators and a double integrator, respectively, and the interactions are assumed to be nonlinear. Each leader can estimate its interaction with the follower and exchange the estimated information with its neighbors. We propose a consensus-based cooperative control algorithm that uses the exchange of estimated interactions and virtual velocity variables to achieve velocity consensus. We analyze the convergence of the agents to a common state based on LaSalle's Invariance Principle. Finally, we provide a numerical example to validate the theoretical results.
https://doi.org/10.6109/jkiice.2021.25.11.1663

Using Potential Field for Modeling of the Work-environment and Task-sharing on the Multi-agent Cooperative Work

Makino, Tsutomu;Naruse, Keitarou;Yokoi, Hiroshi;Kakazu, Yikinori
- Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.37-44 / 2001
This paper describes the modeling of a work environment for extracting abstract operation rules for cooperative work by multiple agents. We propose a modeling method that uses a potential field. The method is applied to a box-pushing problem, in which multiple agents move a box from a start point to a goal. The agents follow the potential values as they move and work in the work environment, which is represented as a grid space. The potential field is generated for each agent by a Genetic Algorithm (GA). The GA explores the positions of potential peak values in the grid space, and the potential is then spread across the grid by a potential diffusion function in each grid cell. Because it is difficult to find a suitable setting by hand-coding the positions of the peak potential values, we use an evolutionary computation approach, which can explore a large search space. We carry out environment-modeling experiments using the proposed method and verify the performance of the GA exploration. We then classify the acquired environment models into several types and extract abstract operation rules. As a result, we find several types of environment models and operation rules by observation, and the performance of the GA exploration is almost the same as that of the hand-coded setting: the two perform nearly equally when evaluated by the agents' energy consumption and the number of work steps from the start point to the goal point.

Making Levels More Challenging with a Cooperative Strategy of Ghosts in Pac-Man (고스트들의 협력전술에 의한 팩맨게임 난이도 제고)

Choi, Taeyeong;Na, Hyeon-Suk
- Journal of Korea Game Society / v.15 no.5 / pp.89-98 / 2015
The artificial intelligence (AI) of non-player characters (NPCs), especially opponents, is a key element for adjusting difficulty in game design. Smart opponents can make games more challenging and give players diverse experiences, even in the same game environment. Since players interact with more than one opponent in most of today's games, coordinating opponent characters is more important than ever. In this paper, we introduce a cooperative strategy based on the A* algorithm for the enemies' AI in the Pac-Man game. A survey of 17 human testers shows that levels with our collaborative opponents are more difficult but also more interesting than those with either the original Pac-Man ghost personalities or non-cooperative greedy opponents.
https://doi.org/10.7583/JKGS.2015.15.5.89
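
As a rough illustration of the idea in the abstract above, the following sketch runs A* on a small grid maze and lets a second ghost avoid the route already claimed by the first, so the two approach Pac-Man from different sides. The abstract does not describe the paper's actual cooperation rule; the maze, the `a_star` and `plan_ghosts` names, and the "block the other ghost's route" rule are all assumptions made for this example.

```python
import heapq

def a_star(grid, start, goal, blocked=frozenset()):
    """Shortest 4-connected path on a grid of '.' (free) and '#' (wall) cells.

    `blocked` is a hypothetical set of extra cells to avoid (not from the paper).
    Returns a list of (row, col) cells from start to goal, or None if unreachable.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for 4-connected grids.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if not inside or grid[nr][nc] == '#' or nxt in blocked:
                continue
            new_g = g + 1
            if new_g < cost.get(nxt, float('inf')):
                cost[nxt] = new_g
                came_from[nxt] = cell
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
    return None

def plan_ghosts(grid, ghost1, ghost2, pacman):
    """Assumed cooperation rule: ghost 2 avoids every cell on ghost 1's route
    (except Pac-Man's cell), falling back to a plain shortest path if needed."""
    path1 = a_star(grid, ghost1, pacman)
    avoid = frozenset(path1[:-1]) if path1 else frozenset()
    path2 = a_star(grid, ghost2, pacman, blocked=avoid) or a_star(grid, ghost2, pacman)
    return path1, path2

MAZE = [
    ".....",
    ".###.",
    ".....",
]

path1, path2 = plan_ghosts(MAZE, ghost1=(0, 0), ghost2=(1, 0), pacman=(1, 4))
print(path1)
print(path2)
```

In this maze the first ghost takes the top corridor, so the second is pushed onto the bottom corridor and Pac-Man is approached from both sides. A greedy, non-cooperative baseline would simply call `a_star` independently for each ghost.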

Ambient Intelligence in Distributed Modular Systems

Ngo Trung Dung;Lund Henrik Hautop
- Proceedings of the IEEK Conference / summer / pp.421-426 / 2004
By analyzing the adaptive possibilities of agents in multi-agent systems, we have discovered new aspects of ambient intelligence in distributed modular systems using intelligent building blocks (I-BLOCKS) [1]. This paper describes early research on the technical design, applied experiments, and evaluation of adaptive processing and information interaction among I-BLOCKS, which allow users to easily develop ambient intelligence applications. The processing technology presented in this paper is embedded inside each DUPLO brick in the form of a microprocessor together with selected sensors and actuators. The behavior of an I-BLOCKS modular structure is defined by the internal processing functionality of each I-BLOCK in the structure and the communication capacities between I-BLOCKS. Users of the I-BLOCKS system can do 'programming by building' and thereby create specific functionalities of a modular structure of intelligent artefacts without needing to learn and use a traditional programming language. Drawing on our investigation of different effects of modern artificial intelligence, the I-BLOCKS we have developed may have potential for developing applications in ambient intelligence (AmI) environments. To illustrate these possibilities, the paper presents a range of experimental scenarios in which I-BLOCKS have been used to set up reconfigurable modular systems. The paper also reports briefly on earlier experiments with I-BLOCKS in different research fields, which allow users to construct AmI applications using a newly defined concept of modular artefacts [3].

Optimization of Stock Trading System based on Multi-Agent Q-Learning Framework (다중 에이전트 Q-학습 구조에 기반한 주식 매매 시스템의 최적화)

Kim, Yu-Seop;Lee, Jae-Won;Lee, Jong-Woo
- The KIPS Transactions:PartB / v.11B no.2 / pp.207-212 / 2004
This paper presents a reinforcement learning framework for stock trading systems. Trading system parameters are optimized by the Q-learning algorithm, and neural networks are adopted for value approximation. In this framework, multiple cooperative agents are used to efficiently integrate global trend prediction and local trading strategy to obtain better trading performance. The agents communicate with one another by sharing training episodes and learned policies, while keeping the overall scheme of conventional Q-learning. Experimental results on KOSPI 200 show that a trading system based on the proposed framework outperforms the market average and makes appreciable profits. Furthermore, from a risk management perspective, the system is superior to a system trained by supervised learning.
https://doi.org/10.3745/KIPSTB.2004.11B.2.207
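
For readers unfamiliar with the conventional Q-learning scheme mentioned in the abstract above, here is a minimal tabular sketch of its update rule. It is not the paper's system, which uses neural networks for value approximation and several cooperating agents; the state labels, action names, and the `q_update` helper are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One conventional Q-learning step on a table `q` keyed by (state, action).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Returns the updated Q(s, a).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]

# Hypothetical toy setting: coarse market states and two trading actions.
ACTIONS = ["buy", "sell"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# A single made-up transition: buying in an "uptrend" state earned reward 1.0.
first = q_update(q_table, "uptrend", "buy", 1.0, "flat", ACTIONS)
print(first)
```

Sharing training episodes between agents, as the abstract describes, could be approximated in this sketch by replaying the same `(state, action, reward, next_state)` tuples into each agent's table, though the abstract does not say how the paper implements it.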

A Creative Solution of Distributed Modular Systems for Building Ubiquitous Heterogeneous Robotic Applications

Ngo Trung Dung;Lund Henrik Hautop
- Proceedings of the IEEK Conference / summer / pp.410-415 / 2004
Drawing on knowledge of the adaptive possibilities of agents in multi-agent systems, we have explored new aspects of distributed modular systems for building ubiquitous heterogeneous robotic systems using intelligent building blocks (I-BLOCKS) [1] as reconfigurable modules. This paper describes early technological approaches to the technical design, experimental development, and evaluation of adaptive processing and information interaction among I-BLOCKS, which allow users to easily develop modular robotic systems. The processing technology presented in this paper is embedded inside each DUPLO brick in the form of a microprocessor together with selected sensors and actuators. The behavior of an I-BLOCKS modular structure is defined by the internal processing functionality of each I-BLOCK in the structure and the communication capacities between I-BLOCKS. Users of the I-BLOCKS system can easily do 'programming by building' and thereby create specific functionalities of a modular robotic structure of intelligent artefacts without needing to learn and use a traditional programming language. Drawing on our investigation of different effects of modern artificial intelligence, the I-BLOCKS we have developed may have potential for developing modular robotic systems with different types of morphology, functionality, and behavior. To assess this potential, the paper presents a limited range of experimental scenarios in which I-BLOCKS have been used to set up reconfigurable modular robots. The paper also reports briefly on earlier experiments with I-BLOCKS structures created from users' natural inspiration using a newly defined concept of modular artefacts.

Adaptive Migration Path Technique of Mobile Agent Using the Metadata of Naming Agent (네이밍 에이전트의 메타데이터를 이용한 이동 에이전트의 적응적 이주 경로 기법)

Kim, Kwang-Jong;Ko, Hyun;Lee, Yon-Sik
- Journal of the Korea Society of Computer and Information / v.12 no.3 / pp.165-175 / 2007
A mobile agent executes a given task by moving its code directly to the server. The node migration method is therefore an important factor affecting the overall performance of a distributed system. In this paper, we propose an adaptive migration path technique for mobile agents that uses the metadata of a naming agent. In the proposed technique, node selection for migration depends on the content of the referenced metadata, and the reliability of the migrated information is determined by the metadata updating method and the cooperative operations of the individual agents in the multi-agent system. To this end, we design the metadata using the number of hit documents, hit ratio, node processing time, and network delay time, and describe methods for creating, using, and updating the metadata, which determine the adaptive node migration path of the mobile agent according to the cooperation of the individual agents and the number of hit documents. In experiments and analysis, the proposed technique achieved a high rate of effective information acquisition: with the metadata applied, the agent migrated to 13 nodes with a high hit ratio (72%) on the gathered documents. Without the metadata, the hit ratio on the gathered documents was 46%, and the rate of effective information acquisition over 26 nodes was 36.8%.

Distributed Intrusion Detection System for Safe E-Business Model (안전한 E-Business 모델을 위한 분산 침입 탐지 시스템)

이기준;정채영
- Journal of Internet Computing and Services / v.2 no.4 / pp.41-53 / 2001
A multi-distributed web cluster model built for a high-availability E-Business model exposes its internal system nodes because of its structural characteristics, and normal job processing may become impossible due to intentional interference and attacks by unauthorized third parties. A security system is therefore needed that protects the structured system nodes and responds effectively to information leakage and illegitimate service requests by unauthorized users. The proposed distributed intrusion detection system detects illegal requests for, or access to, the resources of system nodes distributed over an open network, through organic control among SC-Agents based on the shared memory of the SC-Server. The system first examines job request packets using a Detection Agent to detect intrusions, then observes the job process through a Monitoring Agent while the job is running, and finally judges whether an intrusion has occurred through close cooperation with other system nodes when there is an unpermitted access to or demand for resources.

Search Result 32, Processing Time 0.033 seconds
