• Title/Summary/Keyword: Learning Agent

Earthwork Planning via Reinforcement Learning with Heterogeneous Construction Equipment (강화학습을 이용한 이종 장비 토목 공정 계획)

  • Ji, Min-Gi;Park, Jun-Keon;Kim, Do-Hyeong;Jung, Yo-Han;Park, Jin-Kyoo;Moon, Il-Chul
    • Journal of the Korea Society for Simulation / v.27 no.1 / pp.1-13 / 2018
  • Earthwork planning is one of the critical issues in construction process management. Existing approaches optimize the construction process with either mathematical methodologies or simulation-based heuristics. This paper proposes a simulated earthwork scenario and finds an optimal path for the simulation using reinforcement learning. We use two different Markov decision process (MDP) formulations for the interacting excavator and truck agents: sequenced learning and independent learning. The simulation results show that both formulations reach the optimal plan for the simulated earthwork scenario, which could serve as a basis for automated construction management.
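The abstract's sequenced and independent MDP formulations are not spelled out, so the following is only a hedged sketch of the tabular Q-learning core that such planning builds on: a single "truck" agent learning the optimal path along a toy one-dimensional haul road. The states, rewards, and hyperparameters are illustrative assumptions, not the paper's.

```python
import random

# Toy "haul road": positions 0..5, dump site at position 5. The agent is
# rewarded on reaching the dump site and pays a small cost per step.
N_STATES, GOAL = 6, 5
ACTIONS = (-1, +1)                     # move left / move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1
random.seed(0)

def step(s, a):
    # Deterministic transition with a per-step cost and a goal bonus.
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

for _ in range(500):                   # training episodes
    s, done = 0, False
    while not done:
        a = (random.choice(ACTIONS) if random.random() < eps
             else max(ACTIONS, key=lambda x: Q[(s, x)]))
        s2, r, done = step(s, a)
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS)
                              - Q[(s, a)])
        s = s2

# The learned greedy policy should always haul toward the dump site.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

The paper's two-agent formulations would couple such an MDP for the truck with one for the excavator; this single-agent version only shows the update rule both rest on.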

C-COMA: A Continual Reinforcement Learning Model for Dynamic Multiagent Environments (C-COMA: 동적 다중 에이전트 환경을 위한 지속적인 강화 학습 모델)

  • Jung, Kyueyeol;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.10 no.4 / pp.143-152 / 2021
  • In many real-world applications, it is important to learn behavioral policies that allow multiple agents to work together organically toward common goals. In multi-agent reinforcement learning (MARL), most existing studies adopt centralized training with decentralized execution (CTDE) as the de facto standard framework. However, CTDE methods struggle in dynamic environments, where changes never experienced during training can constantly occur in real-life situations. To cope with such environments, this paper proposes C-COMA, a novel multi-agent reinforcement learning system. C-COMA is a continual learning model that assumes real-world conditions from the start and learns the agents' cooperative behavior policies continuously, without separating training time from execution time. We demonstrate the effectiveness and superiority of C-COMA by implementing a dynamic mini-game based on StarCraft II, a representative real-time strategy game, and conducting various experiments in this environment.

User Profile based Personalized Web Agent (사용자 프로파일 기반 개인 웹 에이전트)

  • So, Young-Jun;Park, Young-Tack
    • Journal of KIISE: Software and Applications / v.27 no.3 / pp.248-256 / 2000
  • This paper presents a personalized web agent that constructs a user profile of the user's preferences on the web and recommends relevant information. The agent consists of a monitor agent, a user profile construction agent, and a user profile refinement agent. The monitor agent has the user describe his or her preferences directly, builds a database of preference documents, and performs several keyword-extraction steps to increase the accuracy of the database. The user profile construction agent transforms the extracted keywords into a user profile that the user can confirm and edit, and the refinement agent refines the profile by recursively learning from and processing user feedback. We describe the keyword-weighting and inductive-learning techniques in detail, and finally the adaptive web retrieval and push agents that provide adaptive services to the user.
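The abstract mentions "several keyword weighting" techniques without naming them, so the sketch below uses plain TF-IDF as a stand-in for scoring candidate profile keywords across a user's collected preference documents; the scoring formula and sample documents are assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_k=3):
    """Rank words by summed TF-IDF over a user's preference documents."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d.split()))  # document frequency
    scores = Counter()
    for d in docs:
        words = d.split()
        tf = Counter(words)
        for w, c in tf.items():
            scores[w] += (c / len(words)) * math.log(n / df[w] + 1.0)
    return [w for w, _ in scores.most_common(top_k)]

docs = ["python agent learning agent",
        "web agent profile python",
        "cooking recipes pasta"]
keywords = tfidf_keywords(docs)
```

Terms concentrated in the user's documents ("agent" here) outrank one-off terms, which is the property a profile builder needs before the refinement agent folds in user feedback.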

Goal-Directed Reinforcement Learning System (목표지향적 강화학습 시스템)

  • Lee, Chang-Hoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.265-270 / 2010
  • Reinforcement learning proceeds through trial-and-error interaction with a dynamic environment. In such environments, reinforcement learning methods like TD-learning and TD(λ)-learning learn faster than conventional stochastic learning methods. However, because many proposed reinforcement learning algorithms deliver a reinforcement value only when the learning agent reaches its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the GDRLS algorithm for finding the shortest path faster in a maze environment. GDRLS selects the candidate states that can lie on the shortest path and learns only over those candidate states. Experiments show that GDRLS finds the shortest path faster than TD-learning and TD(λ)-learning in a maze environment.
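The idea of learning only over candidate states can be sketched with plain Q-learning in a small maze. Here the candidate set is taken to be the cells lying on some shortest start-to-goal path (found by BFS), which is an illustrative assumption, since the abstract does not give the paper's exact selection rule.

```python
import random
from collections import deque

MAZE = ["....",
        ".##.",
        ".#..",
        "...."]
START, GOAL = (0, 0), (3, 3)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]
CELLS = {(r, c) for r, row in enumerate(MAZE)
         for c, ch in enumerate(row) if ch == "."}

def bfs_dist(src):
    # Breadth-first distances from src over the open cells.
    dist, queue = {src: 0}, deque([src])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nxt = (r + dr, c + dc)
            if nxt in CELLS and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return dist

d_start, d_goal = bfs_dist(START), bfs_dist(GOAL)
# Candidate states: cells on some shortest start-to-goal path.
candidates = {s for s in CELLS if d_start[s] + d_goal[s] == d_goal[START]}

random.seed(1)
Q = {(s, m): 0.0 for s in CELLS for m in MOVES}
for _ in range(800):                   # training episodes
    s = START
    for _ in range(50):                # step cap per episode
        m = (random.choice(MOVES) if random.random() < 0.2
             else max(MOVES, key=lambda x: Q[(s, x)]))
        nxt = (s[0] + m[0], s[1] + m[1])
        nxt = nxt if nxt in CELLS else s
        reward = 1.0 if nxt == GOAL else -0.04
        if s in candidates:            # learn only on candidate states
            Q[(s, m)] += 0.5 * (reward
                                + 0.9 * max(Q[(nxt, x)] for x in MOVES)
                                - Q[(s, m)])
        s = nxt
        if s == GOAL:
            break

# A greedy rollout of the learned values walks a shortest path.
s, steps = START, 0
while s != GOAL and steps < 20:
    m = max(MOVES, key=lambda x: Q[(s, x)])
    nxt = (s[0] + m[0], s[1] + m[1])
    s = nxt if nxt in CELLS else s
    steps += 1
```

In this maze the off-path cell (2, 2) is excluded from learning, and the greedy rollout still reaches the goal in the minimal six steps.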

AR Gardening system with an interactive learning companion (AR Gardening : 상호작용형 증강 에이전트 기반 증강 원예 체험 시스템)

  • Oh, Se-Jin;Woo, Woon-Tack
    • Proceedings of the HCI Society of Korea Conference / 2008.02a / pp.168-173 / 2008
  • Recently, many researchers have studied agent-based edutainment systems to improve students' learning experience. In this paper, we present AR Gardening, which lets users experience interactive flower gardening with a bluebird, a learning-companion agent, perched in an augmented picture. The proposed system animates the augmented bluebird to support interactive edutainment experiences. The bluebird perceives users' actions as well as environmental situations, then appraises this situational information to give participants problem-solving guidelines. Moreover, through anthropomorphic expression the bluebird responds more like a companion than an instructor. To demonstrate our work, we exhibited the implemented AR Gardening and reviewed participants' responses. In this exhibition, we found that the companion-like bluebird helped users learn how to properly grow the flower in our educational setting. Ultimately, we expect that an augmented peer-learning agent is a key factor in developing effective edutainment applications.

A Rule's Reasoning and Case-Based Learning Method for Efficient Dynamic Workload Balancing of VoD Systems (VoD 시스템의 효율적인 동적 작업부하조정을 위한 규칙 추론 및 사례기반 학습 방법)

  • Kim, Joong Hwan;Park, Jeong Yun
    • The Journal of Korean Association of Computer Education / v.11 no.2 / pp.107-117 / 2008
  • The agent system that dynamically adjusts workload through periodic monitoring of a VoD system comprises an agency part, which interfaces with the VoD system, and an intelligence part, which reasons about or learns the facts required for workload adjustment. This paper proposes a learning method applicable to the intelligence part of the agent system. The proposed method adjusts the workload more efficiently through a rule-reasoning process and a case-based learning process. A simulator was implemented to test whether applying the proposed method to VoD systems is efficient. The experiment showed that the throughput and average waiting time of the VoD server improved when the proposed method was applied, compared with existing approaches.
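A hedged sketch of how rule reasoning and case-based learning can combine for workload adjustment, in the spirit of the approach the abstract describes. The rules, thresholds, and case-similarity measure below are illustrative assumptions, not the paper's actual knowledge base.

```python
def rule_action(load, queue_len):
    # Forward-chained rules over the monitored metrics.
    if load > 0.9 or queue_len > 20:
        return "migrate_streams"
    if load > 0.7:
        return "throttle_admissions"
    return "no_op"

class CaseBase:
    """Stores past adjustments as (load, queue_len, action, outcome)."""
    def __init__(self):
        self.cases = []

    def learn(self, load, queue_len, action, outcome):
        self.cases.append((load, queue_len, action, outcome))

    def retrieve(self, load, queue_len):
        # Nearest stored case by a simple weighted L1 distance.
        if not self.cases:
            return None
        return min(self.cases,
                   key=lambda c: abs(c[0] - load) + abs(c[1] - queue_len) / 50)

cb = CaseBase()
cb.learn(0.95, 30, "migrate_streams", "improved")

# Rules fire first; a similar past case with a good outcome confirms
# (or could override) the rule's suggestion.
action = rule_action(0.92, 25)
case = cb.retrieve(0.92, 25)
if case is not None and case[3] == "improved":
    action = case[2]
```

The split mirrors the agency/intelligence division in the abstract: monitoring feeds the metrics, rules give an immediate decision, and stored cases let the decision improve with experience.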

QLGR: A Q-learning-based Geographic FANET Routing Algorithm Based on Multi-agent Reinforcement Learning

  • Qiu, Xiulin;Xie, Yongsheng;Wang, Yinyin;Ye, Lei;Yang, Yuwang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.11 / pp.4244-4274 / 2021
  • The utilization of UAVs in various fields has driven the development of flying ad hoc network (FANET) technology. In a network environment with highly dynamic topology and frequent link changes, traditional FANET routing cannot satisfy the new communication demands; in particular, traditional routing algorithms based on geographic location can fall into a routing hole. To address this problem, we propose a geolocation routing protocol based on multi-agent reinforcement learning that decreases the packet loss rate and routing cost. The protocol views each node as an intelligent agent and evaluates the value of its neighbor nodes from local information. In the value function, nodes consider link quality, residual energy, and queue length, which reduces the possibility of a routing hole. The protocol uses global rewards so that individual nodes collaborate in transmitting data. The performance of the protocol is analyzed experimentally for UAVs under extreme conditions such as topology changes and energy constraints. Simulation results show that the proposed QLGR-S protocol outperforms the traditional GPSR protocol on performance parameters such as throughput, end-to-end delay, and energy consumption. QLGR-S provides more reliable connectivity for UAV networking, safeguards communication requirements between UAVs, and further promotes the development of UAV technology.
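A sketch of the kind of neighbor-value function the abstract describes: each node scores its neighbors from local information (link quality, residual energy, queue length) and blends that local score with a learned Q-value. The weights and the functional form here are illustrative assumptions, not taken from the paper.

```python
def neighbor_value(link_quality, residual_energy, queue_len, q_value,
                   max_queue=50, w=(0.4, 0.3, 0.3), beta=0.5):
    """Blend a local fitness score with a learned Q-value for a neighbor."""
    local = (w[0] * link_quality                    # delivery ratio in [0, 1]
             + w[1] * residual_energy               # fraction of battery left
             + w[2] * (1 - queue_len / max_queue))  # congestion penalty
    return beta * local + (1 - beta) * q_value

# A node forwarding a packet greedily picks the neighbor with the best
# blended value; the high-energy, uncongested neighbor wins here even
# though its raw link quality is slightly lower.
neighbors = {
    "n1": neighbor_value(0.90, 0.8, 5, q_value=0.6),
    "n2": neighbor_value(0.95, 0.2, 45, q_value=0.4),
}
best = max(neighbors, key=neighbors.get)
```

Folding energy and queue length into the score is what steers traffic away from nodes that pure geographic greediness would pick, which is how the protocol reduces routing holes.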

Study for Feature Selection Based on Multi-Agent Reinforcement Learning (다중 에이전트 강화학습 기반 특징 선택에 대한 연구)

  • Kim, Miin-Woo;Bae, Jin-Hee;Wang, Bo-Hyun;Lim, Joon-Shik
    • Journal of Digital Convergence / v.19 no.12 / pp.347-352 / 2021
  • In this paper, we propose a method that uses multi-agent reinforcement learning to find feature subsets effective for classification in an input dataset. In machine learning, finding features suitable for classification is crucial: a dataset may have numerous features, and while some are effective for classification or prediction, others may have little or even a negative effect on the results. Feature selection for increasing classification or prediction accuracy is therefore a critical problem. To solve it, we propose a feature selection method based on reinforcement learning. Each feature has one agent, which decides whether that feature is selected. After rewards are obtained for the feature subsets that are and are not selected by the agents, each agent's Q-value is updated by comparing the rewards; this comparison of the two subsets tells the agents whether their actions were right. These processes are repeated for the specified number of episodes, after which the final features are selected. Applying this method to the Wisconsin Breast Cancer, Spambase, Musk, and Colon Cancer datasets yielded accuracy improvements of 0.0385, 0.0904, 0.1252, and 0.2055, respectively, with final classification accuracies of 0.9789, 0.9311, 0.9691, and 0.9474. This shows that the proposed method can properly select features that are effective for classification and increase classification accuracy.
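A minimal sketch of the per-feature agent scheme the abstract describes: each feature owns one agent with a Q-value per action (select or drop), and after each episode the reward of the chosen subset is compared with a reference subset (here, the full feature set, an assumption since the abstract does not pin down the second subset). The toy accuracy() function stands in for a real classifier's score.

```python
import random

random.seed(0)
USEFUL = {0, 1}                        # features that actually help

def accuracy(subset):
    # Fake evaluator: useful features raise accuracy, noisy ones cost a bit.
    return 0.5 + 0.2 * len(subset & USEFUL) - 0.05 * len(subset - USEFUL)

n_features, eps, alpha = 5, 0.2, 0.1
Q = [[0.0, 0.0] for _ in range(n_features)]   # Q[f][action]; action 1 = select

for _ in range(1000):                  # episodes
    acts = [random.randrange(2) if random.random() < eps
            else max((0, 1), key=lambda a: Q[f][a])
            for f in range(n_features)]
    subset = {f for f, a in enumerate(acts) if a == 1}
    # Compare the chosen subset's reward against the reference subset.
    delta = accuracy(subset) - accuracy(set(range(n_features)))
    for f, a in enumerate(acts):       # each agent learns from the comparison
        Q[f][a] += alpha * (delta - Q[f][a])

selected = {f for f in range(n_features) if Q[f][1] > Q[f][0]}
```

Because the reward is shared, each agent's Q-gap estimates its feature's marginal contribution, so the useful features end up selected.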

Design and Implementation of a Customized Courseware using Agents (에이전트를 이용한 맞춤형 코스웨어의 설계 및 구현)

  • Heo, Sun-Young;Kim, Eun-Gyung
    • The KIPS Transactions: Part A / v.13A no.5 s.102 / pp.473-480 / 2006
  • Recently, remote education systems for web-based teaching and studying have increased rapidly, along with demand for customized courseware suited to each learner's level and learning pattern. However, most remote education systems do not provide learning services fit to each learner's level, so many learners easily lose interest in studying. Researchers have therefore tried to provide personalized learning services by using agents to analyze each learner's level and learning pattern automatically. In this paper, we designed and implemented a customized courseware for studying computer applications. CCA contains four agents, professor, assistant, student, and monitor, which cooperate with each other to provide learning contents suited to each learner's level.

Application of Multi-agent Reinforcement Learning to CELSS Material Circulation Control

  • Hirosaki, Tomofumi;Yamauchi, Nao;Yoshida, Hiroaki;Ishikawa, Yoshio;Miyajima, Hiroyuki
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.145-150 / 2001
  • A Controlled Ecological Life Support System (CELSS) is essential for humans to live for long periods in a closed space such as a lunar or Mars base. Such a system may be extremely complex, with many facilities circulating multiple substances, so controlling the whole CELSS is a very difficult task. By regarding the facilities constituting the CELSS as agents, and their status and actions as information, the whole CELSS can be treated as a multi-agent system (MAS). Treating the CELSS as a MAS brings three advantages: the MAS needs no central computer, the extensibility of the CELSS increases, and its fault tolerance rises. However, it is difficult to describe the cooperation protocol among the agents of a MAS. In this study, we therefore propose applying reinforcement learning (RL), which enables an agent to acquire a control rule automatically. To show that MAS and RL are effective methods, we built the system in Java, which readily provides the distributed environment that is characteristic of agents. In this paper, we report simulation results for material circulation control of the CELSS using the MAS and RL.
