• Title/Summary/Keyword: reinforcement learning agent

Search Results: 132

Flight Trajectory Simulation via Reinforcement Learning in Virtual Environment (가상 환경에서의 강화학습을 이용한 비행궤적 시뮬레이션)

  • Lee, Jae-Hoon;Kim, Tae-Rim;Song, Jong-Gyu;Im, Hyun-Jae
    • Journal of the Korea Society for Simulation / v.27 no.4 / pp.1-8 / 2018
  • The most common way to steer toward a target point using artificial intelligence is reinforcement learning. However, reinforcement learning has required complicated calculations that are difficult to implement. In this paper, we use the enhanced Proximal Policy Optimization (PPO) algorithm to simulate finding a planned flight trajectory that reaches the target point in a virtual environment. In addition, variables such as changes in trajectory, effects of rewards, and external winds are added to examine how external environmental factors affect flight trajectory learning, and their effects on trajectory learning performance and learning speed are compared. The simulation results show that the agent can find the optimal trajectory despite changes in the various external environments, which suggests that the approach will be applicable to actual vehicles.
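For context, standard PPO maximizes a clipped surrogate objective. The abstract does not describe the "enhanced" variant, so the following is only a minimal Python sketch of the standard clipped objective. It assumes per-sample probability ratios and advantage estimates are already available; the function name, the default clipping value, and the sample numbers are hypothetical.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective of standard PPO.

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    epsilon:    clipping range (0.2 is an assumed, commonly used value)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    terms = []
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * advantage, clipped_ratio * advantage))
    return sum(terms) / len(terms)


# Hypothetical sample: one action became more likely, one less likely.
example_value = ppo_clip_objective([1.5, 0.5], [1.0, -1.0])
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is the usual reason given for PPO being simpler to implement than earlier trust-region methods.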

A Routing Algorithm based on Deep Reinforcement Learning in SDN (SDN에서 심층강화학습 기반 라우팅 알고리즘)

  • Lee, Sung-Keun
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.6 / pp.1153-1160 / 2021
  • This paper proposes a routing algorithm that determines the optimal path using deep reinforcement learning in software-defined networks (SDN). The deep reinforcement learning model is based on DQN: its inputs are the current network state and the source and destination nodes, and its output is a list of routes from source to destination. The routing task is defined as a discrete control problem, and the quality of service (QoS) parameters considered for routing are delay, bandwidth, and loss rate. The routing agent determines the appropriate service class from the user's QoS profile, and derives the service class that each link can provide from the current network state collected from the SDN. Based on this converted information, it learns to select a route that satisfies the required service level from the source to the destination. The simulation results indicate that, after a certain number of episodes, the proposed algorithm selects the correct path and learning succeeds.

Learning Multi-Character Competition in Markov Games (마르코프 게임 학습에 기초한 다수 캐릭터의 경쟁적 상호작용 애니메이션 합성)

  • Lee, Kang-Hoon
    • Journal of the Korea Computer Graphics Society / v.15 no.2 / pp.9-17 / 2009
  • Animating multiple characters to compete with each other is an important problem in computer games and animation films. However, it remains difficult to simulate strategic competition among characters because of its inherently complex decision process, which must cope with the often unpredictable behavior of opponents. We apply a reinforcement learning method in Markov games to action models built from captured motion data. This enables two characters to perform globally optimal counter-strategies with respect to each other. We also extend this method to simulate competition between two teams, each of which can consist of an arbitrary number of characters. We demonstrate the usefulness of our approach through various competitive scenarios, including playing-tag, keeping-distance, and shooting.


Fast Navigation in Dynamic 3D Game Environment Using Reinforcement Learning (강화 학습을 사용한 동적 게임 환경에서의 빠른 경로 탐색)

  • Yi, Seung-Joon;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.703-705 / 2005
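  • Path finding in a continuous, dynamic real world has been one of the main problems in mobile robotics. As computer performance has improved greatly in recent years, computer games have begun to use continuous three-dimensional environment models that approach reality, and the ability to find paths under more complex and dynamic environment models is required accordingly. Value iteration, a reinforcement-learning-based path-finding algorithm, has several advantages that make it suitable for real-time multi-agent environments, but it slows down considerably as the problem grows. In this paper, we propose an environment model and a path-finding algorithm based on the small-world network model so that the agent can adapt quickly to dynamic changes in continuous three-dimensional settings. Experiments in a 3D game environment confirm that the proposed algorithm finds good paths in continuous, complex real-time environments and adapts quickly when changes in the environment are observed.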

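As a rough illustration of the value iteration idea mentioned in the abstract above, the following minimal Python sketch runs value iteration on a small hand-made directed graph and then follows the greedy policy to the goal. The graph, edge costs, discount factor, and function names are assumptions made for illustration; the sketch does not reproduce the small-world network model proposed in the paper.

```python
def value_iteration(graph, goal, gamma=0.9, tol=1e-9):
    """Plain value iteration for path finding on a directed graph.

    graph: dict mapping node -> {neighbor: step_cost}
    goal:  terminal node whose value is fixed at 0
    Returns (values, policy), where policy maps each node to its best neighbor.
    """
    values = {node: 0.0 for node in graph}

    def backup(node, neighbor):
        # Reward is the negative step cost plus the discounted neighbor value.
        return -graph[node][neighbor] + gamma * values[neighbor]

    while True:
        delta = 0.0
        for node in graph:
            if node == goal or not graph[node]:
                continue  # terminal or dead-end nodes keep their value
            best = max(backup(node, nxt) for nxt in graph[node])
            delta = max(delta, abs(best - values[node]))
            values[node] = best
        if delta < tol:
            break

    policy = {
        node: max(graph[node], key=lambda nxt: backup(node, nxt))
        for node in graph
        if node != goal and graph[node]
    }
    return values, policy


def greedy_path(policy, start, goal):
    """Follow the greedy policy from start to goal (assumes goal is reachable)."""
    path = [start]
    while path[-1] != goal:
        path.append(policy[path[-1]])
    return path


# Hypothetical four-node environment with step costs on the edges.
EXAMPLE_GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "D": 5},
    "C": {"D": 1},
    "D": {},
}

values, policy = value_iteration(EXAMPLE_GRAPH, goal="D")
path = greedy_path(policy, "A", "D")
```

Every sweep visits all nodes, so the cost per sweep grows with the size of the environment model. This is the slowdown that the abstract names as the drawback of value iteration on large problems.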

Prediction Technique of Energy Consumption based on Reinforcement Learning in Microgrids (마이크로그리드에서 강화학습 기반 에너지 사용량 예측 기법)

  • Sun, Young-Ghyu;Lee, Jiyoung;Kim, Soo-Hyun;Kim, Soohwan;Lee, Heung-Jae;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.3 / pp.175-181 / 2021
  • This paper analyzes an artificial intelligence-based approach to short-term energy consumption prediction. We employ reinforcement learning algorithms to overcome the limitations of the supervised learning algorithms that are commonly used for short-term energy consumption prediction. Supervised learning-based approaches have high complexity because they require contextual information as well as energy consumption data to achieve sufficient performance. To reduce the complexity of the data and learning models, we propose a multi-agent deep reinforcement learning algorithm that predicts energy consumption using only energy consumption data. The proposed scheme is simulated using public energy consumption data to confirm its performance. It predicts values close to the actual values except for outlier data.

Exploring the Effectiveness of GAN-based Approach and Reinforcement Learning in Character Boxing Task (캐릭터 복싱 과제에서 GAN 기반 접근법과 강화학습의 효과성 탐구)

  • Seoyoung Son;Taesoo Kwon
    • Journal of the Korea Computer Graphics Society / v.29 no.4 / pp.7-16 / 2023
  • For decades, creating a desired locomotive motion in a goal-oriented manner has been a challenge in character animation. Data-driven methods using generative models have demonstrated efficient ways of predicting long sequences of motions without the need for explicit conditioning. While these methods produce high-quality long-term motions, they can be limited when it comes to synthesizing motion for challenging novel scenarios, such as punching a random target. A state-of-the-art solution to this limitation is to use a GAN discriminator to imitate motion data clips and to incorporate reinforcement learning to compose goal-oriented motions. In this paper, we aim to create characters performing combat sports such as boxing, using a novel reward design in conjunction with existing GAN-based approaches. We experimentally demonstrate that both the Adversarial Motion Prior [3] and Adversarial Skill Embeddings [4] methods are capable of generating viable motions for a character punching a random target, even in the absence of mocap data that specifically captures the transition between punching and locomotion. Also, with a single learned policy, multiple task controllers can be constructed through the TimeChamber framework.

Comparative Analysis of Battery Optimization in Grid Considering Consumption Patterns (소비 패턴을 고려한 그리드 환경에서의 배터리 최적화 비교 분석)

  • Hajin Noh;Yujin Lim
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.549-552 / 2023
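  • In today's power grids, using an energy storage system (ESS) is attracting considerable attention as one way to address irregular or wasted power. This study proposes a DQN-based reinforcement learning technique that uses the battery more economically while maintaining spare capacity, in a distribution network system that charges different rates by time of day for each business type. We also analyze and compare performance across the different power consumption patterns of each business type, together with the behavior of the agent.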

Research Trends of Multi-agent Collaboration Technology for Artificial Intelligence Bots (AI Bots를 위한 멀티에이전트 협업 기술 동향)

  • D., Kang;J.Y., Jung;C.H., Lee;M., Park;J.W., Lee;Y.J., Lee
    • Electronics and Telecommunications Trends / v.37 no.6 / pp.32-42 / 2022
  • Recently, decentralized approaches to artificial intelligence (AI) development, such as federated learning, have been drawing attention as the cost and time inefficiency of AI development increase due to explosive data growth and rapid environmental changes. Collaborative AI technology is an alternative to centralized AI: it dynamically organizes collaborative groups of different agents to share data, knowledge, and experience, and uses distributed resources to derive enhanced knowledge and analysis models through collaborative learning to solve given problems. This article investigates and analyzes recent technologies and applications relevant to research on multi-agent collaboration of AI bots, which can provide collaborative AI functionality autonomously.

A Study on the Development of Adversarial Simulator for Network Vulnerability Analysis Based on Reinforcement Learning (강화학습 기반 네트워크 취약점 분석을 위한 적대적 시뮬레이터 개발 연구)

  • Jeongyoon Kim;Jongyoul Park;Sang Ho Oh
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.1 / pp.21-29 / 2024
  • With the development of ICT and networks, security management of IT infrastructure, which has grown in size, is becoming very difficult. Many companies and public institutions are having difficulty managing system and network security. In addition, as the complexity of hardware and software grows, it is becoming almost impossible for a person to manage all aspects of security. Therefore, AI is essential for network security management. However, since it is very dangerous to operate an attack model in a real network environment, cybersecurity emulation research was conducted through reinforcement learning by implementing an environment that models a real network. To this end, this study applied reinforcement learning to the network environment, and as learning progressed, the agent accurately identified the vulnerabilities of the network. When a network vulnerability is detected through AI, an automated, customized response becomes possible.

Collision Avoidance Path Control of Multi-AGV Using Multi-Agent Reinforcement Learning (다중 에이전트 강화학습을 이용한 다중 AGV의 충돌 회피 경로 제어)

  • Choi, Ho-Bin;Kim, Ju-Bong;Han, Youn-Hee;Oh, Se-Won;Kim, Kwi-Hoon
    • KIPS Transactions on Computer and Communication Systems / v.11 no.9 / pp.281-288 / 2022
  • Automated guided vehicles (AGVs) are often used in industrial applications to transport heavy materials around large industrial buildings, such as factories or warehouses. Their usefulness for automation is greatest in fulfillment centers. To increase productivity in warehouses such as fulfillment centers, sophisticated path planning of AGVs is required. We propose a scheme that can be applied to QMIX, a popular cooperative multi-agent reinforcement learning (MARL) algorithm. Performance was measured with three metrics in several fulfillment center layouts, and the results are compared with those of the existing QMIX. Additionally, we visualize the transport paths of trained AGVs as heat maps to analyze their behavior patterns visually.