• Title/Summary/Keyword: Unity ML-Agents

Implementation of Target Object Tracking Method using Unity ML-Agent Toolkit (Unity ML-Agents Toolkit을 활용한 대상 객체 추적 머신러닝 구현)

  • Han, Seok Ho;Lee, Yong-Hwan
    • Journal of the Semiconductor & Display Technology / v.21 no.3 / pp.110-113 / 2022
  • Non-player characters (NPCs) play an important role in improving player immersion and interest, and implementing NPCs with reinforcement learning has recently been in the spotlight. In this paper, we study an AI target tracking method based on reinforcement learning and use the Unity ML-Agents Toolkit to implement an agent that tracks a specific target object while avoiding traps. The implementation is built in the Unity game engine and evaluated through a number of simulation experiments. The experimental results show that the agent tracks the target and avoids traps well.

Design and Implementation of Reinforcement Learning Environment Using Unity 3D-based ML-Agents Toolkit (Unity 3D 기반 ML-Agents Toolkit을 이용한 강화 학습 환경 설계 및 구현)

  • Choi, Ho-Bin;Kim, Chan-Myung;Kim, Ju-Bong;Han, Youn-Hee
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.548-551 / 2019
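  • Reinforcement learning is a form of learning for sequential decision-making that is generally associated with robot control. It aims to learn a policy that maximizes the reward received for actions. However, applying reinforcement learning in the real world involves many constraints, and learning a good policy in a complex real-world environment is very difficult. Unity provides a dedicated toolkit for reinforcement learning simulation, which makes Unity a sound basis for training good policies in a simulator. Accordingly, before applying reinforcement learning directly to the real world, this paper uses the Unity Machine Learning Agents Toolkit to build an environment similar to the real world and trains an agent in it in advance, thereby highlighting the need for a simulator.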

Implementation of Intelligent Agent Based on Reinforcement Learning Using Unity ML-Agents (유니티 ML-Agents를 이용한 강화 학습 기반의 지능형 에이전트 구현)

  • Young-Ho Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.205-211 / 2024
  • The purpose of this study is to implement an agent that intelligently performs tracking and movement through reinforcement learning using Unity and ML-Agents. We conducted an experiment comparing the learning performance of training one agent in a single simulation environment with that of training several agents in parallel in a multi-environment simulation. The experimental results confirmed that parallel training is about 4.9 times faster than single-agent training in terms of learning speed, and that it also learns more stably and effectively.

Design and Implementation of Reinforcement Learning Agent Using PPO Algorithim for Match 3 Gameplay (매치 3 게임 플레이를 위한 PPO 알고리즘을 이용한 강화학습 에이전트의 설계 및 구현)

  • Park, Dae-Geun;Lee, Wan-Bok
    • Journal of Convergence for Information Technology / v.11 no.3 / pp.1-6 / 2021
  • Most match-3 puzzle games support automatic play using the MCTS algorithm. However, implementing reinforcement learning agents is not easy, because it requires both machine learning knowledge and an understanding of complex interactions within the development environment. This study proposes a method for easily designing reinforcement learning agents and implementing gameplay agents by applying the PPO (Proximal Policy Optimization) algorithm. Performance improved by about 44% over the conventional method. The tools used are the Unity 3D game engine and the Unity ML SDK. The experimental results show that the agents came to learn the game rules and made better strategic decisions as the experiments went on. On average, the puzzle gameplay agents implemented in this study played better than ordinary human players. The designed agent is expected to be useful for speeding up the game level design process.
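
As background for the PPO-based entries in this list, the following is a minimal sketch of PPO's standard clipped surrogate objective in plain Python. It is not taken from any of the listed papers; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are illustrative assumptions.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (0.2 is a common choice, assumed here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

Clipping removes the incentive to move the new policy far from the old one in a single update, which is generally credited for PPO's stable training.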

Design of track path-finding simulation using Unity ML Agents

  • In-Chul Han;Jin-Woong Kim;Soo Kyun Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.61-66 / 2024
  • This paper designs a simulation for path-finding by objects in a simulation or game environment using reinforcement learning. Its main feature is that the objects are trained to avoid obstacles generated at random locations on a given track and to automatically explore a path to collect items. The simulation was implemented with ML-Agents, provided by the Unity game engine, and a learning policy based on PPO (Proximal Policy Optimization) was used to form the reinforcement learning environment. By analyzing the simulation results and the learning curves, we confirmed that, as training progresses, the object moves along the track while avoiding obstacles and exploring paths to acquire items.

Designing a Reinforcement Learning-Based 3D Object Reconstruction Data Acquisition Simulation (강화학습 기반 3D 객체복원 데이터 획득 시뮬레이션 설계)

  • Young-Hoon Jin
    • Journal of Internet of Things and Convergence / v.9 no.6 / pp.11-16 / 2023
  • The technology of 3D reconstruction, primarily relying on point cloud data, is essential for digitizing objects or spaces. This paper aims to utilize reinforcement learning to achieve the acquisition of point clouds in a given environment. To accomplish this, a simulation environment is constructed using Unity, and reinforcement learning is implemented using the Unity package known as ML-Agents. The process of point cloud acquisition involves initially setting a goal and calculating a traversable path around the goal. The traversal path is segmented at regular intervals, with rewards assigned at each step. To prevent the agent from deviating from the path, rewards are increased. Additionally, rewards are granted each time the agent fixates on the goal during traversal, facilitating the learning of optimal points for point cloud acquisition at each traversal step. Experimental results demonstrate that despite the variability in traversal paths, the approach enables the acquisition of relatively accurate point clouds.
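
The reward scheme described in this abstract (a path around the goal split at regular intervals, a reward per step, and an extra reward when the agent looks at the goal) could be sketched as follows. This is a hedged illustration, not the paper's implementation: the circular path, the 2D coordinates, the function names `circular_waypoints` and `step_reward`, and every numeric value are assumptions.

```python
import math


def circular_waypoints(goal, radius, count):
    """Split a circle around the goal into `count` evenly spaced waypoints.

    A circular path is an assumption; the abstract only says a traversable
    path around the goal is segmented at regular intervals.
    """
    gx, gy = goal
    return [
        (gx + radius * math.cos(2 * math.pi * i / count),
         gy + radius * math.sin(2 * math.pi * i / count))
        for i in range(count)
    ]


def step_reward(agent_pos, facing, waypoint, goal,
                reach_tol=0.5, gaze_cos=0.95,
                waypoint_reward=1.0, gaze_bonus=0.5):
    """Reward for one step; all thresholds and magnitudes are hypothetical."""
    reward = 0.0

    # Reward for staying on the path: the agent is at the current waypoint.
    if math.dist(agent_pos, waypoint) <= reach_tol:
        reward += waypoint_reward

    # Bonus when the agent's facing direction points at the goal.
    to_goal = (goal[0] - agent_pos[0], goal[1] - agent_pos[1])
    goal_norm = math.hypot(*to_goal)
    facing_norm = math.hypot(*facing)
    if goal_norm > 0 and facing_norm > 0:
        cos_angle = (to_goal[0] * facing[0] + to_goal[1] * facing[1]) / (
            goal_norm * facing_norm)
        if cos_angle >= gaze_cos:
            reward += gaze_bonus
    return reward
```

For example, an agent standing on the first waypoint and facing the goal would collect both the waypoint reward and the gaze bonus in the same step.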

Applying Model to Real World through Robot Reinforcement Learning in Unity3D (Unity3D 가상 환경에서 강화학습으로 만들어진 모델의 효율적인 실세계 적용)

  • Lim, En-A;Kim, Na-Young;Lee, Jong-lark;Weon, Ill-yong
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.800-803 / 2020
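  • Applying reinforcement learning to a real-world robot requires simulation in a virtual environment. However, the platforms used to build virtual environments all differ, and performance varies widely depending on how the learning algorithm is implemented. Moreover, when the target is a real-world smart robot with low hardware specifications, applying a computationally heavy learning algorithm is not easy. To address these problems, this study uses the ML-Agents module, the reinforcement learning framework provided for Unity3D, to apply reinforcement learning of an obstacle-avoiding and exploring model to a low-spec smart robot in a real environment. The key considerations in this study are the similarity between the virtual and real environments and the handling of a certain amount of noise. We confirmed that simple robot behaviors can be learned and applied without difficulty.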
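
The abstract does not say how noise is handled. One common approach when moving a simulated policy to real hardware is to perturb the simulated sensor readings during training; the sketch below assumes that approach, and the function name `noisy_observation`, the Gaussian noise model, and the sensor range are all hypothetical.

```python
import random


def noisy_observation(readings, sigma, low=0.0, high=1.0, rng=None):
    """Add zero-mean Gaussian noise to simulated sensor readings.

    readings:  clean sensor values from the simulator
    sigma:     noise standard deviation (assumed, tuned per sensor)
    low, high: valid sensor range; noisy values are clipped into it
    rng:       optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    noisy = []
    for value in readings:
        perturbed = value + rng.gauss(0.0, sigma)
        # Clip so the policy never sees values the real sensor cannot produce.
        noisy.append(max(low, min(high, perturbed)))
    return noisy
```

Passing a seeded `random.Random` keeps training runs reproducible while still exposing the policy to imperfect readings.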

Research on Optimal Deployment of Sonobuoy for Autonomous Aerial Vehicles Using Virtual Environment and DDPG Algorithm (가상환경과 DDPG 알고리즘을 이용한 자율 비행체의 소노부이 최적 배치 연구)

  • Kim, Jong-In;Han, Min-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.15 no.2 / pp.152-163 / 2022
  • In this paper, we present a method that enables an unmanned aerial vehicle to drop sonobuoys, an essential element of anti-submarine warfare, in an optimal deployment. To this end, an environment simulating the distribution of sound detection performance was built in the Unity game engine, and this custom environment, configured with Unity ML-Agents, communicated with an external reinforcement learning algorithm written in Python during training. In particular, reinforcement learning is introduced to prevent wrong actions from accumulating and affecting learning, and to secure the maximum detection area for the sonobuoys while the vehicle flies to the target point in the shortest time. The optimal placement of the sonobuoys was achieved by applying the Deep Deterministic Policy Gradient (DDPG) algorithm. As a result of the learning, the agent flew through the sea area and passed only the points needed to achieve the optimal placement among the 70 target candidates. This means that an autonomous aerial vehicle that deploys sonobuoys in the shortest time and with the maximum detection area, the requirements for optimal placement, has been implemented.
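
One characteristic step of DDPG is the soft (Polyak) update that lets the target networks slowly track the online networks. The sketch below shows that step on a flat list of parameters; it is a generic illustration rather than the authors' code, and the name `soft_update` and the default `tau=0.005` are assumptions.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Return target parameters moved a fraction `tau` toward the online ones.

    new_target = (1 - tau) * target + tau * online, applied element-wise.
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    return [
        (1.0 - tau) * target + tau * online
        for target, online in zip(target_params, online_params)
    ]
```

A small `tau` keeps the targets used in the critic's loss nearly fixed between updates, which is the usual reason given for the stability of this scheme.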

A Study about the Usefulness of Reinforcement Learning in Business Simulation Games using PPO Algorithm (경영 시뮬레이션 게임에서 PPO 알고리즘을 적용한 강화학습의 유용성에 관한 연구)

  • Liang, Yi-Hong;Kang, Sin-Jin;Cho, Sung Hyun
    • Journal of Korea Game Society / v.19 no.6 / pp.61-70 / 2019
  • In this paper, we apply reinforcement learning to a management simulation game to check whether game agents can autonomously achieve a given goal. In this system, we apply the PPO (Proximal Policy Optimization) algorithm in the Unity Machine Learning (ML) Agent environment, and the game agent is designed to find a way to play automatically. Five game scenario simulation experiments were conducted to verify the usefulness of the approach. The results confirmed that the game agent achieves the goal through learning despite changes to the environment variables in the game.

Proximal Policy Optimization Reinforcement Learning based Optimal Path Planning Study of Surion Agent against Enemy Air Defense Threats (근접 정책 최적화 기반의 적 대공 방어 위협하 수리온 에이전트의 최적 기동경로 도출 연구)

  • Jae-Hwan Kim;Jong-Hwan Kim
    • Journal of the Korea Society for Simulation / v.33 no.2 / pp.37-44 / 2024
  • The Korean Helicopter Development Program has successfully introduced the Surion helicopter, a versatile multi-domain operational aircraft that replaces the aging UH-1 and 500MD helicopters. Specifically designed for maneuverability, the Surion plays a crucial role in low-altitude tactical maneuvers for personnel transportation and specific missions, emphasizing the helicopter's survivability. Despite the significance of its low-altitude tactical maneuver capability, there is a notable gap in research focusing on multi-mission tactical maneuvers that consider the risk factors associated with deploying the Surion in the presence of enemy air defenses. This study addresses this gap by exploring a method to enhance the Surion's low-altitude maneuvering paths, incorporating information about enemy air defenses. Leveraging the Proximal Policy Optimization (PPO) algorithm, a reinforcement learning-based approach, the research aims to optimize the helicopter's path planning. Visualized experiments were conducted using a Surion model implemented in the Unity environment and ML-Agents library. The proposed method resulted in a rapid and stable policy convergence for generating optimal maneuvering paths for the Surion. The experiments, based on two key criteria, "operation time" and "minimum damage," revealed distinct optimal paths. This divergence suggests the potential for effective tactical maneuvers in low-altitude situations, considering the risk factors associated with enemy air defenses. Importantly, the Surion's capability for remote control in all directions enhances its adaptability in complex operational environments.