Search | Korea Science

Collective Navigation Through a Narrow Gap for a Swarm of UAVs Using Curriculum-Based Deep Reinforcement Learning (커리큘럼 기반 심층 강화학습을 이용한 좁은 틈을 통과하는 무인기 군집 내비게이션)

Myong-Yol Choi;Woojae Shin;Minwoo Kim;Hwi-Sung Park;Youngbin You;Min Lee;Hyondong Oh
- The Journal of Korea Robotics Society / v.19 no.1 / pp.117-129 / 2024
This paper introduces collective navigation through a narrow gap using a curriculum-based deep reinforcement learning algorithm for a swarm of unmanned aerial vehicles (UAVs). Collective navigation in complex environments is essential for various applications such as search and rescue, environmental monitoring, and military operations. Conventional methods, which are easily interpretable from an engineering perspective, divide the navigation task into mapping, planning, and control; however, they struggle with increased latency and unmodeled environmental factors. Recently, learning-based methods have addressed these problems by employing an end-to-end framework with neural networks. Nonetheless, most existing learning-based approaches face challenges in complex scenarios, particularly when navigating through a narrow gap or when a leader or informed UAV is unavailable. Our approach uses information from a certain number of nearest neighboring UAVs and incorporates a task-specific curriculum to reduce learning time and train a robust model. The effectiveness of the proposed algorithm is verified through an ablation study and quantitative metrics. Simulation results demonstrate that our approach outperforms existing methods.
https://doi.org/10.7746/jkros.2024.19.1.117

Application of Deep Recurrent Q Network with Dueling Architecture for Optimal Sepsis Treatment Policy

Do, Thanh-Cong;Yang, Hyung Jeong;Ho, Ngoc-Huynh
- Smart Media Journal / v.10 no.2 / pp.48-54 / 2021
Sepsis is one of the leading causes of mortality globally, and it costs billions of dollars annually. However, treating septic patients remains highly challenging, and more research is needed into a general treatment method for sepsis. Therefore, in this work, we propose a reinforcement learning method for learning optimal treatment strategies for septic patients. We use the patients' physiological time series data as the input to a deep recurrent Q-network that learns reliable treatment policies. We evaluate our model using an off-policy evaluation method, and the experimental results indicate that it outperforms the physicians' policy, reducing patient mortality by up to 3.04%. Thus, our model can be used as a tool to reduce patient mortality by supporting clinicians in making dynamic decisions.
https://doi.org/10.30693/SMJ.2021.10.2.48

Energy-Efficient DNN Processor on Embedded Systems for Spontaneous Human-Robot Interaction

Kim, Changhyeon;Yoo, Hoi-Jun
- Journal of Semiconductor Engineering / v.2 no.2 / pp.130-135 / 2021
Recently, deep neural networks (DNNs) have been actively used for action control so that an autonomous system, such as a robot, can perform human-like behaviors and operations. Unlike in recognition tasks, real-time operation is essential in action control, and remote learning on a server reached over a network is too slow. New learning techniques, such as reinforcement learning (RL), are needed to determine and select the correct robot behavior locally. In this paper, we propose an energy-efficient DNN processor with a LUT-based processing engine and a near-zero skipper. A CNN-based facial emotion recognition model and an RNN-based emotional dialogue generation model are integrated into a natural human-robot interaction (HRI) system and tested with the proposed processor. It supports variable weight bit precision from 1b to 16b, with 57.6% and 28.5% lower energy consumption than conventional MAC arithmetic units for 1b and 16b weight precision, respectively. In addition, the near-zero skipper reduces MAC operations by 36% and energy consumption by 28% for facial emotion recognition tasks. Implemented in a 65 nm CMOS process, the proposed processor occupies an area of 1784×1784 µm² and dissipates 0.28 mW and 34.4 mW for facial emotion recognition at 1 fps and 30 fps, respectively.
https://doi.org/10.22895/jse.2021.0001

Power Trading System through the Prediction of Demand and Supply in Distributed Power System Based on Deep Reinforcement Learning (심층강화학습 기반 분산형 전력 시스템에서의 수요와 공급 예측을 통한 전력 거래시스템)

Lee, Seongwoo;Seon, Joonho;Kim, Soo-Hyun;Kim, Jin-Young
- The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.163-171 / 2021
In this paper, we optimize an energy transaction system by applying a resource allocation algorithm and deep reinforcement learning in a distributed power system. The power demand and supply environment is predicted by deep reinforcement learning. We propose a system that pursues common interests in power trading and increases the efficiency of long-term power transactions amid the paradigm shift from conventional centralized power systems to distributed ones. For a realistic energy simulation model and environment, we construct the energy market by learning weather and monthly patterns with added Gaussian noise. The simulation results confirm that the proposed power trading systems cooperate with each other, seek common interests, and increase profits in long-term energy transactions.
https://doi.org/10.7236/JIIBC.2021.21.6.163

Performance Evaluation of Reinforcement Learning Algorithm for Control of Smart TMD (스마트 TMD 제어를 위한 강화학습 알고리즘 성능 검토)

Kang, Joo-Won;Kim, Hyun-Su
- Journal of Korean Association for Spatial Structures / v.21 no.2 / pp.41-48 / 2021
The smart tuned mass damper (TMD) is widely studied for seismic response reduction of various structures. The control algorithm is the most important factor in the control performance of a smart TMD. This study used Deep Deterministic Policy Gradient (DDPG), a reinforcement learning technique, to develop a control algorithm for a smart TMD. A magnetorheological (MR) damper was used to build the smart TMD. A single-mass model with the smart TMD was employed as the reinforcement learning environment. Time history analysis simulations of the example structure subjected to an artificial seismic load were performed during the reinforcement learning process. An actor (policy network) and a critic (value network) were constructed for the DDPG agent. The action of the DDPG agent was the command voltage sent to the MR damper. The reward for each DDPG action was calculated from the displacement and velocity responses of the main mass. The groundhook control algorithm was used for comparison. After 10,000 episodes of training the DDPG agent with proper hyper-parameters, a semi-active control algorithm for the seismic responses of the example structure with the smart TMD was obtained. The simulation results showed that the developed DDPG model can provide effective control algorithms for a smart TMD to reduce seismic responses.
https://doi.org/10.9712/KASS.2021.21.2.41

Motion Generation of a Single Rigid Body Character Using Deep Reinforcement Learning (심층 강화 학습을 활용한 단일 강체 캐릭터의 모션 생성)

Ahn, Jewon;Gu, Taehong;Kwon, Taesoo
- Journal of the Korea Computer Graphics Society / v.27 no.3 / pp.13-23 / 2021
In this paper, we propose a framework that generates the trajectory of a single rigid body based on its center-of-mass (COM) configuration and contact pose. Because the input dimension is smaller than that of a full-body state, the learning time for reinforcement learning is reduced. Even with a 68% reduction in learning time (approximately two hours), the character trained by our network is more robust to external perturbations, tolerating an external force of 1500 N, which is about 7.5 times larger than the maximum magnitude reported for a previous approach. In this framework, we use centroidal dynamics to calculate the next configuration of the COM, and use reinforcement learning to obtain a policy that outputs the parameters for controlling the contact positions and forces.
https://doi.org/10.15701/kcgs.2021.27.3.13

Analysis of trends in deep learning and reinforcement learning

Dong-In Choi;Chungsoo Lim
- Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.55-65 / 2023
In this paper, we apply topic extraction driven by the KeyBERT (Keyword extraction with Bidirectional Encoder Representations from Transformers) algorithm, together with topic frequency analysis, to deep learning and reinforcement learning research in order to discover rapidly changing trends in these fields. First, we crawled abstracts of research papers on deep learning and reinforcement learning and divided them temporally into two groups. After pre-processing the crawled data, we extracted topics using the KeyBERT algorithm and then analyzed the extracted topics in terms of occurrence frequency. This analysis reveals distinct trends in the research on all analyzed algorithms and applications, and shows clearly which topics are gaining interest. The analysis also demonstrates the effectiveness of the topic extraction and topic frequency analysis for research trend analysis, and this scheme is expected to be applicable to other research fields. In addition, the analysis can provide insight into how deep learning will evolve in the near future, and can guide the selection of research topics and methodologies by informing researchers of those that have recently attracted attention.
https://doi.org/10.9708/jksci.2023.28.10.055

A Routing Algorithm based on Deep Reinforcement Learning in SDN (SDN에서 심층강화학습 기반 라우팅 알고리즘)

Lee, Sung-Keun
- The Journal of the Korea institute of electronic communication sciences / v.16 no.6 / pp.1153-1160 / 2021
This paper proposes a routing algorithm that determines the optimal path using deep reinforcement learning in software-defined networks (SDN). The deep reinforcement learning model is based on DQN: the inputs are the current network state and the source and destination nodes, and the output is a list of routes from source to destination. The routing task is defined as a discrete control problem, and the quality-of-service parameters considered for routing are delay, bandwidth, and loss rate. The routing agent classifies the appropriate service class according to the user's quality-of-service profile, and converts the current network state collected from the SDN into the service class that can be provided on each link. Based on this converted information, it learns to select a route that satisfies the required service level from the source to the destination. The simulation results indicate that, after a certain number of training episodes, the correct path is selected and learning is performed successfully.
https://doi.org/10.13067/JKIECS.2021.16.6.1153

Research Trends on Deep Reinforcement Learning (심층 강화학습 기술 동향)

Jang, S.Y.;Yoon, H.J.;Park, N.S.;Yun, J.K.;Son, Y.S.
- Electronics and Telecommunications Trends / v.34 no.4 / pp.1-14 / 2019
Recent trends in deep reinforcement learning (DRL) reveal considerable improvements to DRL algorithms in terms of performance, learning stability, and computational efficiency. DRL has also vastly expanded the range of scenarios it covers (e.g., partial observability; cooperation, competition, coexistence, and communication among multiple agents; multi-task learning; decentralized intelligence). These features have fostered multi-agent reinforcement learning research. DRL is also expanding its applications from robotics, natural language processing, and computer vision into a wide array of fields such as finance, healthcare, chemistry, and even art. In this report, we briefly summarize various DRL techniques and research directions.
https://doi.org/10.22648/ETRI.2019.J.340401

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm (스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계)

Kim, Hyun-Su;Yoon, Ki-Yong
- Journal of Korean Association for Spatial Structures / v.22 no.2 / pp.39-46 / 2022
Recently, machine learning has been widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to the development of a control algorithm for a smart control device that reduces seismic responses. For this purpose, the Deep Q-network (DQN), a reinforcement learning algorithm, was employed to develop the control algorithm. A single-degree-of-freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as the example structure. The smart TMD system used a magnetorheological (MR) damper instead of a passive damper. The reward design of reinforcement learning strongly affects the control performance of the smart TMD. Various hyper-parameters were investigated to optimize the control performance of the DQN-based control algorithm. Usually, decreasing the time step of a numerical simulation is desirable because it increases the accuracy of the results. However, the numerical simulation results showed that decreasing the time step for reward calculation might degrade the control performance of the DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in the DQN training process.
https://doi.org/10.9712/KASS.2022.22.2.39

Search Result 208, Processing Time 0.025 seconds
