• Title/Summary/Keyword: Deep Reinforcement Learning

Search Result 205, Processing Time 0.027 seconds

Application of Deep Recurrent Q Network with Dueling Architecture for Optimal Sepsis Treatment Policy

  • Do, Thanh-Cong;Yang, Hyung Jeong;Ho, Ngoc-Huynh
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.48-54
    • /
    • 2021
  • Sepsis is one of the leading causes of mortality globally, and it costs billions of dollars annually. However, treating septic patients is currently highly challenging, and more research is needed into a general treatment method for sepsis. Therefore, in this work, we propose a reinforcement learning method for learning the optimal treatment strategies for septic patients. We model the patient physiological time series data as the input for a deep recurrent Q-network that learns reliable treatment policies. We evaluate our model using an off-policy evaluation method, and the experimental results indicate that it outperforms the physicians' policy, reducing patient mortality up to 3.04%. Thus, our model can be used as a tool to reduce patient mortality by supporting clinicians in making dynamic decisions.

Energy-Efficient DNN Processor on Embedded Systems for Spontaneous Human-Robot Interaction

  • Kim, Changhyeon;Yoo, Hoi-Jun
    • Journal of Semiconductor Engineering
    • /
    • v.2 no.2
    • /
    • pp.130-135
    • /
    • 2021
  • Recently, deep neural networks (DNNs) are actively used for action control so that an autonomous system, such as the robot, can perform human-like behaviors and operations. Unlike recognition tasks, the real-time operation is essential in action control, and it is too slow to use remote learning on a server communicating through a network. New learning techniques, such as reinforcement learning (RL), are needed to determine and select the correct robot behavior locally. In this paper, we propose an energy-efficient DNN processor with a LUT-based processing engine and near-zero skipper. A CNN-based facial emotion recognition and an RNN-based emotional dialogue generation model is integrated for natural HRI system and tested with the proposed processor. It supports 1b to 16b variable weight bit precision with and 57.6% and 28.5% lower energy consumption than conventional MAC arithmetic units for 1b and 16b weight precision. Also, the near-zero skipper reduces 36% of MAC operation and consumes 28% lower energy consumption for facial emotion recognition tasks. Implemented in 65nm CMOS process, the proposed processor occupies 1784×1784 um2 areas and dissipates 0.28 mW and 34.4 mW at 1fps and 30fps facial emotion recognition tasks.

Power Trading System through the Prediction of Demand and Supply in Distributed Power System Based on Deep Reinforcement Learning (심층강화학습 기반 분산형 전력 시스템에서의 수요와 공급 예측을 통한 전력 거래시스템)

  • Lee, Seongwoo;Seon, Joonho;Kim, Soo-Hyun;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.6
    • /
    • pp.163-171
    • /
    • 2021
  • In this paper, the energy transaction system was optimized by applying a resource allocation algorithm and deep reinforcement learning in the distributed power system. The power demand and supply environment were predicted by deep reinforcement learning. We propose a system that pursues common interests in power trading and increases the efficiency of long-term power transactions in the paradigm shift from conventional centralized to distributed power systems in the power trading system. For a realistic energy simulation model and environment, we construct the energy market by learning weather and monthly patterns adding Gaussian noise. In simulation results, we confirm that the proposed power trading systems are cooperative with each other, seek common interests, and increase profits in the prolonged energy transaction.

Performance Evaluation of Reinforcement Learning Algorithm for Control of Smart TMD (스마트 TMD 제어를 위한 강화학습 알고리즘 성능 검토)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.21 no.2
    • /
    • pp.41-48
    • /
    • 2021
  • A smart tuned mass damper (TMD) is widely studied for seismic response reduction of various structures. Control algorithm is the most important factor for control performance of a smart TMD. This study used a Deep Deterministic Policy Gradient (DDPG) among reinforcement learning techniques to develop a control algorithm for a smart TMD. A magnetorheological (MR) damper was used to make the smart TMD. A single mass model with the smart TMD was employed to make a reinforcement learning environment. Time history analysis simulations of the example structure subject to artificial seismic load were performed in the reinforcement learning process. Critic of policy network and actor of value network for DDPG agent were constructed. The action of DDPG agent was selected as the command voltage sent to the MR damper. Reward for the DDPG action was calculated by using displacement and velocity responses of the main mass. Groundhook control algorithm was used as a comparative control algorithm. After 10,000 episode training of the DDPG agent model with proper hyper-parameters, the semi-active control algorithm for control of seismic responses of the example structure with the smart TMD was developed. The simulation results presented that the developed DDPG model can provide effective control algorithms for smart TMD for reduction of seismic responses.

Motion Generation of a Single Rigid Body Character Using Deep Reinforcement Learning (심층 강화 학습을 활용한 단일 강체 캐릭터의 모션 생성)

  • Ahn, Jewon;Gu, Taehong;Kwon, Taesoo
    • Journal of the Korea Computer Graphics Society
    • /
    • v.27 no.3
    • /
    • pp.13-23
    • /
    • 2021
  • In this paper, we proposed a framework that generates the trajectory of a single rigid body based on its COM configuration and contact pose. Because we use a smaller input dimension than when we use a full body state, we can improve the learning time for reinforcement learning. Even with a 68% reduction in learning time (approximately two hours), the character trained by our network is more robust to external perturbations tolerating an external force of 1500 N which is about 7.5 times larger than the maximum magnitude from a previous approach. For this framework, we use centroidal dynamics to calculate the next configuration of the COM, and use reinforcement learning for obtaining a policy that gives us parameters for controlling the contact positions and forces.

Analysis of trends in deep learning and reinforcement learning

  • Dong-In Choi;Chungsoo Lim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.10
    • /
    • pp.55-65
    • /
    • 2023
  • In this paper, we apply KeyBERT(Keyword extraction with Bidirectional Encoder Representations of Transformers) algorithm-driven topic extraction and topic frequency analysis to deep learning and reinforcement learning research to discover the rapidly changing trends in them. First, we crawled abstracts of research papers on deep learning and reinforcement learning, and temporally divided them into two groups. After pre-processing the crawled data, we extracted topics using KeyBERT algorithm, and then analyzed the extracted topics in terms of topic occurrence frequency. This analysis reveals that there are distinct trends in research work of all analyzed algorithms and applications, and we can clearly tell which topics are gaining more interest. The analysis also proves the effectiveness of the utilized topic extraction and topic frequency analysis in research trend analysis, and this trend analysis scheme is expected to be used for research trend analysis in other research fields. In addition, the analysis can provide insight into how deep learning will evolve in the near future, and provide guidance for select research topics and methodologies by informing researchers of research topics and methodologies which are recently attracting attention.

A Routing Algorithm based on Deep Reinforcement Learning in SDN (SDN에서 심층강화학습 기반 라우팅 알고리즘)

  • Lee, Sung-Keun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1153-1160
    • /
    • 2021
  • This paper proposes a routing algorithm that determines the optimal path using deep reinforcement learning in software-defined networks. The deep reinforcement learning model for learning is based on DQN, the inputs are the current network state, source, and destination nodes, and the output returns a list of routes from source to destination. The routing task is defined as a discrete control problem, and the quality of service parameters for routing consider delay, bandwidth, and loss rate. The routing agent classifies the appropriate service class according to the user's quality of service profile, and converts the service class that can be provided for each link from the current network state collected from the SDN. Based on this converted information, it learns to select a route that satisfies the required service level from the source to the destination. The simulation results indicated that if the proposed algorithm proceeds with a certain episode, the correct path is selected and the learning is successfully performed.

Research Trends on Deep Reinforcement Learning (심층 강화학습 기술 동향)

  • Jang, S.Y.;Yoon, H.J.;Park, N.S.;Yun, J.K.;Son, Y.S.
    • Electronics and Telecommunications Trends
    • /
    • v.34 no.4
    • /
    • pp.1-14
    • /
    • 2019
  • Recent trends in deep reinforcement learning (DRL) have revealed the considerable improvements to DRL algorithms in terms of performance, learning stability, and computational efficiency. DRL also enables the scenarios that it covers (e.g., partial observability; cooperation, competition, coexistence, and communications among multiple agents; multi-task; decentralized intelligence) to be vastly expanded. These features have cultivated multi-agent reinforcement learning research. DRL is also expanding its applications from robotics to natural language processing and computer vision into a wide array of fields such as finance, healthcare, chemistry, and even art. In this report, we briefly summarize various DRL techniques and research directions.

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm (스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계)

  • Kim, Hyun-Su;Yoon, Ki-Yong
    • Journal of Korean Association for Spatial Structures
    • /
    • v.22 no.2
    • /
    • pp.39-46
    • /
    • 2022
  • Recently, machine learning is widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to development of a control algorithm for a smart control device for reduction of seismic responses. For this purpose, Deep Q-network (DQN) out of reinforcement learning algorithms was employed to develop control algorithm. A single degree of freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as an example structure. A smart TMD system was composed of MR (magnetorheological) damper instead of passive damper. Reward design of reinforcement learning mainly affects the control performance of the smart TMD. Various hyper-parameters were investigated to optimize the control performance of DQN-based control algorithm. Usually, decrease of the time step for numerical simulation is desirable to increase the accuracy of simulation results. However, the numerical simulation results presented that decrease of the time step for reward calculation might decrease the control performance of DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in a DQN training process.

A Distributed Scheduling Algorithm based on Deep Reinforcement Learning for Device-to-Device communication networks (단말간 직접 통신 네트워크를 위한 심층 강화학습 기반 분산적 스케쥴링 알고리즘)

  • Jeong, Moo-Woong;Kim, Lyun Woo;Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.11
    • /
    • pp.1500-1506
    • /
    • 2020
  • In this paper, we study a scheduling problem based on reinforcement learning for overlay device-to-device (D2D) communication networks. Even though various technologies for D2D communication networks using Q-learning, which is one of reinforcement learning models, have been studied, Q-learning causes a tremendous complexity as the number of states and actions increases. In order to solve this problem, D2D communication technologies based on Deep Q Network (DQN) have been studied. In this paper, we thus design a DQN model by considering the characteristics of wireless communication systems, and propose a distributed scheduling scheme based on the DQN model that can reduce feedback and signaling overhead. The proposed model trains all parameters in a centralized manner, and transfers the final trained parameters to all mobiles. All mobiles individually determine their actions by using the transferred parameters. We analyze the performance of the proposed scheme by computer simulation and compare it with optimal scheme, opportunistic selection scheme and full transmission scheme.