• Title/Summary/Keyword: Deep-Q-Network

Search Result 63, Processing Time 0.029 seconds

Goal Oriented Dialogue System Based on Deep Recurrent Q Network (심층 순환 Q 네트워크 기반 목적 지향 대화 시스템)

  • Park, Geonwoo;Kim, Harksoo
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.147-150
    • /
    • 2018
  • 목적 지향 대화 시스템은 자연어 이해, 대화 관리자, 자연어 생성과 같은 세분화 모델들의 결합으로 이루어져있어 하위 모델에 대한 오류 전파에 취약하다. 이러한 문제점을 해결하기 위해 자연어 이해 모델과 대화 관리자를 하나의 네트워크로 구성하고 오류에 강건한 심층 Q 네트워크를 제안한다. 본 논문에서는 대화의 전체 흐름을 파악 할 수 있는 순환 신경망인 LSTM에 심층 Q 네트워크 적용한 심층 순환 Q 네트워크 기반 목적 지향 대화 시스템을 제안한다. 실험 결과, 제안한 심층 순환 Q 네트워크는 LSTM, 심층 Q 네트워크보다 각각 정밀도 1.0%p, 6.7%p 높은 성능을 보였다.

  • PDF

Optimal Design of Semi-Active Mid-Story Isolation System using Supervised Learning and Reinforcement Learning (지도학습과 강화학습을 이용한 준능동 중간층면진시스템의 최적설계)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.21 no.4
    • /
    • pp.73-80
    • /
    • 2021
  • A mid-story isolation system was proposed for seismic response reduction of high-rise buildings and presented good control performance. Control performance of a mid-story isolation system was enhanced by introducing semi-active control devices into isolation systems. Seismic response reduction capacity of a semi-active mid-story isolation system mainly depends on effect of control algorithm. AI(Artificial Intelligence)-based control algorithm was developed for control of a semi-active mid-story isolation system in this study. For this research, an practical structure of Shiodome Sumitomo building in Japan which has a mid-story isolation system was used as an example structure. An MR (magnetorheological) damper was used to make a semi-active mid-story isolation system in example model. In numerical simulation, seismic response prediction model was generated by one of supervised learning model, i.e. an RNN (Recurrent Neural Network). Deep Q-network (DQN) out of reinforcement learning algorithms was employed to develop control algorithm The numerical simulation results presented that the DQN algorithm can effectively control a semi-active mid-story isolation system resulting in successful reduction of seismic responses.

Recommendation System of University Major Subject based on Deep Reinforcement Learning (심층 강화학습 기반의 대학 전공과목 추천 시스템)

  • Ducsun Lim;Youn-A Min;Dongkyun Lim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.4
    • /
    • pp.9-15
    • /
    • 2023
  • Existing simple statistics-based recommendation systems rely solely on students' course enrollment history data, making it difficult to identify classes that match students' preferences. To address this issue, this study proposes a personalized major subject recommendation system based on deep reinforcement learning (DRL). This system gauges the similarity between students based on structured data, such as the student's department, grade level, and course history. Based on this information, it recommends the most suitable major subjects by comprehensively considering information about each available major subject and evaluations of the student's courses. We confirmed that this DRL-based recommendation system provides useful insights for university students while selecting their major subjects, and our simulation results indicate that it outperforms conventional statistics-based recommendation systems by approximately 20%. In light of these results, we propose a new system that offers personalized subject recommendations by incorporating students' course evaluations. This system is expected to assist students significantly in finding major subjects that align with their preferences and academic goals.

Resource Allocation Strategy of Internet of Vehicles Using Reinforcement Learning

  • Xi, Hongqi;Sun, Huijuan
    • Journal of Information Processing Systems
    • /
    • v.18 no.3
    • /
    • pp.443-456
    • /
    • 2022
  • An efficient and reasonable resource allocation strategy can greatly improve the service quality of Internet of Vehicles (IoV). However, most of the current allocation methods have overestimation problem, and it is difficult to provide high-performance IoV network services. To solve this problem, this paper proposes a network resource allocation strategy based on deep learning network model DDQN. Firstly, the method implements the refined modeling of IoV model, including communication model, user layer computing model, edge layer offloading model, mobile model, etc., similar to the actual complex IoV application scenario. Then, the DDQN network model is used to calculate and solve the mathematical model of resource allocation. By decoupling the selection of target Q value action and the calculation of target Q value, the phenomenon of overestimation is avoided. It can provide higher-quality network services and ensure superior computing and processing performance in actual complex scenarios. Finally, simulation results show that the proposed method can maintain the network delay within 65 ms and show excellent network performance in high concurrency and complex scenes with task data volume of 500 kbits.

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm (스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계)

  • Kim, Hyun-Su;Yoon, Ki-Yong
    • Journal of Korean Association for Spatial Structures
    • /
    • v.22 no.2
    • /
    • pp.39-46
    • /
    • 2022
  • Recently, machine learning is widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to development of a control algorithm for a smart control device for reduction of seismic responses. For this purpose, Deep Q-network (DQN) out of reinforcement learning algorithms was employed to develop control algorithm. A single degree of freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as an example structure. A smart TMD system was composed of MR (magnetorheological) damper instead of passive damper. Reward design of reinforcement learning mainly affects the control performance of the smart TMD. Various hyper-parameters were investigated to optimize the control performance of DQN-based control algorithm. Usually, decrease of the time step for numerical simulation is desirable to increase the accuracy of simulation results. However, the numerical simulation results presented that decrease of the time step for reward calculation might decrease the control performance of DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in a DQN training process.

A Study on Automatic Comment Generation Using Deep Learning (딥 러닝을 이용한 자동 댓글 생성에 관한 연구)

  • Choi, Jae-yong;Sung, So-yun;Kim, Kyoung-chul
    • Journal of Korea Game Society
    • /
    • v.18 no.5
    • /
    • pp.83-92
    • /
    • 2018
  • Many studies in deep learning show results as good as human's decision in various fields. And importance of activation of online-community and SNS grows up in game industry. Even it decides whether a game can be successful or not. The purpose of this study is to construct a system which can read texts and create comments according to schedule in online-community and SNS using deep learning. Using recurrent neural network, we constructed models generating a comment and a schedule of writing comments, and made program choosing a news title and uploading the comment at twitter in calculated time automatically. This study can be applied to activating an online game community, a Q&A service, etc.

Deep Reinforcement Learning-Based Edge Caching in Heterogeneous Networks

  • Yoonjeong, Choi; Yujin, Lim
    • Journal of Information Processing Systems
    • /
    • v.18 no.6
    • /
    • pp.803-812
    • /
    • 2022
  • With the increasing number of mobile device users worldwide, utilizing mobile edge computing (MEC) devices close to users for content caching can reduce transmission latency than receiving content from a server or cloud. However, because MEC has limited storage capacity, it is necessary to determine the content types and sizes to be cached. In this study, we investigate a caching strategy that increases the hit ratio from small base stations (SBSs) for mobile users in a heterogeneous network consisting of one macro base station (MBS) and multiple SBSs. If there are several SBSs that users can access, the hit ratio can be improved by reducing duplicate content and increasing the diversity of content in SBSs. We propose a Deep Q-Network (DQN)-based caching strategy that considers time-varying content popularity and content redundancy in multiple SBSs. Content is stored in the SBS in a divided form using maximum distance separable (MDS) codes to enhance the diversity of the content. Experiments in various environments show that the proposed caching strategy outperforms the other methods in terms of hit ratio.

Development of Optimal Design Technique of RC Beam using Multi-Agent Reinforcement Learning (다중 에이전트 강화학습을 이용한 RC보 최적설계 기술개발)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.23 no.2
    • /
    • pp.29-36
    • /
    • 2023
  • Reinforcement learning (RL) is widely applied to various engineering fields. Especially, RL has shown successful performance for control problems, such as vehicles, robotics, and active structural control system. However, little research on application of RL to optimal structural design has conducted to date. In this study, the possibility of application of RL to structural design of reinforced concrete (RC) beam was investigated. The example of RC beam structural design problem introduced in previous study was used for comparative study. Deep q-network (DQN) is a famous RL algorithm presenting good performance in the discrete action space and thus it was used in this study. The action of DQN agent is required to represent design variables of RC beam. However, the number of design variables of RC beam is too many to represent by the action of conventional DQN. To solve this problem, multi-agent DQN was used in this study. For more effective reinforcement learning process, DDQN (Double Q-Learning) that is an advanced version of a conventional DQN was employed. The multi-agent of DDQN was trained for optimal structural design of RC beam to satisfy American Concrete Institute (318) without any hand-labeled dataset. Five agents of DDQN provides actions for beam with, beam depth, main rebar size, number of main rebar, and shear stirrup size, respectively. Five agents of DDQN were trained for 10,000 episodes and the performance of the multi-agent of DDQN was evaluated with 100 test design cases. This study shows that the multi-agent DDQN algorithm can provide successfully structural design results of RC beam.

Tidy-up Task Planner based on Q-learning (정리정돈을 위한 Q-learning 기반의 작업계획기)

  • Yang, Min-Gyu;Ahn, Kuk-Hyun;Song, Jae-Bok
    • The Journal of Korea Robotics Society
    • /
    • v.16 no.1
    • /
    • pp.56-63
    • /
    • 2021
  • As the use of robots in service area increases, research has been conducted to replace human tasks in daily life with robots. Among them, this study focuses on the tidy-up task on a desk using a robot arm. The order in which tidy-up motions are carried out has a great impact on the success rate of the task. Therefore, in this study, a neural network-based method for determining the priority of the tidy-up motions from the input image is proposed. Reinforcement learning, which shows good performance in the sequential decision-making process, is used to train such a task planner. The training process is conducted in a virtual tidy-up environment that is configured the same as the actual tidy-up environment. To transfer the learning results in the virtual environment to the actual environment, the input image is preprocessed into a segmented image. In addition, the use of a neural network that excludes unnecessary tidy-up motions from the priority during the tidy-up operation increases the success rate of the task planner. Experiments were conducted in the real world to verify the proposed task planning method.

Performance Comparison of Deep Reinforcement Learning based Computation Offloading in MEC (MEC 환경에서 심층 강화학습을 이용한 오프로딩 기법의 성능비교)

  • Moon, Sungwon;Lim, Yujin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.52-55
    • /
    • 2022
  • 5G 시대에 스마트 모바일 기기가 기하급수적으로 증가하면서 멀티 액세스 엣지 컴퓨팅(MEC)이 유망한 기술로 부상했다. 낮은 지연시간 안에 계산 집약적인 서비스를 제공하기 위해 MEC 서버로 오프로딩하는 특히, 태스크 도착률과 무선 채널의 상태가 확률적인 MEC 시스템 환경에서의 오프로딩 연구가 주목받고 있다. 본 논문에서는 차량의 전력과 지연시간을 최소화하기 위해 로컬 실행을 위한 연산 자원과 오프로딩을 위한 전송 전력을 할당하는 심층 강화학습 기반의 오프로딩 기법을 제안하였다. Deep Deterministic Policy Gradient (DDPG) 기반 기법과 Deep Q-network (DQN) 기반 기법을 차량의 전력 소비량과 큐잉 지연시간 측면에서 성능을 비교 분석하였다.