• Title/Summary/Keyword: Deep Q Network

Search Result 65, Processing Time 0.025 seconds

Performance Comparison of Deep Reinforcement Learning based Computation Offloading in MEC (MEC 환경에서 심층 강화학습을 이용한 오프로딩 기법의 성능비교)

  • Moon, Sungwon;Lim, Yujin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.52-55
    • /
    • 2022
  • 5G 시대에 스마트 모바일 기기가 기하급수적으로 증가하면서 멀티 액세스 엣지 컴퓨팅(MEC)이 유망한 기술로 부상했다. 낮은 지연시간 안에 계산 집약적인 서비스를 제공하기 위해 MEC 서버로 오프로딩하는 특히, 태스크 도착률과 무선 채널의 상태가 확률적인 MEC 시스템 환경에서의 오프로딩 연구가 주목받고 있다. 본 논문에서는 차량의 전력과 지연시간을 최소화하기 위해 로컬 실행을 위한 연산 자원과 오프로딩을 위한 전송 전력을 할당하는 심층 강화학습 기반의 오프로딩 기법을 제안하였다. Deep Deterministic Policy Gradient (DDPG) 기반 기법과 Deep Q-network (DQN) 기반 기법을 차량의 전력 소비량과 큐잉 지연시간 측면에서 성능을 비교 분석하였다.

Methodology for Apartment Space Arrangement Based on Deep Reinforcement Learning

  • Cheng Yun Chi;Se Won Lee
    • Architectural research
    • /
    • v.26 no.1
    • /
    • pp.1-12
    • /
    • 2024
  • This study introduces a deep reinforcement learning (DRL)-based methodology for optimizing apartment space arrangements, addressing the limitations of human capability in evaluating all potential spatial configurations. Leveraging computational power, the methodology facilitates the autonomous exploration and evaluation of innovative layout options, considering architectural principles, legal standards, and client re-quirements. Through comprehensive simulation tests across various apartment types, the research demonstrates the DRL approach's effec-tiveness in generating efficient spatial arrangements that align with current design trends and meet predefined performance objectives. The comparative analysis of AI-generated layouts with those designed by professionals validates the methodology's applicability and potential in enhancing architectural design practices by offering novel, optimized spatial configuration solutions.

A Study on the Improvement of Heat Energy Efficiency for Utilities of Heat Consumer Plants based on Reinforcement Learning (강화학습을 기반으로 하는 열사용자 기계실 설비의 열효율 향상에 대한 연구)

  • Kim, Young-Gon;Heo, Keol;You, Ga-Eun;Lim, Hyun-Seo;Choi, Jung-In;Ku, Ki-Dong;Eom, Jae-Sik;Jeon, Young-Shin
    • Journal of Energy Engineering
    • /
    • v.27 no.2
    • /
    • pp.26-31
    • /
    • 2018
  • This paper introduces a study to improve the thermal efficiency of the district heating user control facility based on reinforcement learning. As an example, it is proposed a general method of constructing a deep Q learning network(DQN) using deep Q learning, which is a reinforcement learning algorithm that does not specify a model. In addition, it is also introduced the big data platform system and the integrated heat management system which are specialized in energy field applied in processing huge amount of data processing from IoT sensor installed in many thermal energy control facilities.

Study on Q-value prediction ahead of tunnel excavation face using recurrent neural network (순환인공신경망을 활용한 터널굴착면 전방 Q값 예측에 관한 연구)

  • Hong, Chang-Ho;Kim, Jin;Ryu, Hee-Hwan;Cho, Gye-Chun
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.22 no.3
    • /
    • pp.239-248
    • /
    • 2020
  • Exact rock classification helps suitable support patterns to be installed. Face mapping is usually conducted to classify the rock mass using RMR (Rock Mass Ration) or Q values. There have been several attempts to predict the grade of rock mass using mechanical data of jumbo drills or probe drills and photographs of excavation surfaces by using deep learning. However, they took long time, or had a limitation that it is impossible to grasp the rock grade in ahead of the tunnel surface. In this study, a method to predict the Q value ahead of excavation surface is developed using recurrent neural network (RNN) technique and it is compared with the Q values from face mapping for verification. Among Q values from over 4,600 tunnel faces, 70% of data was used for learning, and the rests were used for verification. Repeated learnings were performed in different number of learning and number of previous excavation surfaces utilized for learning. The coincidence between the predicted and actual Q values was compared with the root mean square error (RMSE). RMSE value from 600 times repeated learning with 2 prior excavation faces gives a lowest values. The results from this study can vary with the input data sets, the results can help to understand how the past ground conditions affect the future ground conditions and to predict the Q value ahead of the tunnel excavation face.

Mapless Navigation Based on DQN Considering Moving Obstacles, and Training Time Reduction Algorithm (이동 장애물을 고려한 DQN 기반의 Mapless Navigation 및 학습 시간 단축 알고리즘)

  • Yoon, Beomjin;Yoo, Seungryeol
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.3
    • /
    • pp.377-383
    • /
    • 2021
  • Recently, in accordance with the 4th industrial revolution, The use of autonomous mobile robots for flexible logistics transfer is increasing in factories, the warehouses and the service areas, etc. In large factories, many manual work is required to use Simultaneous Localization and Mapping(SLAM), so the need for the improved mobile robot autonomous driving is emerging. Accordingly, in this paper, an algorithm for mapless navigation that travels in an optimal path avoiding fixed or moving obstacles is proposed. For mapless navigation, the robot is trained to avoid fixed or moving obstacles through Deep Q Network (DQN) and accuracy 90% and 93% are obtained for two types of obstacle avoidance, respectively. In addition, DQN requires a lot of learning time to meet the required performance before use. To shorten this, the target size change algorithm is proposed and confirmed the reduced learning time and performance of obstacle avoidance through simulation.

Analytical Study for Typology of Signage Market : by applying Q methodology (사이니지 시장유형화를 위한 해석적 연구 : Q 방법론 적용)

  • Kim, Hang Sub;Lee, Bong Gyou
    • Journal of Internet Computing and Services
    • /
    • v.17 no.2
    • /
    • pp.67-76
    • /
    • 2016
  • Defining of signage market types is an issue of great interest for both service providers and consumers. Through an interview with an expert with a deep understanding about digital signage in early-media stage, and based on analysis results, this study induced three types of signage market types. Each signage market type can be explained by their similar thoughts, opinions, concepts, and behaviors, but are dependent on the differences in a user experiences and knowledge. The study named the interactive signage market as Type 1, the network signage market as Type 2, and the signage-coupled-with-other-media market as Type 3.

Machine Scheduling Models Based on Reinforcement Learning for Minimizing Due Date Violation and Setup Change (납기 위반 및 셋업 최소화를 위한 강화학습 기반의 설비 일정계획 모델)

  • Yoo, Woosik;Seo, Juhyeok;Kim, Dahee;Kim, Kwanho
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.3
    • /
    • pp.19-33
    • /
    • 2019
  • Recently, manufacturers have been struggling to efficiently use production equipment as their production methods become more sophisticated and complex. Typical factors hindering the efficiency of the manufacturing process include setup cost due to job change. Especially, in the process of using expensive production equipment such as semiconductor / LCD process, efficient use of equipment is very important. Balancing the tradeoff between meeting the deadline and minimizing setup cost incurred by changes of work type is crucial planning task. In this study, we developed a scheduling model to achieve the goal of minimizing the duedate and setup costs by using reinforcement learning in parallel machines with duedate and work preparation costs. The proposed model is a Deep Q-Network (DQN) scheduling model and is a reinforcement learning-based model. To validate the effectiveness of our proposed model, we compared it against the heuristic model and DNN(deep neural network) based model. It was confirmed that our proposed DQN method causes less due date violation and setup costs than the benchmark methods.

Deep Reinforcement Learning-Based Cooperative Robot Using Facial Feedback (표정 피드백을 이용한 딥강화학습 기반 협력로봇 개발)

  • Jeon, Haein;Kang, Jeonghun;Kang, Bo-Yeong
    • The Journal of Korea Robotics Society
    • /
    • v.17 no.3
    • /
    • pp.264-272
    • /
    • 2022
  • Human-robot cooperative tasks are increasingly required in our daily life with the development of robotics and artificial intelligence technology. Interactive reinforcement learning strategies suggest that robots learn task by receiving feedback from an experienced human trainer during a training process. However, most of the previous studies on Interactive reinforcement learning have required an extra feedback input device such as a mouse or keyboard in addition to robot itself, and the scenario where a robot can interactively learn a task with human have been also limited to virtual environment. To solve these limitations, this paper studies training strategies of robot that learn table balancing tasks interactively using deep reinforcement learning with human's facial expression feedback. In the proposed system, the robot learns a cooperative table balancing task using Deep Q-Network (DQN), which is a deep reinforcement learning technique, with human facial emotion expression feedback. As a result of the experiment, the proposed system achieved a high optimal policy convergence rate of up to 83.3% in training and successful assumption rate of up to 91.6% in testing, showing improved performance compared to the model without human facial expression feedback.

Deep Reinforcement Learning based Tourism Experience Path Finding

  • Kyung-Hee Park;Juntae Kim
    • Journal of Platform Technology
    • /
    • v.11 no.6
    • /
    • pp.21-27
    • /
    • 2023
  • In this paper, we introduce a reinforcement learning-based algorithm for personalized tourist path recommendations. The algorithm employs a reinforcement learning agent to explore tourist regions and identify optimal paths that are expected to enhance tourism experiences. The concept of tourism experience is defined through points of interest (POI) located along tourist paths within the tourist area. These metrics are quantified through aggregated evaluation scores derived from reviews submitted by past visitors. In the experimental setup, the foundational learning model used to find tour paths is the Deep Q-Network (DQN). Despite the limited availability of historical tourist behavior data, the agent adeptly learns travel paths by incorporating preference scores of tourist POIs and spatial information of the travel area.

  • PDF

The Development of an Intelligent Home Energy Management System Integrated with a Vehicle-to-Home Unit using a Reinforcement Learning Approach

  • Ohoud Almughram;Sami Ben Slama;Bassam Zafar
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.4
    • /
    • pp.87-106
    • /
    • 2024
  • Vehicle-to-Home (V2H) and Home Centralized Photovoltaic (HCPV) systems can address various energy storage issues and enhance demand response programs. Renewable energy, such as solar energy and wind turbines, address the energy gap. However, no energy management system is currently available to regulate the uncertainty of renewable energy sources, electric vehicles, and appliance consumption within a smart microgrid. Therefore, this study investigated the impact of solar photovoltaic (PV) panels, electric vehicles, and Micro-Grid (MG) storage on maximum solar radiation hours. Several Deep Learning (DL) algorithms were applied to account for the uncertainty. Moreover, a Reinforcement Learning HCPV (RL-HCPV) algorithm was created for efficient real-time energy scheduling decisions. The proposed algorithm managed the energy demand between PV solar energy generation and vehicle energy storage. RL-HCPV was modeled according to several constraints to meet household electricity demands in sunny and cloudy weather. Simulations demonstrated how the proposed RL-HCPV system could efficiently handle the demand response and how V2H can help to smooth the appliance load profile and reduce power consumption costs with sustainable power generation. The results demonstrated the advantages of utilizing RL and V2H as potential storage technology for smart buildings.