• Title/Abstract/Keywords: Deep Reinforcement Learning

Search results: 207 items (processing time: 0.034 seconds)

Deep Reinforcement Learning based Tourism Experience Path Finding

  • Kyung-Hee Park;Juntae Kim
    • Journal of Platform Technology / Vol. 11, No. 6 / pp. 21-27 / 2023
  • In this paper, we introduce a reinforcement learning-based algorithm for personalized tourist path recommendations. The algorithm employs a reinforcement learning agent to explore tourist regions and identify optimal paths that are expected to enhance tourism experiences. The concept of tourism experience is defined through points of interest (POI) located along tourist paths within the tourist area. These metrics are quantified through aggregated evaluation scores derived from reviews submitted by past visitors. In the experimental setup, the foundational learning model used to find tour paths is the Deep Q-Network (DQN). Despite the limited availability of historical tourist behavior data, the agent adeptly learns travel paths by incorporating preference scores of tourist POIs and spatial information of the travel area.

관계형 강화 학습을 위한 도메인 지식의 효과적인 활용 (Effective Utilization of Domain Knowledge for Relational Reinforcement Learning)

  • 강민교;김인철
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 11, No. 3 / pp. 141-148 / 2022
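  • Recently, reinforcement learning combined with deep neural networks has achieved remarkable success in diverse areas, including board games such as Go and chess, computer games such as Atari and StarCraft, and robotic object manipulation. However, such deep reinforcement learning represents actions, states, and policies all as vectors. Consequently, conventional deep reinforcement learning is limited in the interpretability and generality of its learned policies, and it is difficult to use domain knowledge effectively during learning. dNL-RRL, a new relational reinforcement learning framework proposed to address these limitations, uses vector representations for sensor input data and action execution control, as conventional deep reinforcement learning does, but represents actions, states, and learned policies relationally, as logic predicates and rules. In this paper, we present an effective method for learning action policies for a transport mobile robot in a manufacturing environment using the dNL-RRL framework. In particular, to improve the efficiency of relational reinforcement learning, we propose ways to exploit the prior domain knowledge of human experts. Through a range of experiments, we demonstrate the performance improvement achieved by the proposed relational reinforcement learning framework that uses domain knowledge.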

Deep reinforcement learning for base station switching scheme with federated LSTM-based traffic predictions

  • Hyebin Park;Seung Hyun Yoon
    • ETRI Journal / Vol. 46, No. 3 / pp. 379-391 / 2024
  • To meet increasing traffic requirements in mobile networks, small base stations (SBSs) are densely deployed, overlapping existing network architecture and increasing system capacity. However, densely deployed SBSs increase energy consumption and interference. Although these problems already exist because of densely deployed SBSs, even more SBSs are needed to meet increasing traffic demands. Hence, base station (BS) switching operations have been used to minimize energy consumption while guaranteeing quality-of-service (QoS) for users. In this study, to optimize energy efficiency, we propose the use of deep reinforcement learning (DRL) to create a BS switching operation strategy with a traffic prediction model. First, a federated long short-term memory (LSTM) model is introduced to predict user traffic demands from user trajectory information. Next, the DRL-based BS switching operation scheme determines the switching operations for the SBSs using the predicted traffic demand. Experimental results confirm that the proposed scheme outperforms existing approaches in terms of energy efficiency, signal-to-interference noise ratio, handover metrics, and prediction performance.

작물 생산량 예측을 위한 심층강화학습 성능 분석 (Performance Analysis of Deep Reinforcement Learning for Crop Yield Prediction)

  • 옴마킨;이성근
    • 한국전자통신학회논문지 / Vol. 18, No. 1 / pp. 99-106 / 2023
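  • Recently, many studies have applied deep learning to crop yield prediction. Deep learning algorithms have difficulty constructing a linear map between the input data set and the crop prediction results. In addition, the algorithm implementation depends positively on the proportion of acquired attributes. Applying deep reinforcement learning to crop yield prediction can compensate for these limitations. This paper analyzes the performance of DQN, Double DQN, and Dueling DQN for improving crop yield prediction. The DQN algorithm suffers from an overestimation problem, whereas Double DQN reduces overestimation and achieves better results. The model proposed in this paper is shown to reduce false decisions and increase prediction accuracy.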
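
The overestimation contrast between DQN and Double DQN mentioned in this abstract can be illustrated with the two bootstrap targets. The following is a minimal sketch using the standard textbook formulations, not the paper's implementation; the function names and the Q-value lists are hypothetical and chosen only for illustration.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-value estimates for the next state (three actions).
q_online_next = [1.0, 2.0, 1.5]
q_target_next = [3.0, 1.2, 0.8]

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
print(y_dqn, y_ddqn)
```

Because the evaluated action is not necessarily the one that maximizes the target network's own estimates, the Double DQN target is never larger than the DQN target for the same transition, which is the mechanism usually credited with reducing overestimation.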

자율주행 자동차의 주차를 위한 강화학습 활성화 함수 비교 분석 (A Comparative Analysis of Reinforcement Learning Activation Functions for Parking of Autonomous Vehicles)

  • 이동철
    • 한국인터넷방송통신학회논문지 / Vol. 22, No. 6 / pp. 75-81 / 2022
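  • Autonomous vehicles, which could dramatically relieve the shortage of parking space, are advancing greatly through deep reinforcement learning. Deep reinforcement learning uses activation functions; many have been proposed, but their performance varies widely depending on the environment in which they are applied. Finding the best activation function for a given environment is therefore important for effective learning. To compare which activation function is most effective when an autonomous vehicle uses deep reinforcement learning to learn parking, this paper analyzes 12 functions commonly used in reinforcement learning. To this end, we built a performance evaluation environment and compared the average reward of each activation function against success rate, episode length, and vehicle speed. The highest reward was obtained with GELU and the lowest with ELU. The difference in reward between the two activation functions was 35.2%.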
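
For readers unfamiliar with the two activation functions named in this abstract, the sketch below gives their standard definitions in plain Python. It is only an illustration of the functions themselves; the abstract does not describe the network or the evaluation code, and the default `alpha=1.0` for ELU is an assumption (the commonly used default).

```python
import math


def gelu(x):
    """Gaussian Error Linear Unit: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def elu(x, alpha=1.0):
    """Exponential Linear Unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


# Compare the two functions on a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, gelu(x), elu(x))
```

The two functions differ mainly for negative inputs: ELU saturates toward `-alpha`, while GELU returns to zero as the input becomes strongly negative.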

픽셀 데이터를 이용한 강화 학습 알고리즘 적용에 관한 연구 (A Study on Application of Reinforcement Learning Algorithm Using Pixel Data)

  • 문새마로;최용락
    • 한국IT서비스학회지 / Vol. 15, No. 4 / pp. 85-95 / 2016
  • Recently, deep learning and machine learning have attracted considerable attention, and many supporting frameworks have appeared. In the artificial intelligence field, a large body of research is underway to apply this knowledge to complex problem-solving, which requires applying various learning algorithms and training methods to artificial intelligence systems. In addition, there is a dearth of performance evaluations of decision-making agents. The decision-making agent designed in this research finds optimal solutions using reinforcement learning methods: it collects raw pixel data observed from dynamic environments and makes decisions by itself based on that data. The agent uses convolutional neural networks to classify the situations it confronts, and the data observed from the environment is preprocessed before use. This research describes how the convolutional neural networks and the decision-making agent are configured, analyzes learning performance with a value-based algorithm (Deep Q-Network) and a policy-based algorithm (Policy Gradient), sets forth their differences, and demonstrates how the convolutional neural networks affect overall learning performance when using pixel data. This research is expected to contribute to the improvement of artificial intelligence systems that can efficiently find optimal solutions using features extracted from raw pixel data.

Fault-tolerant control system for once-through steam generator based on reinforcement learning algorithm

  • Li, Cheng;Yu, Ren;Yu, Wenmin;Wang, Tianshu
    • Nuclear Engineering and Technology / Vol. 54, No. 9 / pp. 3283-3292 / 2022
  • Based on the Deep Q-Network (DQN) reinforcement learning algorithm, an active fault-tolerance method with incremental action is proposed for the control system of the once-through steam generator (OTSG) under sensor faults. In this paper, we first establish the OTSG model as the interaction environment for the reinforcement learning agent. The agent chooses an action according to the system state obtained from the pressure sensor; the incremental action gradually approaches the optimal strategy for the current fault, and the agent then updates the network using the different rewards obtained during the interaction. In this way, the active fault-tolerant control process of the OTSG is transformed into the reinforcement learning agent's decision-making process. Comparison experiments against a traditional reinforcement learning (RL) algorithm with fixed strategies show that the active fault-tolerant controller designed in this paper controls accurately and rapidly under sensor faults, so that the pressure of the OTSG is stabilized near the set-point value and the OTSG runs normally and stably.

MANET에서 종단간 통신지연 최소화를 위한 심층 강화학습 기반 분산 라우팅 알고리즘 (Deep Reinforcement Learning-based Distributed Routing Algorithm for Minimizing End-to-end Delay in MANET)

  • Choi, Yeong-Jun;Seo, Ju-Sung;Hong, Jun-Pyo
    • 한국정보통신학회논문지 / Vol. 25, No. 9 / pp. 1267-1270 / 2021
  • In this paper, we propose a distributed routing algorithm for mobile ad hoc networks (MANET) in which mobile devices can be used as relays for communication between remote source-destination nodes. The objective of the proposed algorithm is to minimize the end-to-end communication delay caused by transmission failures under deep channel fading. At each hop, the node must select the next relay node by weighing the tradeoff between link stability and forward link distance. Based on this feature, we formulate the problem as a partially observable Markov decision process (MDP) and apply deep reinforcement learning to derive an effective routing strategy for the formulated MDP. Simulation results show that the proposed algorithm outperforms other baseline schemes in terms of average end-to-end delay.

Methodology for Apartment Space Arrangement Based on Deep Reinforcement Learning

  • Cheng Yun Chi;Se Won Lee
    • Architectural research / Vol. 26, No. 1 / pp. 1-12 / 2024
  • This study introduces a deep reinforcement learning (DRL)-based methodology for optimizing apartment space arrangements, addressing the limitations of human capability in evaluating all potential spatial configurations. Leveraging computational power, the methodology facilitates the autonomous exploration and evaluation of innovative layout options, considering architectural principles, legal standards, and client requirements. Through comprehensive simulation tests across various apartment types, the research demonstrates the DRL approach's effectiveness in generating efficient spatial arrangements that align with current design trends and meet predefined performance objectives. The comparative analysis of AI-generated layouts with those designed by professionals validates the methodology's applicability and potential in enhancing architectural design practices by offering novel, optimized spatial configuration solutions.

Application of Reinforcement Learning in Detecting Fraudulent Insurance Claims

  • Choi, Jung-Moon;Kim, Ji-Hyeok;Kim, Sung-Jun
    • International Journal of Computer Science & Network Security / Vol. 21, No. 9 / pp. 125-131 / 2021
  • Detecting fraudulent insurance claims is difficult because the data are small and unbalanced. Some research has been carried out to better cope with various types of fraudulent claims. Technology for detecting fraudulent insurance claims is now increasingly used in the insurance and technology fields, thanks to artificial intelligence (AI) methods that complement traditional statistical detection and rule-based methods. This study obtained meaningful results for a fraudulent insurance claim detection model based on machine learning (ML) and deep learning (DL) technologies, using fraudulent insurance claim data from previous research. In our search for a method to enhance the detection of fraudulent insurance claims, we investigated the reinforcement learning (RL) method and examined how it could be applied to this task. Because there are few previous cases of applying RL here, we first had to define the essential RL elements based on previous research on anomaly detection. We applied the deep Q-network (DQN) and double deep Q-network (DDQN) to train the fraudulent insurance claim detection model. By doing so, we confirmed that our model performed better than previous machine learning models.
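
The abstract does not state how the essential RL elements were defined. One common way to cast classification on unbalanced data as an RL problem is to treat each claim's features as the state, the predicted label as the action, and a class-weighted score as the reward. The sketch below illustrates only that reward idea; the function name, the label convention (1 = fraud), and the weights are assumptions for illustration, not the authors' design.

```python
def classification_reward(predicted_label, true_label,
                          fraud_weight=1.0, normal_weight=0.1):
    """Hypothetical reward for one claim.

    Correct predictions earn a positive reward and mistakes a negative one,
    with a larger magnitude on the rare fraud class (label 1).
    """
    weight = fraud_weight if true_label == 1 else normal_weight
    return weight if predicted_label == true_label else -weight


# Reward for each (prediction, truth) combination.
for predicted, truth in [(1, 1), (0, 1), (0, 0), (1, 0)]:
    print(predicted, truth, classification_reward(predicted, truth))
```

Weighting the minority class more heavily is one way to keep an agent from learning to label every claim as legitimate; other reward designs are equally possible.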