Search | Korea Science

DQN Reinforcement Learning for Acrobot in OpenAI Gym Environment (OpenAI Gym 환경의 Acrobot에 대한 DQN 강화학습)

Myung-Ju Kang
- Proceedings of the Korean Society of Computer Information Conference
- /
- 2023.07a
- /
- pp.35-36
- /
- 2023
본 논문에서는 OpenAI Gym 환경에서 제공하는 Acrobot-v1에 대해 DQN(Deep Q-Networks) 강화학습으로 학습시키고, 이 때 적용되는 활성화함수의 성능을 비교분석하였다. DQN 강화학습에 적용한 활성화함수는 ReLU, ReakyReLU, ELU, SELU 그리고 softplus 함수이다. 실험 결과 평균적으로 Leaky_ReLU 활성화함수를 적용했을 때의 보상 값이 높았고, 최대 보상 값은 SELU 활성화 함수를 적용할 때로 나타났다.
PDF

Applying Deep Reinforcement Learning to Improve Throughput and Reduce Collision Rate in IEEE 802.11 Networks

Ke, Chih-Heng;Astuti, Lia
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.16 no.1
- /
- pp.334-349
- /
- 2022
The effectiveness of Wi-Fi networks is greatly influenced by the optimization of contention window (CW) parameters. Unfortunately, the conventional approach employed by IEEE 802.11 wireless networks is not scalable enough to sustain consistent performance for the increasing number of stations. Yet, it is still the default when accessing channels for single-users of 802.11 transmissions. Recently, there has been a spike in attempts to enhance network performance using a machine learning (ML) technique known as reinforcement learning (RL). Its advantage is interacting with the surrounding environment and making decisions based on its own experience. Deep RL (DRL) uses deep neural networks (DNN) to deal with more complex environments (such as continuous state spaces or actions spaces) and to get optimum rewards. As a result, we present a new approach of CW control mechanism, which is termed as contention window threshold (CW_Threshold). It uses the DRL principle to define the threshold value and learn optimal settings under various network scenarios. We demonstrate our proposed method, known as a smart exponential-threshold-linear backoff algorithm with a deep Q-learning network (SETL-DQN). The simulation results show that our proposed SETL-DQN algorithm can effectively improve the throughput and reduce the collision rates.
https://doi.org/10.3837/tiis.2022.01.019 인용 PDF KSCI HTML

Visual Analysis of Deep Q-network

Seng, Dewen;Zhang, Jiaming;Shi, Xiaoying
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.15 no.3
- /
- pp.853-873
- /
- 2021
In recent years, deep reinforcement learning (DRL) models are enjoying great interest as their success in a variety of challenging tasks. Deep Q-Network (DQN) is a widely used deep reinforcement learning model, which trains an intelligent agent that executes optimal actions while interacting with an environment. This model is well known for its ability to surpass skilled human players across many Atari 2600 games. Although DQN has achieved excellent performance in practice, there lacks a clear understanding of why the model works. In this paper, we present a visual analytics system for understanding deep Q-network in a non-blind matter. Based on the stored data generated from the training and testing process, four coordinated views are designed to expose the internal execution mechanism of DQN from different perspectives. We report the system performance and demonstrate its effectiveness through two case studies. By using our system, users can learn the relationship between states and Q-values, the function of convolutional layers, the strategies learned by DQN and the rationality of decisions made by the agent.
https://doi.org/10.3837/tiis.2021.03.003 인용 PDF KSCI HTML

Performance Analysis of Deep Reinforcement Learning for Crop Yield Prediction (작물 생산량 예측을 위한 심층강화학습 성능 분석)

Ohnmar Khin;Sung-Keun Lee
- The Journal of the Korea institute of electronic communication sciences
- /
- v.18 no.1
- /
- pp.99-106
- /
- 2023
Recently, many studies on crop yield prediction using deep learning technology have been conducted. These algorithms have difficulty constructing a linear map between input data sets and crop prediction results. Furthermore, implementation of these algorithms positively depends on the rate of acquired attributes. Deep reinforcement learning can overcome these limitations. This paper analyzes the performance of DQN, Double DQN and Dueling DQN to improve crop yield prediction. The DQN algorithm retains the overestimation problem. Whereas, Double DQN declines the over-estimations and leads to getting better results. The proposed models achieves these by reducing the falsehood and increasing the prediction exactness.
https://doi.org/10.13067/JKIECS.2023.18.1.99 인용 PDF

A Distributed Scheduling Algorithm based on Deep Reinforcement Learning for Device-to-Device communication networks (단말간 직접 통신 네트워크를 위한 심층 강화학습 기반 분산적 스케쥴링 알고리즘)

Jeong, Moo-Woong;Kim, Lyun Woo;Ban, Tae-Won
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.24 no.11
- /
- pp.1500-1506
- /
- 2020
In this paper, we study a scheduling problem based on reinforcement learning for overlay device-to-device (D2D) communication networks. Even though various technologies for D2D communication networks using Q-learning, which is one of reinforcement learning models, have been studied, Q-learning causes a tremendous complexity as the number of states and actions increases. In order to solve this problem, D2D communication technologies based on Deep Q Network (DQN) have been studied. In this paper, we thus design a DQN model by considering the characteristics of wireless communication systems, and propose a distributed scheduling scheme based on the DQN model that can reduce feedback and signaling overhead. The proposed model trains all parameters in a centralized manner, and transfers the final trained parameters to all mobiles. All mobiles individually determine their actions by using the transferred parameters. We analyze the performance of the proposed scheme by computer simulation and compare it with optimal scheme, opportunistic selection scheme and full transmission scheme.
https://doi.org/10.6109/jkiice.2020.24.11.1500 인용 PDF KSCI

DQN Reinforcement Learning for Mountain-Car in OpenAI Gym Environment (OpenAI Gym 환경의 Mountain-Car에 대한 DQN 강화학습)

Myung-Ju Kang
- Proceedings of the Korean Society of Computer Information Conference
- /
- 2024.01a
- /
- pp.375-377
- /
- 2024
본 논문에서는 OpenAI Gym 환경에서 프로그램으로 간단한 제어가 가능한 Mountain-Car-v0 게임에 대해 DQN(Deep Q-Networks) 강화학습을 진행하였다. 본 논문에서 적용한 DQN 네트워크는 입력층 1개, 은닉층 3개, 출력층 1개로 구성하였고, 입력층과 은닉층에서의 활성화함수는 ReLU를, 출력층에서는 Linear함수를 활성화함수로 적용하였다. 실험은 Mountain-Car-v0에 대해 DQN 강화학습을 진행했을 때 각 에피소드별로 획득한 보상 결과를 살펴보고, 보상구간에 포함된 횟수를 분석하였다. 실험결과 전체 100회의 에피소드 중 보상을 50 이상 획득한 에피소드가 85개로 나타났다.
PDF

Comparison of Activation Functions of Reinforcement Learning in OpenAI Gym Environments (OpenAI Gym 환경에서 강화학습의 활성화함수 비교 분석)

Myung-Ju Kang
- Proceedings of the Korean Society of Computer Information Conference
- /
- 2023.01a
- /
- pp.25-26
- /
- 2023
본 논문에서는 OpenAI Gym 환경에서 제공하는 CartPole-v1에 대해 강화학습을 통해 에이전트를 학습시키고, 학습에 적용되는 활성화함수의 성능을 비교분석하였다. 본 논문에서 적용한 활성화함수는 Sigmoid, ReLU, ReakyReLU 그리고 softplus 함수이며, 각 활성화함수를 DQN(Deep Q-Networks) 강화학습에 적용했을 때 보상 값을 비교하였다. 실험결과 ReLU 활성화함수를 적용하였을 때의 보상이 가장 높은 것을 알 수 있었다.
PDF

A study on Deep Q-Networks based Auto-scaling in NFV Environment (NFV 환경에서의 Deep Q-Networks 기반 오토 스케일링 기술 연구)

Lee, Do-Young;Yoo, Jae-Hyoung;Hong, James Won-Ki
- KNOM Review
- /
- v.23 no.2
- /
- pp.1-10
- /
- 2020
Network Function Virtualization (NFV) is a key technology of 5G networks that has the advantage of enabling building and operating networks flexibly. However, NFV can complicate network management because it creates numerous virtual resources that should be managed. In NFV environments, service function chaining (SFC) composed of virtual network functions (VNFs) is widely used to apply a series of network functions to traffic. Therefore, it is required to dynamically allocate the right amount of computing resources or instances to SFC for meeting service requirements. In this paper, we propose Deep Q-Networks (DQN)-based auto-scaling to operate the appropriate number of VNF instances in SFC. The proposed approach not only resizes the number of VNF instances in SFC composed of multi-tier architecture but also selects a tier to be scaled in response to dynamic traffic forwarding through SFC.
https://doi.org/10.22670/knom.2020.23.2.1 인용

Real-Time Path Planning for Mobile Robots Using Q-Learning (Q-learning을 이용한 이동 로봇의 실시간 경로 계획)

Kim, Ho-Won;Lee, Won-Chang
- Journal of IKEEE
- /
- v.24 no.4
- /
- pp.991-997
- /
- 2020
Reinforcement learning has been applied mainly in sequential decision-making problems. Especially in recent years, reinforcement learning combined with neural networks has brought successful results in previously unsolved fields. However, reinforcement learning using deep neural networks has the disadvantage that it is too complex for immediate use in the field. In this paper, we implemented path planning algorithm for mobile robots using Q-learning, one of the easy-to-learn reinforcement learning algorithms. We used real-time Q-learning to update the Q-table in real-time since the Q-learning method of generating Q-tables in advance has obvious limitations. By adjusting the exploration strategy, we were able to obtain the learning speed required for real-time Q-learning. Finally, we compared the performance of real-time Q-learning and DQN.
https://doi.org/10.7471/ikeee.2020.24.4.991 인용 PDF KSCI

The Effect of Segment Size on Quality Selection in DQN-based Video Streaming Services (DQN 기반 비디오 스트리밍 서비스에서 세그먼트 크기가 품질 선택에 미치는 영향)

Kim, ISeul;Lim, Kyungshik
- Journal of Korea Multimedia Society
- /
- v.21 no.10
- /
- pp.1182-1194
- /
- 2018
The Dynamic Adaptive Streaming over HTTP(DASH) is envisioned to evolve to meet an increasing demand on providing seamless video streaming services in the near future. The DASH performance heavily depends on the client's adaptive quality selection algorithm that is not included in the standard. The existing conventional algorithms are basically based on a procedural algorithm that is not easy to capture and reflect all variations of dynamic network and traffic conditions in a variety of network environments. To solve this problem, this paper proposes a novel quality selection mechanism based on the Deep Q-Network(DQN) model, the DQN-based DASH Adaptive Bitrate(ABR) mechanism. The proposed mechanism adopts a new reward calculation method based on five major performance metrics to reflect the current conditions of networks and devices in real time. In addition, the size of the consecutive video segment to be downloaded is also considered as a major learning metric to reflect a variety of video encodings. Experimental results show that the proposed mechanism quickly selects a suitable video quality even in high error rate environments, significantly reducing frequency of quality changes compared to the existing algorithm and simultaneously improving average video quality during video playback.
https://doi.org/10.9717/kmms.2018.21.10.1182 인용 PDF KSCI

Search Result 17, Processing Time 0.027 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)