Search | Korea Science

Learning Less Random to Learn Better in Deep Reinforcement Learning with Noisy Parameters

Kim, Chayoung
- Journal of Advanced Information Technology and Convergence / v.9 no.1 / pp.127-134 / 2019
In deep reinforcement learning (RL), exploration is carried out stochastically over the actions available in a state space. Exploitation, on the other hand, relies on behaviors that generalize well. Balancing exploration and exploitation is extremely important for good results. Random action selection with ε-greedy has been regarded as the de facto exploration method. An alternative is to add noise to the parameters of the neural network for richer exploration. However, it is not easy to predict or detect over-fitting under stochastic exploration in the perturbed neural network. Moreover, well-trained RL agents do not necessarily prevent or detect over-fitting in the neural network. Therefore, we propose a novel deep RL design that balances exploration with dropout to reduce over-fitting in perturbed neural networks.
https://doi.org/10.14801/JAITC.2018.9.1.127
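
As a rough illustration of the two exploration styles this abstract contrasts, the Python sketch below shows ε-greedy action selection next to a single linear layer whose weights are perturbed with Gaussian noise and whose inputs pass through dropout. The function names, the Gaussian noise model, and the placement of dropout on the layer inputs are assumptions made for illustration only; the abstract does not describe the paper's actual architecture.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def noisy_dropout_layer(x, weights, sigma, drop_p, rng=random):
    """Linear layer with Gaussian weight noise and inverted dropout on the inputs.

    Hypothetical helper: sigma scales the weight noise, drop_p is the
    probability of zeroing each input.
    """
    keep = 1.0 - drop_p
    # Inverted dropout: zero each input with probability drop_p, rescale the rest.
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    # Each weight is perturbed independently on every forward pass.
    return [
        sum((w + sigma * rng.gauss(0.0, 1.0)) * xi for w, xi in zip(row, dropped))
        for row in weights
    ]
```

With `sigma=0.0` and `drop_p=0.0` the layer reduces to an ordinary matrix-vector product, which makes it easy to check the plumbing before turning noise and dropout on.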

Comparative analysis of activation functions within reinforcement learning for autonomous vehicles merging onto highways

Dongcheul Lee; Janise McNair
- International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.63-71 / 2024
Deep reinforcement learning (RL) significantly influences autonomous vehicle development by optimizing decision-making and adaptation to complex driving environments through simulation-based training. Deep RL networks rely on an activation function, and although many have been proposed, their performance varies greatly with the application environment. Finding the best activation function for a given environment is therefore important for effective learning. In this paper, we compare nine activation functions commonly used in RL to determine which is most effective when deep RL is used to teach autonomous vehicles to merge onto highways. To do this, we built a performance evaluation environment and compared the average reward obtained with each activation function. The highest reward was achieved with Mish and the lowest with SELU. The difference in reward between the two activation functions was 10.3%.
https://doi.org/10.7236/IJIBC.2024.16.1.63
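
For readers unfamiliar with the two activation functions named in this abstract, the sketch below gives their commonly published scalar definitions in plain Python. It is only a reference for the formulas; the function names are illustrative, and nothing here reproduces the paper's networks or evaluation environment.

```python
import math

# Standard SELU constants from the self-normalizing networks formulation.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def selu(x):
    """SELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * math.expm1(x)
```

Mish is smooth and passes small negative values through, while SELU saturates at about -1.758 for large negative inputs.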

Comparison of Reinforcement Learning Activation Functions to Maximize Rewards in Autonomous Highway Driving (고속도로 자율주행 시 보상을 최대화하기 위한 강화 학습 활성화 함수 비교)

Lee, Dongcheul
- The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.63-68 / 2022
Autonomous driving technology has recently made great progress with the introduction of deep reinforcement learning. To use deep reinforcement learning effectively, it is important to select an appropriate activation function. Many activation functions have been proposed to date, but their performance differs depending on the environment in which they are applied. This paper compares and evaluates 12 activation functions to see which are effective when reinforcement learning is used to learn autonomous driving on highways. To this end, a performance evaluation method is presented and the average reward obtained with each activation function is compared. GELU yielded the highest average reward, and SiLU showed the lowest performance. The average reward difference between the two activation functions was 20%.
https://doi.org/10.7236/JIIBC.2022.22.5.63
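
The best and worst performers reported in this abstract, GELU and SiLU, can be written down in a few lines. The sketch below uses the exact erf-based form of GELU, which is an assumption: the abstract does not say whether the exact form or the tanh approximation was used, and the function names are illustrative.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gelu(x):
    """GELU (exact form): x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x):
    """SiLU (also called Swish with beta = 1): x * sigmoid(x)."""
    return x * sigmoid(x)
```

The two functions have very similar shapes, so a large reward gap between them would more likely reflect training dynamics in the specific environment than a gross difference in the functions themselves.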

Two tales of platoon intelligence for autonomous mobility control: Enabling deep learning recipes

Soohyun Park; Haemin Lee; Chanyoung Park; Soyi Jung; Minseok Choi; Joongheon Kim
- ETRI Journal / v.45 no.5 / pp.735-745 / 2023
This paper surveys recent multiagent reinforcement learning (MARL) and neural Myerson auction deep learning efforts to improve mobility control and resource management in autonomous ground and aerial vehicles. The MARL communication network (CommNet) was introduced to enable multiple agents to perform actions in a distributed manner to achieve shared goals by training all agents' states and actions in a single neural network. Additionally, the Myerson auction method guarantees trustworthiness among multiple agents to optimize rewards in highly dynamic systems. Our findings suggest that integrating MARL CommNet with Myerson techniques is much needed for improved efficiency and trustworthiness.
https://doi.org/10.4218/etrij.2023-0132

Comparison of Activation Functions using Deep Reinforcement Learning for Autonomous Driving on Intersection (교차로에서 자율주행을 위한 심층 강화 학습 활성화 함수 비교 분석)

Lee, Dongcheul
- The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.117-122 / 2021
Autonomous driving allows cars to drive without a human driver and is being studied very actively thanks to recent developments in artificial intelligence. Among artificial intelligence technologies, deep reinforcement learning is used most effectively. Deep reinforcement learning requires building a neural network with an appropriate activation function. Many activation functions have been proposed so far, but their performance differs depending on the field of application. This paper compares and evaluates which activation function is effective when deep reinforcement learning is used to learn autonomous driving on highways. To this end, the performance metrics used in the evaluation were defined, and the metric values for each activation function were compared in graphs. As a result, Mish yielded a higher average reward than the other activation functions, and the difference from the activation function with the lowest reward was 9.8%.
https://doi.org/10.7236/JIIBC.2021.21.6.117

Autonomous control of bicycle using Deep Deterministic Policy Gradient Algorithm (Deep Deterministic Policy Gradient 알고리즘을 응용한 자전거의 자율 주행 제어)

Choi, Seung Yoon; Le, Pham Tuyen; Chung, Tae Choong
- Convergence Security Journal / v.18 no.3 / pp.3-9 / 2018
The Deep Deterministic Policy Gradient (DDPG) algorithm learns by combining artificial neural networks and reinforcement learning. Among recently studied reinforcement learning methods, DDPG has the advantage that, because it learns off-policy, it prevents wrong actions from accumulating and affecting learning. In this study, we applied the DDPG algorithm to control a bicycle autonomously. Simulations were carried out in various environments, and the method used in the experiment was shown to work stably in simulation.

GAN-based Color Palette Extraction System by Chroma Fine-tuning with Reinforcement Learning

Kim, Sanghyuk; Kang, Suk-Ju
- Journal of Semiconductor Engineering / v.2 no.1 / pp.125-129 / 2021
As interest in deep learning grows, techniques for controlling image color in the image processing field are evolving with it. However, there is no clear standard for color, and it is not easy to find a way to represent only the color itself, as a color palette does. In this paper, we propose a novel color palette extraction system based on chroma fine-tuning with reinforcement learning. It helps to recognize the color combination that represents an input image. First, we use RGBY images to create feature maps by transferring a backbone network with well-trained model weights verified on super-resolution convolutional neural networks. Second, the feature maps are passed to three fully connected layers that are trained for color palette generation with a generative adversarial network (GAN). Third, we use a reinforcement learning method that changes only the chroma information of the GAN output by slightly moving each Y component of the YCbCr color gamut of pixel values up and down. The proposed method outperforms existing color palette extraction methods, achieving an accuracy of 0.9140.
https://doi.org/10.22895/jse.2020.0101

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance (종방향 주행성능향상을 위한 Latent SAC 강화학습 보상함수 설계)

Jo, Sung-Bean; Jeong, Han-You
- Journal of IKEEE / v.25 no.4 / pp.728-734 / 2021
In recent years, there has been strong interest in end-to-end autonomous driving based on deep reinforcement learning. In this paper, we present a reward function for latent soft actor-critic (SAC) deep reinforcement learning that improves the longitudinal driving performance of an agent vehicle. While the existing reward function significantly degrades driving safety and efficiency, the proposed reward function is shown to maintain an appropriate headway distance while avoiding collisions with the vehicle ahead.
https://doi.org/10.7471/ikeee.2021.25.4.728

Multi-Agent Deep Reinforcement Learning for Fighting Game: A Comparative Study of PPO and A2C

Yoshua Kaleb Purwanto; Dae-Ki Kang
- International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.192-198 / 2024
This paper investigates the application of multi-agent deep reinforcement learning in the fighting game Samurai Shodown using the Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C) algorithms. Initially, agents are trained separately for 200,000 timesteps using Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP) with LSTM networks. PPO demonstrates superior performance early on with stable policy updates, while A2C shows better adaptation and higher rewards over extended training periods, culminating in A2C outperforming PPO after 1,000,000 timesteps. These findings highlight PPO's effectiveness for short-term training and A2C's advantages in long-term learning scenarios, emphasizing the importance of algorithm selection based on training duration and task complexity. The code is available at https://github.com/Lexer04/Samurai-Shodown-with-Reinforcement-Learning-PPO.
https://doi.org/10.7236/IJIBC.2024.16.3.192

Deep Reinforcement Learning based Tourism Experience Path Finding

Kyung-Hee Park; Juntae Kim
- Journal of Platform Technology / v.11 no.6 / pp.21-27 / 2023
In this paper, we introduce a reinforcement learning-based algorithm for personalized tourist path recommendations. The algorithm employs a reinforcement learning agent to explore tourist regions and identify optimal paths that are expected to enhance tourism experiences. The concept of tourism experience is defined through points of interest (POI) located along tourist paths within the tourist area. These metrics are quantified through aggregated evaluation scores derived from reviews submitted by past visitors. In the experimental setup, the foundational learning model used to find tour paths is the Deep Q-Network (DQN). Despite the limited availability of historical tourist behavior data, the agent adeptly learns travel paths by incorporating preference scores of tourist POIs and spatial information of the travel area.
