Search | Korea Science

Learning Less Random to Learn Better in Deep Reinforcement Learning with Noisy Parameters

Kim, Chayoung
- Journal of Advanced Information Technology and Convergence / v.9 no.1 / pp.127-134 / 2019
In deep reinforcement learning (RL), exploration is carried out by selecting actions stochastically in the state space. Exploitation, on the other hand, relies on behaviors that generalize well. Balancing exploration and exploitation is extremely important for good results. Random action selection with ε-greedy has been regarded as the de facto method for exploration. An alternative is to add noise parameters to the neural network for richer exploration. However, it is not easy to predict or detect over-fitting under stochastic exploration in the perturbed neural network. Moreover, well-trained RL agents do not necessarily prevent or detect over-fitting in the neural network. Therefore, we suggest a novel deep RL design that balances exploration with dropout to reduce over-fitting in perturbed neural networks.
https://doi.org/10.14801/JAITC.2018.9.1.127
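
For readers unfamiliar with the ε-greedy baseline mentioned in the abstract above, the following is a minimal generic sketch of ε-greedy action selection. It is not taken from the paper: the function name `epsilon_greedy`, the list-of-Q-values interface, and the optional `rng` argument are assumptions made only for illustration.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: Optional[random.Random] = None) -> int:
    """Return an action index chosen by the epsilon-greedy rule.

    Hypothetical helper for illustration only; it is not the paper's code.
    With probability `epsilon` a uniformly random action is returned
    (exploration); otherwise the action with the largest estimated value
    is returned (exploitation).
    """
    source = rng if rng is not None else random
    if source.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return source.randrange(len(q_values))
    # Exploit: pick the index of the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Setting `epsilon` to 0 makes the choice purely greedy, while 1 makes it purely random; intermediate values trade off the two.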

Comparative analysis of activation functions within reinforcement learning for autonomous vehicles merging onto highways

Dongcheul Lee;Janise McNair
- International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.63-71 / 2024
Deep reinforcement learning (RL) significantly influences autonomous vehicle development by optimizing decision-making and adaptation to complex driving environments through simulation-based training. In deep RL, an activation function is used, and various activation functions have been proposed, but their performance varies greatly depending on the application environment. Therefore, finding the optimal activation function according to the environment is important for effective learning. In this paper, we analyzed nine commonly used activation functions for RL to compare and evaluate which activation function is most effective when using deep RL for autonomous vehicles to learn highway merging. To do this, we built a performance evaluation environment and compared the average reward of each activation function. The results showed that the highest reward was achieved using Mish, and the lowest using SELU. The difference in reward between the two activation functions was 10.3%.
https://doi.org/10.7236/IJIBC.2024.16.1.63
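
As a reference for the activation functions named in these listings (Mish, SELU, GELU, SiLU), here is a small sketch of their standard scalar definitions using only the Python standard library. It does not reproduce any paper's experimental setup; the function names are chosen for illustration, and the code is not hardened against overflow for very large inputs.

```python
import math

# Standard SELU constants (rounded as commonly published).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def softplus(x: float) -> float:
    """softplus(x) = ln(1 + e^x)."""
    return math.log1p(math.exp(x))


def mish(x: float) -> float:
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def silu(x: float) -> float:
    """SiLU (also called Swish-1): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    """GELU (exact form): x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def selu(x: float) -> float:
    """SELU: scale * x for x > 0, else scale * alpha * (e^x - 1)."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * math.expm1(x))
```

All four functions pass through the origin and differ mainly in how they treat negative inputs, which is one plausible reason their performance varies by environment.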

Comparison of Reinforcement Learning Activation Functions to Maximize Rewards in Autonomous Highway Driving (고속도로 자율주행 시 보상을 최대화하기 위한 강화 학습 활성화 함수 비교)

Lee, Dongcheul
- The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.63-68 / 2022
Autonomous driving technology has recently made great progress with the introduction of deep reinforcement learning. To use deep reinforcement learning effectively, it is important to select an appropriate activation function. Many activation functions have been proposed so far, but their performance differs depending on the environment in which they are applied. This paper compares and evaluates the performance of 12 activation functions to see which are effective when using reinforcement learning to learn autonomous driving on highways. To this end, a performance evaluation method is presented and the average reward of each activation function is compared. GELU yielded the highest average reward, and SiLU showed the lowest performance. The average reward difference between the two activation functions was 20%.
https://doi.org/10.7236/JIIBC.2022.22.5.63

Two tales of platoon intelligence for autonomous mobility control: Enabling deep learning recipes

Soohyun Park;Haemin Lee;Chanyoung Park;Soyi Jung;Minseok Choi;Joongheon Kim
- ETRI Journal / v.45 no.5 / pp.735-745 / 2023
This paper surveys recent multiagent reinforcement learning and neural Myerson auction deep learning efforts to improve mobility control and resource management in autonomous ground and aerial vehicles. The multiagent reinforcement learning communication network (CommNet) was introduced to enable multiple agents to perform actions in a distributed manner to achieve shared goals by training all agents' states and actions in a single neural network. Additionally, the Myerson auction method guarantees trustworthiness among multiple agents to optimize rewards in highly dynamic systems. Our findings suggest that the integration of MARL CommNet and Myerson techniques is very much needed for improved efficiency and trustworthiness.
https://doi.org/10.4218/etrij.2023-0132

Comparison of Activation Functions using Deep Reinforcement Learning for Autonomous Driving on Intersection (교차로에서 자율주행을 위한 심층 강화 학습 활성화 함수 비교 분석)

Lee, Dongcheul
- The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.6 / pp.117-122 / 2021
Autonomous driving allows cars to drive without people and is being studied very actively thanks to recent developments in artificial intelligence technology. Among artificial intelligence technologies, deep reinforcement learning is used most effectively. Deep reinforcement learning requires building a neural network with an appropriate activation function. Many activation functions have been suggested so far, but their performance differs depending on the field of application. This paper compares and evaluates activation functions to determine which is effective when using deep reinforcement learning to learn autonomous driving on highways. To this end, the performance metrics used in the evaluation were defined, and the metric values for each activation function were compared in graphs. As a result, Mish yielded a higher reward on average than the other activation functions, and the difference from the activation function with the lowest reward was 9.8%.
https://doi.org/10.7236/JIIBC.2021.21.6.117

Autonomous control of bicycle using Deep Deterministic Policy Gradient Algorithm (Deep Deterministic Policy Gradient 알고리즘을 응용한 자전거의 자율 주행 제어)

Choi, Seung Yoon;Le, Pham Tuyen;Chung, Tae Choong
- Convergence Security Journal / v.18 no.3 / pp.3-9 / 2018
The Deep Deterministic Policy Gradient (DDPG) algorithm is an algorithm that learns by using artificial neural networks and reinforcement learning. Among the studies related to reinforcement learning, which has been recently studied, the DDPG algorithm has an advantage of preventing the cases where the wrong actions are accumulated and affecting the learning because it is learned by the off-policy. In this study, we experimented to control the bicycle autonomously by applying the DDPG algorithm. Simulation was carried out by setting various environments and it was shown that the method used in the experiment works stably on the simulation.

GAN-based Color Palette Extraction System by Chroma Fine-tuning with Reinforcement Learning

Kim, Sanghyuk;Kang, Suk-Ju
- Journal of Semiconductor Engineering / v.2 no.1 / pp.125-129 / 2021
As interest in deep learning grows, techniques for controlling image color in the image processing field are evolving with it. However, there is no clear standard for color, and it is not easy to find a way to represent only the color itself, as a color palette does. In this paper, we propose a novel color palette extraction system that uses chroma fine-tuning with reinforcement learning. It helps to recognize the color combination that represents an input image. First, we use RGBY images to create feature maps by transferring a backbone network with well-trained model weights verified on super-resolution convolutional neural networks. Second, the feature maps are used to train three fully connected layers for color-palette generation with a generative adversarial network (GAN). Third, we use a reinforcement learning method that only changes the chroma information of the GAN output by slightly moving each Y component of the YCbCr color gamut of pixel values up and down. The proposed method outperforms existing color palette extraction methods, achieving an accuracy of 0.9140.
https://doi.org/10.22895/jse.2020.0101

On the Reward Function of Latent SAC Reinforcement Learning to Improve Longitudinal Driving Performance (종방향 주행성능향상을 위한 Latent SAC 강화학습 보상함수 설계)

Jo, Sung-Bean;Jeong, Han-You
- Journal of IKEEE / v.25 no.4 / pp.728-734 / 2021
In recent years, there has been strong interest in end-to-end autonomous driving based on deep reinforcement learning. In this paper, we present a reward function for latent SAC deep reinforcement learning to improve the longitudinal driving performance of an agent vehicle. While the existing reward function significantly degrades driving safety and efficiency, the proposed reward function is shown to maintain an appropriate headway distance while avoiding collision with the front vehicle.
https://doi.org/10.7471/ikeee.2021.25.4.728

Effective Utilization of Domain Knowledge for Relational Reinforcement Learning (관계형 강화 학습을 위한 도메인 지식의 효과적인 활용)

Kang, MinKyo;Kim, InCheol
- KIPS Transactions on Software and Data Engineering / v.11 no.3 / pp.141-148 / 2022
Recently, reinforcement learning combined with deep neural network technology has achieved remarkable success in various fields, including board games such as Go and chess, computer games such as Atari and StarCraft, and robot object manipulation tasks. However, such deep reinforcement learning describes states, actions, and policies in vector representation. Therefore, existing deep reinforcement learning has some limitations in the generality and interpretability of the learned policy, and it is difficult to effectively incorporate domain knowledge into policy learning. On the other hand, dNL-RRL, a new relational reinforcement learning framework proposed to solve these problems, uses a kind of vector representation for sensor input data and lower-level motion control, as in existing deep reinforcement learning. However, for states, actions, and learned policies, it uses a relational representation with logic predicates and rules. In this paper, we present dNL-RRL-based policy learning for transportation mobile robots in a manufacturing environment. In particular, this study proposes an effective method to utilize the prior domain knowledge of human experts to improve the efficiency of relational reinforcement learning. Through various experiments, we demonstrate the performance improvement of relational reinforcement learning obtained by using domain knowledge as proposed in this paper.
https://doi.org/10.3745/KTSDE.2022.11.3.141

Deep reinforcement learning for base station switching scheme with federated LSTM-based traffic predictions

Hyebin Park;Seung Hyun Yoon
- ETRI Journal / v.46 no.3 / pp.379-391 / 2024
To meet increasing traffic requirements in mobile networks, small base stations (SBSs) are densely deployed, overlapping existing network architecture and increasing system capacity. However, densely deployed SBSs increase energy consumption and interference. Although these problems already exist because of densely deployed SBSs, even more SBSs are needed to meet increasing traffic demands. Hence, base station (BS) switching operations have been used to minimize energy consumption while guaranteeing quality-of-service (QoS) for users. In this study, to optimize energy efficiency, we propose the use of deep reinforcement learning (DRL) to create a BS switching operation strategy with a traffic prediction model. First, a federated long short-term memory (LSTM) model is introduced to predict user traffic demands from user trajectory information. Next, the DRL-based BS switching operation scheme determines the switching operations for the SBSs using the predicted traffic demand. Experimental results confirm that the proposed scheme outperforms existing approaches in terms of energy efficiency, signal-to-interference noise ratio, handover metrics, and prediction performance.
https://doi.org/10.4218/etrij.2023-0065

