Activation Functions

Comparative analysis of activation functions within reinforcement learning for autonomous vehicles merging onto highways

  • Dongcheul Lee; Janise McNair
    • International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.63-71 / 2024
  • Deep reinforcement learning (RL) significantly influences autonomous vehicle development by optimizing decision-making and adaptation to complex driving environments through simulation-based training. Deep RL relies on an activation function, and many activation functions have been proposed, but their performance varies greatly with the application environment; finding the optimal activation function for a given environment is therefore important for effective learning. In this paper, we analyzed nine activation functions commonly used in RL to compare and evaluate which is most effective when deep RL is used to teach autonomous vehicles highway merging. To do this, we built a performance evaluation environment and compared the average reward obtained with each activation function. The results showed that the highest reward was achieved using Mish and the lowest using SELU, with a 10.3% difference in reward between the two.
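
The two endpoints reported above are easy to write down. A minimal sketch in Python/NumPy, assuming the standard published definitions of Mish and SELU (the paper's network and training setup are not reproduced here):

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x))
    return np.logaddexp(0.0, x)

def mish(x):
    # Mish(x) = x * tanh(softplus(x)) -- the top performer in the study above
    return x * np.tanh(softplus(x))

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # SELU(x) = scale * (x if x > 0 else alpha * (exp(x) - 1))
    return scale * np.where(x > 0, x, alpha * np.expm1(x))

x = np.linspace(-3, 3, 7)
print(mish(x))
print(selu(x))
```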

Effect of Nonlinear Transformations on Entropy of Hidden Nodes

  • Oh, Sang-Hoon
    • International Journal of Contents / v.10 no.1 / pp.18-22 / 2014
  • Hidden nodes play a key role in the information processing of feed-forward neural networks, in which inputs are processed through a series of weighted sums and nonlinear activation functions. To understand the role of hidden nodes, we must analyze the effect of the nonlinear activation functions on the weighted sums reaching them. In this paper, we examine the effect of nonlinear functions from the viewpoint of information theory. Under the assumption that the nonlinear activation function can be approximated piece-wise linearly, we prove that the entropy of the weighted sums decreases after passing through the piece-wise linear functions. We therefore argue that the nonlinear activation function decreases the uncertainty among hidden nodes. Furthermore, the more the hidden nodes are saturated, the more their entropy decreases. Based on this result, we can say that after successful training of feed-forward neural networks, hidden nodes tend to lie not in the linear regions but in the saturated regions of the activation function, with the effect of reducing uncertainty.
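
The single-segment case of this piece-wise argument can be written out directly, since differential entropy transforms in a known way under affine maps. A sketch, with h(·) denoting differential entropy and a the local slope of the activation:

```latex
% Differential entropy under an affine map $Y = aX + b$:
\[
  h(Y) = h(X) + \log\lvert a \rvert .
\]
% In a saturated region of the activation the local slope satisfies
% $\lvert a \rvert < 1$, hence $\log\lvert a \rvert < 0$ and therefore
\[
  h(Y) < h(X).
\]
```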

The Effect of regularization and identity mapping on the performance of activation functions (정규화 및 항등사상이 활성함수 성능에 미치는 영향)

  • Ryu, Seo-Hyeon; Yoon, Jae-Bok
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.10 / pp.75-80 / 2017
  • In this paper, we describe the effect of regularization methods and networks with identity mapping on the performance of activation functions in deep convolutional neural networks. Activation functions act as nonlinear transformations. Early convolutional neural networks used a sigmoid function. To overcome problems of existing activation functions such as gradient vanishing, various activation functions were developed, including ReLU, Leaky ReLU, parametric ReLU, and ELU. To address overfitting, regularization methods such as dropout and batch normalization were developed alongside these activation functions, and data augmentation is also commonly applied in deep learning to avoid overfitting. The activation functions above have different characteristics, yet the newer regularization methods and networks with identity mapping were validated only with ReLU. We therefore experimentally show the effect of the regularization method and the network with identity mapping on the performance of each activation function. Through this analysis, we present the tendency of activation function performance under regularization and identity mapping, which should reduce the number of training trials needed to find the best activation function.
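
As an illustration of the experimental setup this abstract describes, the sketch below combines batch normalization, identity mapping (a skip connection), and a pluggable activation in one block. The architecture is illustrative, not the paper's exact network:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Identity-mapping block with a pluggable activation.

    A minimal sketch (not the paper's exact network): batch normalization
    plus a skip connection, with the activation passed in so that ReLU,
    Leaky ReLU, parametric ReLU, ELU, etc. can be compared under the same
    regularization setup.
    """
    def __init__(self, channels, activation):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = activation

    def forward(self, x):
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + x)  # identity mapping (skip connection)

# Swap activations to reproduce the kind of comparison described above:
for act in [nn.ReLU(), nn.LeakyReLU(0.1), nn.PReLU(), nn.ELU()]:
    block = ResidualBlock(16, act)
    y = block(torch.randn(1, 16, 8, 8))
```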

Comparison of Reinforcement Learning Activation Functions to Maximize Rewards in Autonomous Highway Driving (고속도로 자율주행 시 보상을 최대화하기 위한 강화 학습 활성화 함수 비교)

  • Lee, Dongcheul
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.63-68 / 2022
  • Autonomous driving technology has recently made great progress with the introduction of deep reinforcement learning. To use deep reinforcement learning effectively, it is important to select an appropriate activation function. Many activation functions have been proposed, but their performance differs depending on the environment in which they are applied. This paper compares and evaluates the performance of 12 activation functions to determine which are effective when reinforcement learning is used to learn autonomous driving on highways. To this end, a performance evaluation method was presented and the average reward of each activation function was compared. As a result, GELU obtained the highest average reward and SiLU showed the lowest performance, with a 20% difference in average reward between the two.
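
A harness for this kind of comparison can be sketched briefly. The `make_agent` and `evaluate_average_reward` helpers below are hypothetical placeholders for the paper's RL algorithm and highway environment, which the abstract does not specify:

```python
import torch.nn as nn

# Candidate activations (a subset of the 12 compared in the paper above)
CANDIDATES = {
    "ReLU": nn.ReLU(),
    "GELU": nn.GELU(),   # best average reward in the study
    "SiLU": nn.SiLU(),   # worst average reward in the study
    "Tanh": nn.Tanh(),
}

def compare(make_agent, evaluate_average_reward, episodes=100):
    """Train one agent per activation and rank them by average reward.

    make_agent(activation) and evaluate_average_reward(agent, episodes)
    are hypothetical stand-ins for the environment and algorithm used in
    the paper, which the abstract does not detail.
    """
    results = {}
    for name, act in CANDIDATES.items():
        agent = make_agent(act)
        results[name] = evaluate_average_reward(agent, episodes)
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)
```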

Beta and Alpha Regularizers of Mish Activation Functions for Machine Learning Applications in Deep Neural Networks

  • Mathayo, Peter Beatus; Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication / v.14 no.1 / pp.136-141 / 2022
  • Very complex tasks in deep learning, such as image classification, must be solved with the help of neural networks and activation functions. As the backpropagation algorithm advances backward from the output layer towards the input layer, the gradients often get smaller and smaller and approach zero, which eventually leaves the weights of the initial or lower layers nearly unchanged; as a result, gradient descent never converges to the optimum. We propose a two-factor non-saturating activation function, known as Bea-Mish, for machine learning applications in deep neural networks. Our method uses two factors, beta (𝛽) and alpha (𝛼), to normalize the area below the boundary in the Mish activation function, and we refer to these elements as Bea. A clear understanding of the behaviors and conditions governing this regularization term can lead to a more principled approach for constructing better-performing activation functions. We evaluate Bea-Mish against the Mish and Swish activation functions on various models and data sets. Empirical results show that Bea-Mish outperforms native Mish, improving average precision (AP50val) by 2.51% with a SqueezeNet backbone on CIFAR-10 and top-1 accuracy by 1.20% with ResNet-50 on ImageNet-1k.
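
The abstract does not give the Bea-Mish formula, so the parameterization below is purely an assumption for illustration: baseline Mish, x · tanh(softplus(x)), with 𝛽 and 𝛼 inserted as scale factors on the softplus term and its argument:

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x))
    return np.logaddexp(0.0, x)

def mish(x):
    # Baseline: Mish(x) = x * tanh(softplus(x))
    return x * np.tanh(softplus(x))

def bea_mish(x, beta=1.0, alpha=1.0):
    # Hypothetical Bea-Mish: the abstract says beta and alpha normalize
    # the area under Mish but does not give the formula; here they are
    # assumed to scale the inner softplus term and its argument.
    return x * np.tanh(beta * softplus(alpha * x))

x = np.linspace(-4, 4, 9)
print(mish(x))
print(bea_mish(x, beta=1.5, alpha=0.5))
```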

Performance Improvement Method of Deep Neural Network Using Parametric Activation Functions (파라메트릭 활성함수를 이용한 심층신경망의 성능향상 방법)

  • Kong, Nayoung; Ko, Sunwoo
    • The Journal of the Korea Contents Association / v.21 no.3 / pp.616-625 / 2021
  • Deep neural networks approximate an arbitrary function with a linear model and then repeatedly refine the approximation using nonlinear activation functions, evaluating the quality of the approximation with a loss function. Existing deep learning methods take the loss function into account during the linear approximation step, but the nonlinear approximation step performed by the activation functions applies a fixed nonlinear transformation that is unrelated to reducing the loss. This study proposes parametric activation functions that introduce a scale parameter, which changes the scale of the activation function, and a location parameter, which shifts its location. Introducing these parameters improves the performance of the nonlinear approximation carried out by the activation functions. The scale and location parameters in each hidden layer are determined during training so as to minimize the loss, using the first derivative of the loss function with respect to the parameters in backpropagation. On the MNIST classification problem and the XOR problem, the parametric activation functions were found to outperform existing activation functions.
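
One way to realize such a function in code is a wrapper with a learnable scale and location that backpropagation tunes along with the network weights. A minimal sketch, assuming the form base_fn(scale · (x - location)); the paper's exact parameterization may differ:

```python
import torch
import torch.nn as nn

class ParametricActivation(nn.Module):
    """Activation with learnable scale and location parameters.

    Sketch only: applies base_fn(scale * (x - location)), with both
    parameters updated by backpropagation like any other weight. The
    paper's exact placement of the parameters may differ.
    """
    def __init__(self, base_fn=torch.sigmoid):
        super().__init__()
        self.base_fn = base_fn
        self.scale = nn.Parameter(torch.tensor(1.0))
        self.location = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.base_fn(self.scale * (x - self.location))

# The scale/location parameters receive gradients automatically:
act = ParametricActivation()
loss = act(torch.randn(4, 3)).sum()
loss.backward()
print(act.scale.grad, act.location.grad)
```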

Performance Improvement Method of Fully Connected Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 완전연결신경망의 성능 향상)

  • Ko, Young Min; Li, Peng Hang; Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.1 / pp.1-10 / 2022
  • Deep neural networks are widely used to solve various problems. In a fully connected neural network, the nonlinear activation function transforms the input value nonlinearly and outputs it. The nonlinear activation function plays an important role in solving nonlinear problems, and various nonlinear activation functions have been studied. In this study, we propose a combined parametric activation function that can improve the performance of a fully connected neural network. Combined parametric activation functions are created by simply adding parametric activation functions. A parametric activation function applies parameters that change the scale and location of the activation function according to the input data and can be optimized in the direction of minimizing the loss function. By combining parametric activation functions, more diverse nonlinear intervals can be created, and the parameters of each component can be optimized jointly to minimize the loss. The performance of the combined parametric activation function was tested on the MNIST and Fashion MNIST classification problems, and the results confirmed that it performs better than existing nonlinear activation functions and single parametric activation functions.
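
Since the combination is described as a simple sum of parametric activation functions, it can be sketched as k independently parameterized units added together. The exact form is an assumption:

```python
import torch
import torch.nn as nn

class CombinedParametricActivation(nn.Module):
    """Sum of k parametric activations, each with its own scale/location.

    Sketch only: out(x) = sum_i base_fn(scale_i * (x - location_i)).
    Summing independently parameterized units creates the more diverse
    nonlinear intervals described above; the exact form is an assumption.
    """
    def __init__(self, k=2, base_fn=torch.sigmoid):
        super().__init__()
        self.base_fn = base_fn
        self.scales = nn.Parameter(torch.ones(k))
        # Spread the locations so the k units start out distinct.
        self.locations = nn.Parameter(torch.linspace(-1.0, 1.0, k))

    def forward(self, x):
        # Broadcast x against the k parameter sets, then sum the units.
        u = self.base_fn(self.scales * (x.unsqueeze(-1) - self.locations))
        return u.sum(dim=-1)

act = CombinedParametricActivation(k=3)
print(act(torch.randn(4, 8)).shape)  # torch.Size([4, 8])
```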

Comparison of Image Classification Performance by Activation Functions in Convolutional Neural Networks (컨벌루션 신경망에서 활성 함수가 미치는 영상 분류 성능 비교)

  • Park, Sung-Wook; Kim, Do-Yeon
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1142-1149 / 2018
  • Recently, computer vision applications using CNNs, one of the deep learning algorithms, have been increasing. However, CNNs do not provide perfect classification performance due to the gradient vanishing problem. Most CNN algorithms use an activation function called ReLU to mitigate this problem. In this study, four activation functions that can replace ReLU were applied to four networks with different structures. Across 20 experiments, ReLU showed the lowest performance in accuracy, loss, and speed of initial learning convergence. We conclude that although the optimal activation function varied from network to network, all four alternatives scored higher than ReLU.
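
A sketch of the kind of experiment described above: a small CNN whose activation is a constructor argument, so the same architecture can be retrained once per candidate. The architecture and the candidate list are illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

def make_cnn(activation_cls, num_classes=10):
    # Illustrative architecture; the paper's four networks are not
    # specified in the abstract. activation_cls is instantiated fresh
    # for each layer.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), activation_cls(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), activation_cls(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, num_classes),  # assumes 32x32 inputs
    )

# Retrain the same architecture once per candidate activation:
for act in [nn.ReLU, nn.ELU, nn.LeakyReLU, nn.SELU, nn.Tanh]:
    model = make_cnn(act)
    logits = model(torch.randn(2, 3, 32, 32))
```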

Comparison of Reinforcement Learning Activation Functions to Improve the Performance of the Racing Game Learning Agent

  • Lee, Dongcheul
    • Journal of Information Processing Systems / v.16 no.5 / pp.1074-1082 / 2020
  • Recently, research has been actively conducted on artificial intelligence agents that learn games through reinforcement learning. Several factors determine performance when an agent learns a game, and the choice of activation function is one of them. This paper compares and evaluates which activation function gives the best results when an agent learns a game through reinforcement learning in a 2D racing game environment. We built the agent using a reinforcement learning algorithm and a neural network, and evaluated the activation functions by swapping them in the network one at a time. We measured the reward, the output of the advantage function, and the output of the loss function during training and testing. As a result of the performance evaluation, we identified the best activation function for the agent to learn the game; the difference between the best and the worst was 35.4%.
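
Of the three quantities measured above, the advantage is the least self-explanatory. A minimal sketch of the one-step advantage estimate commonly used in actor-critic methods; the paper's exact estimator is not stated:

```python
import numpy as np

def one_step_advantage(rewards, values, gamma=0.99):
    """A(s_t, a_t) ~ r_t + gamma * V(s_{t+1}) - V(s_t).

    rewards: r_0..r_{T-1}; values: V(s_0)..V(s_T), with a bootstrap value
    appended at the end. This one-step form is common in actor-critic RL;
    the paper's exact estimator is not given in the abstract.
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    return rewards + gamma * values[1:] - values[:-1]

adv = one_step_advantage([1.0, 0.0, 2.0], [0.5, 0.4, 0.9, 0.0])
print(adv)  # per-step advantage estimates
```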

Performance Improvement Method of Convolutional Neural Network Using Combined Parametric Activation Functions (결합된 파라메트릭 활성함수를 이용한 합성곱 신경망의 성능 향상)

  • Ko, Young Min; Li, Peng Hang; Ko, Sun Woo
    • KIPS Transactions on Software and Data Engineering / v.11 no.9 / pp.371-380 / 2022
  • Convolutional neural networks are widely used to process data arranged in a grid, such as images. A typical convolutional neural network consists of convolutional layers and fully connected layers, each containing a nonlinear activation function. This paper proposes a combined parametric activation function to improve the performance of convolutional neural networks. The combined parametric activation function is created by adding parametric activation functions whose parameters convert the scale and location of the activation function. Various nonlinear intervals can be created according to the multiple scale and location parameters, and the parameters can be learned in the direction of minimizing the loss function computed from the given input data. Testing the convolutional neural network with the combined parametric activation function on the MNIST, Fashion MNIST, CIFAR10, and CIFAR100 classification problems confirmed that it performs better than other activation functions.
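
Carrying the combined parametric activation into a convolutional network mostly amounts to inserting it after each convolution. The per-channel variant below is a sketch; whether the paper shares parameters across channels or learns them per channel is not stated, so the channel-wise choice here is an assumption:

```python
import torch
import torch.nn as nn

class ChannelwiseCombinedActivation(nn.Module):
    """Per-channel variant of the combined parametric activation.

    Sketch only: each of the C channels gets its own k scale/location
    pairs, shaped to broadcast over NCHW feature maps. Whether the paper
    learns per-channel or shared parameters is an assumption here.
    """
    def __init__(self, channels, k=2):
        super().__init__()
        # Shapes broadcast against (N, C, H, W, k).
        self.scales = nn.Parameter(torch.ones(channels, 1, 1, k))
        self.locations = nn.Parameter(
            torch.linspace(-1.0, 1.0, k).repeat(channels, 1, 1, 1))

    def forward(self, x):
        u = torch.sigmoid(self.scales * (x.unsqueeze(-1) - self.locations))
        return u.sum(dim=-1)

# Illustrative conv net; the paper's exact architecture is not specified.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), ChannelwiseCombinedActivation(8, k=2),
    nn.MaxPool2d(2), nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),  # assumes 28x28 inputs (e.g., MNIST)
)
print(model(torch.randn(2, 1, 28, 28)).shape)  # torch.Size([2, 10])
```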