http://dx.doi.org/10.5392/JKCA.2021.21.03.616

Performance Improvement Method of Deep Neural Network Using Parametric Activation Functions  

Kong, Nayoung (Department of Culture Technology, Jeonju University)
Ko, Sunwoo (Department of Smart Media, Jeonju University)
Abstract
A deep neural network approximates an arbitrary function by first fitting a linear model and then repeatedly applying an additional nonlinear approximation through activation functions. In this process, the quality of the approximation is evaluated with a loss function. Existing deep learning methods take the loss function into account in the linear approximation stage, but the nonlinear approximation stage based on activation functions applies a nonlinear transformation that is unrelated to reducing the loss. This study proposes parametric activation functions that introduce a scale parameter, which changes the scale of the activation function, and a location parameter, which shifts its location. Introducing parametric activation functions based on these scale and location parameters improves the performance of the nonlinear approximation performed by the activation functions. In each hidden layer, the scale and location parameters are learned during backpropagation using the first derivative of the loss function with respect to those parameters, so that values minimizing the loss are found and the performance of the deep neural network improves. On the XOR problem and MNIST classification, the proposed parametric activation functions are shown to outperform existing activation functions.
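
The abstract does not give the exact functional form of the proposed activation. As an illustration only, the minimal PyTorch sketch below assumes a parametric sigmoid of the form sigmoid(a·(x − b)), where the scale a and the location b are trained by backpropagation alongside the layer weights, in the spirit of the description above; the module name, network size, and XOR setup are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ParametricSigmoid(nn.Module):
    """Sigmoid with a trainable scale (a) and location (b) parameter.

    Hypothetical formulation: y = sigmoid(a * (x - b)). Both parameters
    receive gradients of the loss during backpropagation, so the shape of
    the nonlinearity itself is adjusted to reduce the loss.
    """
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))      # a: controls steepness
        self.location = nn.Parameter(torch.zeros(1))  # b: shifts the activation

    def forward(self, x):
        return torch.sigmoid(self.scale * (x - self.location))

# Illustrative usage: a small network for the XOR problem.
model = nn.Sequential(nn.Linear(2, 4), ParametricSigmoid(),
                      nn.Linear(4, 1), nn.Sigmoid())
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.BCELoss()
for _ in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # gradients flow to scale/location as well as weights
    optimizer.step()
```

The same module could, under the same assumption, replace a fixed activation in an MNIST classifier; the point of the sketch is only that the scale and location parameters are updated by the same gradient-based learning process as the weights.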
Keywords
Deep Neural Network; Classification; Parametric Activation Function; Backpropagation; Learning