
Performance Improvement Method of Convolutional Neural Network Using Agile Activation Function

  • Na-Young Kong (Department of Culture Technology, Jeonju University) ;
  • Young-Min Ko (Department of Artificial Intelligence, Jeonju University) ;
  • Sun-Woo Ko (Department of Smart Media, Jeonju University)
  • Received : 2020.03.27
  • Accepted : 2020.06.01
  • Published : 2020.07.31

Abstract

A convolutional neural network is composed of convolutional layers and fully connected layers, and a nonlinear activation function is applied in each of these layers. The activation function models how a neuron transmits information between neurons: the neuron passes the signal on when the input exceeds a certain threshold and sends nothing otherwise. Conventional activation functions have no direct relationship with the loss function, which slows the search for the optimal solution. To improve this, we propose an agile activation function that generalizes the conventional activation function. The agile activation function can improve the performance of a deep neural network by selecting the optimal agile parameter through learning: during backpropagation, the parameter is updated using the first derivative of the loss function with respect to the agile parameter. Through the MNIST classification problem, we confirm that the agile activation function outperforms conventional activation functions.
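The abstract does not spell out the functional form of the agile activation, so the following is a minimal sketch of the idea under stated assumptions: the activation is taken to be a sigmoid generalized with a learnable slope and shift (the names alpha and beta, and this form, are hypothetical), and the agile parameters are updated in the same backpropagation pass as the weights via the gradient step p ← p − η ∂L/∂p.

```python
import torch
import torch.nn as nn

class AgileSigmoid(nn.Module):
    """Sketch of an 'agile' activation: a sigmoid whose slope (alpha) and
    shift (beta) are learnable. The parameter names and the exact functional
    form are assumptions for illustration, not the paper's definition."""

    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(1.0))  # learnable slope
        self.beta = nn.Parameter(torch.tensor(0.0))   # learnable shift

    def forward(self, x):
        # Generalized sigmoid; alpha = 1, beta = 0 recovers the standard one.
        return torch.sigmoid(self.alpha * (x - self.beta))

# A small MNIST-sized classifier with the agile activation in its hidden layer.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    AgileSigmoid(),
    nn.Linear(128, 10),
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One training step on a stand-in batch; backpropagation fills .grad for the
# layer weights and for alpha/beta alike, so the optimizer's update
# p <- p - lr * dL/dp applies to the agile parameters too.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

Because alpha and beta are registered as ordinary parameters, autograd supplies the first derivative of the loss with respect to each agile parameter, which is all the learning rule described in the abstract requires; no custom backward pass is needed.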
