http://dx.doi.org/10.5762/KAIS.2017.18.10.75

The Effect of Regularization and Identity Mapping on the Performance of Activation Functions

Ryu, Seo-Hyeon (Defense Agency of Technology and Quality)
Yoon, Jae-Bok (Defense Agency of Technology and Quality)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society, v.18, no.10, 2017, pp. 75-80
Abstract
In this paper, we describe the effect of regularization methods and of networks with identity mapping on the performance of activation functions in deep convolutional neural networks. Activation functions act as nonlinear transformations. Early convolutional neural networks used the sigmoid function. To overcome problems of existing activation functions, such as vanishing gradients, various alternatives were developed, including ReLU, Leaky ReLU, parametric ReLU, and ELU. To address overfitting, regularization methods such as dropout and batch normalization were developed alongside these activation functions, and data augmentation is also commonly applied in deep learning to avoid overfitting. Although the activation functions mentioned above have different characteristics, the newer regularization methods and networks with identity mapping were validated only with ReLU. Therefore, we experimentally show the effect of regularization and identity mapping on the performance of each activation function. Through this analysis, we present the tendency of activation-function performance according to regularization and identity mapping. These results can reduce the number of training trials needed to find the best activation function.
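For context, the activation functions compared in this study are simple elementwise nonlinear transformations, and identity mapping corresponds to a residual shortcut that adds a block's input back to its output. The following NumPy sketch only illustrates these definitions; the residual_block helper, the default negative slope of 0.01, and alpha of 1.0 are illustrative assumptions (common defaults), not the configurations used in the paper.

import numpy as np

# Elementwise activation functions discussed in the abstract.
def relu(x):
    # ReLU: passes positive inputs, zeroes out negative inputs.
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # Leaky ReLU: small fixed slope for negative inputs (0.01 is a common default).
    return np.where(x > 0, x, slope * x)

def parametric_relu(x, a):
    # Parametric ReLU: like Leaky ReLU, but the negative slope 'a' is learned.
    return np.where(x > 0, x, a * x)

def elu(x, alpha=1.0):
    # ELU: smooth exponential saturation toward -alpha for negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def residual_block(x, transform, activation=relu):
    # Identity mapping (residual shortcut): the input x is added back to the
    # transformed output, so the shortcut path carries the signal (and its
    # gradients) unchanged.
    return activation(transform(x) + x)

# Example: apply each activation to the same inputs.
x = np.linspace(-3.0, 3.0, 7)
print(relu(x))
print(leaky_relu(x))
print(parametric_relu(x, a=0.25))
print(elu(x))
print(residual_block(x, transform=lambda v: 0.5 * v))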
Keywords
Activation function; Convolutional neural network; Deep learning; Identity mapping; Regularization;