Performance Improvement Method of Convolutional Neural Network Using Combined Parametric Activation Functions

  • Received : 2021.12.16
  • Accepted : 2022.03.18
  • Published : 2022.09.30

Abstract

Convolutional neural networks are widely used to process data arranged in a grid, such as images. A typical convolutional neural network consists of convolutional layers and fully connected layers, and each layer contains a nonlinear activation function. This paper proposes a combined parametric activation function to improve the performance of convolutional neural networks. The combined parametric activation function is constructed by summing parametric activation functions, each of which applies trainable parameters that change the scale and location of a base activation function. The multiple scale and location parameters produce a variety of nonlinear intervals, and the parameters are learned in the direction that minimizes the loss function computed from the given input data. Experiments on the MNIST, Fashion MNIST, CIFAR10, and CIFAR100 classification problems confirm that convolutional neural networks using the combined parametric activation function outperform those using other activation functions.
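
To illustrate the construction described in the abstract, the sketch below shows one way such a combined parametric activation function could be used as a drop-in module in a convolutional network. This is a minimal sketch in PyTorch, not the authors' implementation: it assumes the combined function takes the form g(x) = Σ_k a_k · f(b_k · x + c_k), where f is a fixed base activation (tanh here) and a_k, b_k, c_k are trainable scale and location parameters updated by ordinary loss minimization. The module name, parameter layout, and choice of base function are illustrative assumptions.

    import torch
    import torch.nn as nn

    class CombinedParametricActivation(nn.Module):
        """Sum of K scaled and shifted copies of a base activation.

        Sketch of g(x) = sum_k a_k * f(b_k * x + c_k), with trainable
        scalars a_k, b_k, c_k; the paper's exact parameterization may differ.
        """
        def __init__(self, num_terms: int = 2, base=torch.tanh):
            super().__init__()
            self.base = base
            # Trainable scale/location parameters, one triplet per term.
            self.a = nn.Parameter(torch.ones(num_terms))    # output scale
            self.b = nn.Parameter(torch.ones(num_terms))    # input scale
            self.c = nn.Parameter(torch.zeros(num_terms))   # input shift

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Apply each scaled/shifted base activation and sum the terms.
            out = torch.zeros_like(x)
            for a_k, b_k, c_k in zip(self.a, self.b, self.c):
                out = out + a_k * self.base(b_k * x + c_k)
            return out

    # Usage: replace a fixed activation inside a CNN with the module.
    if __name__ == "__main__":
        layer = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            CombinedParametricActivation(num_terms=3),
        )
        y = layer(torch.randn(8, 3, 32, 32))
        print(y.shape)  # torch.Size([8, 16, 32, 32])

Because the scale and location parameters are ordinary module parameters, they are trained jointly with the convolutional weights by backpropagation, which is how the nonlinear intervals described above adapt to the data.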
