http://dx.doi.org/10.9717/kmms.2018.21.4.441

Performance Comparison of Convolutional Neural Network by Weight Initialization and Parameter Update Method

Park, Sung-Wook (Dept. of Computer Eng., Sunchon National University)
Kim, Do-Yeon (Dept. of Computer Eng., Sunchon National University)
Abstract
Deep learning has been used for various processing tasks centered on image recognition. The convolutional neural network, one of the core algorithms of deep learning, is a deep neural network specialized for image recognition. In this paper, we use a convolutional neural network to classify forest insects and propose an optimization method. Experiments were carried out on combinations of two weight initialization methods and six parameter update methods. Among the twelve combinations tested, the Xavier-SGD pairing showed the highest performance, with an accuracy of 82.53%. From this, we conclude that the latest learning algorithms, which were designed to remedy the shortcomings of earlier parameter update methods, do not necessarily yield higher performance than existing methods in every application environment.
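To make the best-performing combination concrete, the following is a minimal NumPy sketch of Xavier initialization paired with a plain SGD update. The layer shape, learning rate, and the random stand-in gradient are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Xavier/Glorot uniform initialization: the bound scales with the
    # layer's fan-in and fan-out so activation variance stays stable
    # across layers.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def sgd_update(weights, grads, lr=0.01):
    # Vanilla SGD: step the weights against the gradient.
    return weights - lr * grads

# Hypothetical fully connected classification head of a small CNN.
W = xavier_init(256, 10)
grad_W = rng.normal(size=W.shape)  # stand-in for a backpropagated gradient
W = sgd_update(W, grad_W)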
Keywords
Convolutional Neural Network; Optimization; Weight Initialization; Parameter Update