http://dx.doi.org/10.29220/CSAM.2022.29.2.161

A review and comparison of convolution neural network models under a unified framework  

Park, Jimin (Memory Business, Samsung Electronics)
Jung, Yoonsuh (Department of Statistics, Korea University)
Publication Information
Communications for Statistical Applications and Methods, v.29, no.2, 2022, pp. 161-176
Abstract
There has been active research in image classification using deep convolutional neural network (CNN) models. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC, 2010-2017) was one of the most influential competitions driving the development of efficient deep learning algorithms. This paper introduces and compares six landmark models that achieved high prediction accuracy in the ILSVRC. First, we review the models to illustrate their unique structures and characteristics. We then compare the models under a unified framework; to this end, additional devices that are not crucial to each architecture are excluded. Four popular data sets with different characteristics are then used to measure prediction accuracy. By examining the characteristics of the data sets alongside those of the models, we provide some insight into the models' architectural features.
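To make the kind of comparison the abstract describes concrete, the sketch below trains a small CNN image classifier on CIFAR-10, one of the benchmark data sets used in the paper, and reports held-out prediction accuracy. This is a minimal illustration assuming the tf.keras API: the architecture, optimizer, and training settings here are illustrative choices, not the six ILSVRC models the paper actually compares.

# Illustrative only: a minimal CNN classifier on CIFAR-10.
# The architecture and hyperparameters are assumptions for demonstration,
# NOT the six ILSVRC models compared in the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

# CIFAR-10: 50,000 training and 10,000 test 32x32 RGB images in 10 classes.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# A small VGG-style stack: convolution + ReLU blocks with max pooling,
# followed by a dense classification head.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10),  # logits for the 10 CIFAR-10 classes
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Train briefly, then report test-set prediction accuracy, the comparison
# metric the paper uses across models and data sets.
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.3f}")

In the paper's actual study, each of the six models would be trained and evaluated in an analogous loop, with auxiliary devices removed so that accuracy differences reflect the core architectures rather than add-on tricks.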
Keywords
classification; convolutional neural network (CNN); ImageNet large-scale visual recognition challenge (ILSVRC); image data