http://dx.doi.org/10.29220/CSAM.2022.29.2.161

A review and comparison of convolution neural network models under a unified framework  

Park, Jimin (Memory Business, Samsung Electronics)
Jung, Yoonsuh (Department of Statistics, Korea University)
Publication Information
Communications for Statistical Applications and Methods, v.29, no.2, 2022, pp. 161-176
Abstract
There has been active research in image classification using deep convolutional neural network (CNN) models. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC, 2010-2017) was one of the most influential competitions driving the development of efficient deep learning algorithms. This paper introduces and compares six landmark models that achieved high prediction accuracy in the ILSVRC. First, we review the models to illustrate their unique structures and characteristics. We then compare the models under a unified framework; to this end, additional devices that are not crucial to each architecture are excluded. Four popular data sets with different characteristics are then used to measure prediction accuracy. By examining the characteristics of the data sets alongside those of the models, we provide some insight into the models' architectural features.
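To make the kind of comparison the abstract describes concrete, the sketch below trains a small CNN image classifier on CIFAR-10, one of the benchmark data sets used in the paper, and reports held-out prediction accuracy. This is a minimal illustration assuming the tf.keras API: the architecture, optimizer, and training settings here are illustrative choices, not the six ILSVRC models the paper actually compares.

# Illustrative only: a minimal CNN classifier on CIFAR-10.
# The architecture and hyperparameters are assumptions for demonstration,
# NOT the six ILSVRC models compared in the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

# CIFAR-10: 50,000 training and 10,000 test 32x32 RGB images in 10 classes.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# A small VGG-style stack: convolution + ReLU blocks with max pooling,
# followed by a dense classification head.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10),  # logits for the 10 CIFAR-10 classes
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Train briefly, then report test-set prediction accuracy, the comparison
# metric the paper uses across models and data sets.
model.fit(x_train, y_train, epochs=5, validation_split=0.1)
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.3f}")

In the paper's actual study, each of the six models would be trained and evaluated in an analogous loop, with auxiliary devices removed so that accuracy differences reflect the core architectures rather than add-on tricks.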
Keywords
classification; convolutional neural network (CNN); ImageNet large-scale visual recognition challenge (ILSVRC); image data