An Improved Image Classification Using Batch Normalization and CNN

  • Ji, Myunggeun (Department of Computer Science, Kyonggi University)
  • Chun, Junchul (Department of Computer Science, Kyonggi University)
  • Kim, Namgi (Department of Computer Science, Kyonggi University)
  • Received : 2017.12.26
  • Accepted : 2018.04.04
  • Published : 2018.06.30

Abstract

Deep learning is known to achieve high accuracy among the available methods for image classification. In this paper, we propose a method that improves the classification accuracy of a deep convolutional neural network (CNN) by adding a batch normalization layer to an existing network. Batch normalization computes the mean and variance of each batch and uses them to shift the activations, reducing the bias that arises in each layer. To demonstrate the advantage of the proposed method, we measure accuracy and mean average precision (mAP) in image classification experiments on five image data sets: SHREC13, MNIST, SVHN, CIFAR-10, and CIFAR-100. The experimental results show that the CNN with batch normalization achieves higher classification accuracy and mAP than the conventional CNN.

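The batch normalization step described in the abstract can be illustrated with a minimal NumPy sketch. It follows the standard training-time formulation (per-feature batch mean and variance, then a learnable scale and shift) and is not the authors' implementation; the function name `batch_norm`, the parameters `gamma`, `beta`, and `eps`, and the synthetic input are assumptions made for illustration.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for x of shape (batch, features).

    gamma, beta and eps are illustrative names (assumed, not from the paper).
    """
    mean = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                      # per-feature (biased) variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta              # learnable scale and shift

# Synthetic activations with a non-zero mean and non-unit variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 8))
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```

With `gamma` set to ones and `beta` to zeros, each feature column of `y` has approximately zero mean and unit variance across the batch. A full layer would also track running statistics for use at inference time, which this sketch omits.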
