http://dx.doi.org/10.5909/JBE.2020.25.2.200

Compression and Performance Evaluation of CNN Models on Embedded Board  

Moon, Hyeon-Cheol (School of Electronics and Information Engineering, Korea Aerospace University)
Lee, Ho-Young (School of Electronics and Information Engineering, Korea Aerospace University)
Kim, Jae-Gon (School of Electronics and Information Engineering, Korea Aerospace University)
Publication Information
Journal of Broadcast Engineering / v.25, no.2, 2020, pp. 200-207
Abstract
Deep neural networks such as CNNs have recently shown excellent performance in fields such as image classification, object recognition, and visual quality enhancement. However, as the model size and computational complexity of deep learning models increase for most applications, it becomes hard to deploy neural networks in IoT and mobile environments. Neural network compression algorithms that reduce model size while preserving performance are therefore being actively studied. In this paper, we apply a few compression methods to CNN models and evaluate their performance in an embedded environment. Specifically, the MobileNetV2, ResNet50, and VGG-16 models are compressed by pruning and matrix decomposition. For the evaluation, the classification performance and inference time of the original and compressed CNN models are measured on images captured by a camera, using an embedded board equipped with the QCS605, a customized AI chip. The experimental results show that the compressed models reduce model size by 1.3 to 11.2 times with a classification performance loss of less than 2% relative to the original models, while also reducing inference time by 1.2 to 2.21 times and memory usage by 1.2 to 3.8 times on the embedded board.
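The two compression families named in the abstract can be illustrated on a single weight matrix. The following is a minimal NumPy sketch, not the authors' implementation: the abstract does not specify the pruning criterion or the decomposition used, so magnitude-based pruning and truncated SVD are assumed here as common stand-ins, and the function names, matrix shape, sparsity, and rank are all hypothetical.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to zero.

    `sparsity` is the fraction of entries to zero out (assumed criterion:
    plain magnitude pruning, without the fine-tuning a real pipeline would add).
    """
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # Value of the k-th smallest absolute weight, used as the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def low_rank_factorize(weights, rank):
    """Approximate an (m x n) matrix as A (m x rank) @ B (rank x n) via truncated SVD."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the singular values into the left factor
    b = vt[:rank, :]
    return a, b


def compression_ratio(original_params, compressed_params):
    """Size reduction expressed as 'N times', the convention used in the abstract."""
    return original_params / compressed_params


# Hypothetical fully connected layer with 256 x 512 weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512))

pruned = magnitude_prune(w, sparsity=0.8)
a, b = low_rank_factorize(w, rank=32)

print("nonzero weights after pruning:", np.count_nonzero(pruned), "of", w.size)
print("parameter reduction from factorization:",
      compression_ratio(w.size, a.size + b.size))
```

Pruning only shrinks a stored model when the zeros are encoded sparsely, whereas the factorized form replaces one dense layer with two smaller dense layers. This is one reason the two approaches can differ in measured inference time and memory use on a device.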
Keywords
CNN; Neural Network Compression; Pruning; Matrix Decomposition; Embedded Board;