DOI: http://dx.doi.org/10.7746/jkros.2022.17.2.133

Neural Network Model Compression Algorithms for Image Classification in Embedded Systems  

Shin, Heejung (UNIST)
Oh, Hyondong (Mechanical Engineering, UNIST)
Publication Information
The Journal of Korea Robotics Society, vol. 17, no. 2, 2022, pp. 133-141
Abstract
This paper introduces model compression algorithms that make a deep neural network smaller and faster for embedded systems. Model compression algorithms can be broadly categorized into pruning, quantization, and knowledge distillation. This study integrates gradual pruning, quantization-aware training, and a knowledge distillation method that transfers the activation boundaries formed by the hidden neurons of the teacher network. Once a large deep neural network is compressed and accelerated by these algorithms, embedded computing boards can run it much faster and with less memory while preserving reasonable accuracy. To evaluate the compressed networks, we measure the size, latency, and accuracy of DenseNet201, a deep neural network for image classification, trained on the CIFAR-10 dataset and deployed on an NVIDIA Jetson Xavier.
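
As an illustration of the first of these three components, the short PyTorch sketch below implements gradual magnitude pruning with the cubic sparsity schedule of Zhu and Gupta ("To prune, or not to prune", 2017). This is a minimal sketch, not the authors' implementation: the toy model, random stand-in batches, and hyperparameters (target sparsity 90% ramped over 100 steps) are assumptions for demonstration only.

# Minimal sketch of gradual magnitude pruning (cubic schedule of Zhu & Gupta).
# Model, data, and hyperparameters below are illustrative placeholders.
import torch
import torch.nn as nn

def target_sparsity(t, t0=0, tn=100, s_i=0.0, s_f=0.9):
    """Cubic ramp: s(t) = s_f + (s_i - s_f) * (1 - (t - t0)/(tn - t0))**3."""
    if t <= t0:
        return s_i
    if t >= tn:
        return s_f
    frac = (t - t0) / (tn - t0)
    return s_f + (s_i - s_f) * (1.0 - frac) ** 3

@torch.no_grad()
def magnitude_mask(weight, sparsity):
    """Return a {0,1} mask that zeros the smallest-magnitude fraction of weights."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

# Toy classifier; CIFAR-10-sized inputs (3x32x32, 10 classes) are assumed.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                      nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(8, 3, 32, 32)       # stand-in for a CIFAR-10 batch
    y = torch.randint(0, 10, (8,))
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Re-apply the mask after each update so pruned weights stay zero
    # while the surviving weights continue to train.
    s = target_sparsity(step)
    with torch.no_grad():
        for m in model.modules():
            if isinstance(m, nn.Linear):
                m.weight.mul_(magnitude_mask(m.weight, s))

In practice the same masking would be applied to the convolutional layers of a network such as DenseNet201 during fine-tuning, with the sparsity ramp spread over many epochs rather than 100 steps.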
Keywords
Deep Learning; Model Compression; Pruning; Quantization; Knowledge Distillation; Embedded System