Neural Network Model Compression Algorithms for Image Classification in Embedded Systems

  • Received : 2022.03.07
  • Accepted : 2022.03.25
  • Published : 2022.05.31

Abstract

This paper introduces model compression algorithms that make a deep neural network smaller and faster for embedded systems. Model compression algorithms can be broadly categorized into pruning, quantization, and knowledge distillation. In this study, we integrate gradual pruning, quantization-aware training, and knowledge distillation that transfers the activation boundaries formed by the hidden neurons of the teacher network. Once a large deep neural network is compressed and accelerated by these algorithms, an embedded computing board can run it much faster with less memory while preserving reasonable accuracy. To evaluate the compressed networks, we measure the size, latency, and accuracy of DenseNet201 trained for image classification on the CIFAR-10 dataset, running on an NVIDIA Jetson Xavier.
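
For readers who want a concrete picture of one of these techniques, the sketch below illustrates the gradual pruning schedule of Zhu and Gupta [13] that this study adopts: the target sparsity rises from an initial to a final value along a cubic polynomial over training, and at each pruning step the smallest-magnitude weights are masked to zero. This is a minimal PyTorch sketch under our own assumptions; the names polynomial_sparsity and magnitude_mask and all hyperparameter values are illustrative, not taken from the paper.

import torch

def polynomial_sparsity(step, start_step, end_step, s_init=0.0, s_final=0.9):
    # Target sparsity s_t = s_f + (s_i - s_f) * (1 - progress)^3, as in [13].
    if step <= start_step:
        return s_init
    if step >= end_step:
        return s_final
    progress = (step - start_step) / (end_step - start_step)
    return s_final + (s_init - s_final) * (1.0 - progress) ** 3

def magnitude_mask(weight, sparsity):
    # Binary mask that zeroes out the smallest-magnitude fraction of weights.
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

# Usage: during fine-tuning, recompute the mask every few hundred steps and
# multiply it into the layer weights so pruned connections stay at zero.
weight = torch.randn(256, 256)
for step in range(0, 10001, 1000):
    s = polynomial_sparsity(step, start_step=0, end_step=10000)
    weight = weight * magnitude_mask(weight, s)

Because the sparsity target rises slowly, the network has time to recover accuracy between pruning steps, which is what makes high final sparsity attainable with this family of methods.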

Acknowledgement

This work was supported by the Theater Defense Research Center, funded by the Defense Acquisition Program Administration under Grant UD200043CD.

References

  1. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Communications of the ACM, vol. 60, no. 6, pp. 84-90, May, 2017, DOI: 10.1145/3065386.
  2. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-scale Hierarchical Image Database," 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 2009, DOI: 10.1109/CVPR.2009.5206848.
  3. K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, DOI: 10.1109/CVPR.2016.90.
  4. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale," International Conference on Learning Representations, Vienna, Austria, 2021. [Online], https://openreview.net/forum?id=YicbFdNTTy.
  5. S. Oh, H. Kim, S. Cho, J. You, Y. Kwon, W.-S. Ra, and Y.-K. Kim, "Development of a Compressed Deep Neural Network for Detecting Defected Electrical Substation Insulators Using a Drone," Journal of Institute of Control, Robotics and Systems, vol. 26, no. 11, pp. 884-890, Nov., 2020, DOI: 10.5302/j.icros.2020.20.0117.
  6. A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv preprint arXiv:1704.04861, 2017, [Online], https://arxiv.org/abs/1704.04861.
  7. T. Uetsuki, Y. Okuyama, and J. Shin, "CNN-based End-to-end Autonomous Driving on FPGA Using TVM and VTA," 2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), Singapore, 2021, DOI: 10.1109/MCSoC51149.2021.00028.
  8. S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding," arXiv preprint arXiv:1510.00149, 2015, [Online], https://arxiv.org/abs/1510.00149.
  9. Y. LeCun, J. S. Denker, and S. A. Solla, "Optimal Brain Damage," Advances in Neural Information Processing Systems, 1989, [Online], https://papers.nips.cc/paper/1989/hash/6c9882bbac1c7093bd25041881277658-Abstract.html.
  10. B. Hassibi and D. Stork, "Second Order Derivatives for Network Pruning: Optimal Brain Surgeon," Advances in Neural Information Processing Systems, 1992, [Online], https://proceedings.neurips.cc/paper/1992/hash/303ed4c69846ab36c2904d3ba8573050-Abstract.html.
  11. H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius, "Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation," arXiv preprint arXiv:2004.09602, 2020, [Online], https://arxiv.org/abs/2004.09602.
  12. G. E. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network," arXiv preprint arXiv:1503.02531, 2015, [Online], https://arxiv.org/abs/1503.02531.
  13. M. Zhu and S. Gupta, "To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression," International Conference on Learning Representations Workshop, 2018, [Online], https://arxiv.org/abs/1710.01878.
  14. B. Heo, M. Lee, S. Yun, and J. Choi, "Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons," AAAI Conference on Artificial Intelligence, 2019, DOI: 10.1609/aaai.v33i01.33013779.
  15. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based Learning Applied to Document Recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov., 1998, DOI: 10.1109/5.726791.
  16. K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," International Conference on Learning Representations, 2015, [Online], https://arxiv.org/abs/1409.1556.
  17. G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, DOI: 10.1109/CVPR.2017.243.
  18. S. Han, J. Pool, J. Tran, and W. J. Dally, "Learning Both Weights and Connections for Efficient Neural Networks," arXiv preprint arXiv:1506.02626, 2015, [Online], https://arxiv.org/abs/1506.02626.
  19. A. See, M.-T. Luong, and C. D. Manning, "Compression of Neural Machine Translation via Pruning," The 20th SIGNLL Conference on Computational Natural Language Learning, Aug., 2016, DOI: 10.18653/v1/K16-1029.
  20. S. Narang, E. Elsen, G. Diamos, and S. Sengupta, "Exploring Sparsity in Recurrent Neural Networks," arXiv preprint arXiv:1704.05119, 2017, [Online], https://arxiv.org/abs/1704.05119.
  21. A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, "FitNets: Hints for Thin Deep Nets," arXiv preprint arXiv:1412.6550, 2015, [Online], https://arxiv.org/abs/1412.6550.
  22. S. Lim, I. Kim, T. Kim, C. Kim, and S. Kim, "Fast AutoAugment," Advances in Neural Information Processing Systems, vol. 32, 2019, [Online], https://papers.nips.cc/paper/2019/hash/6add07cf50424b14fdf649da87843d01-Abstract.html.