http://dx.doi.org/10.4218/etrij.2018-0065

Deep compression of convolutional neural networks with low-rank approximation  

Astrid, Marcella (Department of Computer Software, University of Science and Technology)
Lee, Seung-Ik (Department of Computer Software, University of Science and Technology)
Publication Information
ETRI Journal / v.40, no.4, 2018, pp. 421-434
Abstract
The application of deep neural networks (DNNs) to connect the world with cyber physical systems (CPSs) has attracted much attention. However, DNNs require large amounts of memory and computation, which hinders their use in the relatively low-end smart devices that are widely deployed in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated on such low-end smart devices. To do this, we develop a method that reduces the memory requirement of DNNs and increases their inference speed, while keeping the performance (for example, accuracy) close to the original level. The parameters of a DNN are decomposed using a hybrid of canonical polyadic and singular value decomposition, approximated using a tensor power method, and fine-tuned with iterative one-shot hybrid fine-tuning to recover the accuracy lost during decomposition. We evaluate our method on commonly used networks and present results from extensive experiments on the effects of several fine-tuning methods, the importance of iterative fine-tuning, and the choice of decomposition technique. We demonstrate the effectiveness of the proposed method by deploying the compressed networks on smartphones.
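To make the low-rank idea in the abstract concrete, the sketch below shows the kind of layer replacement involved: a single convolutional layer factored into a stack of four lighter convolutions, in the CP-style scheme of Lebedev et al. (reference 24 below) that this work builds on. This is only a minimal PyTorch illustration; the layer sizes and the rank are hypothetical, the factor weights here are untrained placeholders rather than factors obtained from the paper's CP-SVD/tensor-power-method decomposition, and the iterative one-shot hybrid fine-tuning step is omitted entirely.

```python
# Minimal sketch (assumptions: PyTorch, a VGG-style 3x3 layer, a hypothetical rank of 64).
# Illustrates a CP-style four-convolution factorization of one Conv2d; NOT the paper's
# full CP-SVD + tensor power method pipeline, and no fine-tuning is performed.
import torch
import torch.nn as nn

def cp_style_replacement(conv: nn.Conv2d, rank: int) -> nn.Sequential:
    """Return a rank-`rank` CP-style stand-in for `conv`:
    1x1 channel reduction -> Kx1 and 1xK separable spatial filters -> 1x1 restoration."""
    kh, kw = conv.kernel_size
    return nn.Sequential(
        nn.Conv2d(conv.in_channels, rank, kernel_size=1, bias=False),
        nn.Conv2d(rank, rank, kernel_size=(kh, 1), groups=rank,
                  stride=(conv.stride[0], 1), padding=(conv.padding[0], 0), bias=False),
        nn.Conv2d(rank, rank, kernel_size=(1, kw), groups=rank,
                  stride=(1, conv.stride[1]), padding=(0, conv.padding[1]), bias=False),
        nn.Conv2d(rank, conv.out_channels, kernel_size=1,
                  bias=conv.bias is not None),
    )

original = nn.Conv2d(256, 256, kernel_size=3, padding=1)   # hypothetical layer
compressed = cp_style_replacement(original, rank=64)        # hypothetical rank

x = torch.randn(1, 256, 32, 32)
assert original(x).shape == compressed(x).shape             # same input/output interface

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(original), "->", count(compressed))             # roughly 590k -> 33k parameters
```

In the paper itself, the factors of such a stack are obtained by decomposing the trained kernel (with the tensor power method) rather than trained from scratch, and the lost accuracy is then recovered with iterative one-shot hybrid fine-tuning; the sketch only shows why the replacement shrinks the parameter count and speeds up inference.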
Keywords
convolutional neural network; CP-decomposition; cyber physical system; model compression; singular value decomposition; tensor power method;
References
1 L. Monostori et al., Cyber-physical systems in manufacturing, CIRP Ann.-Manuf. Technol. 65 (2016), no. 2, 621-641.
2 L. Zhang, Multi-view approach to specify and model aerospace cyber-physical systems, Int. Conf. Autom. Comput. (ICAC), London, UK, Sept. 13-14, 2013, pp. 1-6.
3 S. A. Haque, S. M. Aziz, and M. Rahman, Review of cyber-physical system in healthcare, Int. J. Distrib. Sens. Netw. 10 (2014), no. 4, 217-415.
4 K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv preprint arXiv: 1409.1556.
5 C. Szegedy et al., Going deeper with convolutions, IEEE Conf. Comput. Vision Pattern Recog., Boston, MA, USA, June 7-12, 2015, pp. 1-9.
6 A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Adv. Neural Inform. Process. Syst., Lake Tahoe, NV, USA, Dec. 3-8, 2012, pp. 1097-1105.
7 O. Russakovsky et al., ImageNet large scale visual recognition challenge, Int. J. Comput. Vision 115 (2015), no. 3, 211-252.
8 M. Astrid and S.-I. Lee, CP-decomposition with tensor power method for convolutional neural networks compression, IEEE Int. Conf. Big Data Smart Comput., Jeju Island, Rep. of Korea, Feb. 13-16, 2017, pp. 115-118.
9 P. Wang and J. Cheng, Accelerating convolutional neural networks for mobile applications, Proc. ACM Multimed. Conf., Amsterdam, Netherlands, Oct. 15-19, 2016, pp. 541-545.
10 sh1r0, Caffe-android-lib, 2014. https://github.com/sh1r0/caffe-android-lib.
11 V. De Silva and L.-H. Lim, Tensor rank and the ill-posedness of the best low-rank approximation problem, SIAM J. Matrix Anal. Applicat. 30 (2008), no. 3, 1084-1127.
12 X. Zhang et al., Accelerating very deep convolutional networks for classification and detection, IEEE Trans. Pattern Anal. Mach. Intell. 38 (2016), no. 10, 1943-1955.
13 Y.-D. Kim et al., Compression of deep convolutional neural networks for fast and low power mobile applications, 2015, arXiv preprint arXiv: 1511.06530.
14 W. Liu et al., SSD: Single shot MultiBox detector, Eur. Conf. Comput. Vision, Amsterdam, Netherlands, Oct. 8-16, 2016, pp. 21-37.
15 C. Chen et al., DeepDriving: Learning affordance for direct perception in autonomous driving, IEEE Int. Conf. Comput. Vision, Santiago, Chile, Dec. 7-13, 2015, pp. 2722-2730.
16 M. Denil et al., Predicting parameters in deep learning, Proc. Int. Conf. Neural Inform. Process. Syst., Lake Tahoe, NV, USA, Dec. 5-10, 2013, pp. 2148-2156.
17 J. Yoon and S. J. Hwang, Combined group and exclusive sparsity for deep neural networks, Int. Conf. Mach. Learning, Sydney, Australia, Aug. 6-11, 2017, pp. 3958-3966.
18 S. Han et al., Learning both weights and connections for efficient neural network, Adv. Neural Inform. Process. Syst., Montreal, Canada, Dec. 7-12, 2015, pp. 1135-1143.
19 M. Rastegari et al., XNOR-net: ImageNet classification using binary convolutional neural networks, Eur. Conf. Comput. Vision, Amsterdam, Netherlands, Oct. 8-16, 2016, pp. 525-542.
20 S. Han, H. Mao, and W. J. Dally, Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding, Int. Conf. Learning Representations, San Juan, Puerto Rico, May 2-4, 2016.
21 W. Chen et al., Compressing convolutional neural networks, 2015, arXiv preprint arXiv: 1506.04449.
22 F. N. Iandola et al., SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 1 MB model size, 2016, arXiv preprint arXiv: 1602.07360.
23 Y. Jia et al., Caffe: Convolutional architecture for fast feature embedding, Proc. ACM Int. Conf. Multimed., Orlando, FL, USA, Nov. 3-7, 2014, pp. 675-678.
24 V. Lebedev et al., Speeding-up convolutional neural networks using fine-tuned CP-decomposition, 2014, arXiv preprint arXiv: 1412.6553.
25 A. Paszke et al., Automatic differentiation in PyTorch, Conf. Neural Inform. Process. Syst., Long Beach, CA, USA, Dec. 2017, pp. 1-4.
26 M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015, available at http://tensorflow.org.
27 J. Redmon and A. Farhadi, YOLO9000: Better, faster, stronger, IEEE Conf. Comput. Vision Pattern Recog., Honolulu, HI, USA, July 21-26, 2017, pp. 6517-6525.
28 G. Golub and W. Kahan, Calculating the singular values and pseudoinverse of a matrix, J. Soc. Ind. Applicat. Math., Series B: Numer. Anal., 2 (1965), no. 2, 205-224.
29 G. Allen, Sparse higher-order principal components analysis, Proc. 15th Int. Conf. Artif. Intell. Statist. (AISTATS), 2012, pp. 27-36.
30 C. J. Hillar and L.-H. Lim, Most tensor problems are NP-hard, J. ACM, 60 (2013), no. 6, 1-39.
31 F. Chollet, Xception: Deep learning with depthwise separable convolutions, IEEE Conf. Comput. Vision Pattern Recog., Honolulu, HI, USA, July 21-26, 2017, pp. 1800-1807.
32 A. G. Howard et al., MobileNets: Efficient convolutional neural networks for mobile vision applications, 2017, arXiv preprint arXiv: 1704.04861.