Deep compression of convolutional neural networks with low-rank approximation

  • Astrid, Marcella (Department of Computer Software, University of Science and Technology)
  • Lee, Seung-Ik (Department of Computer Software, University of Science and Technology)
  • Received : 2018.02.19
  • Accepted : 2018.05.03
  • Published : 2018.08.07

Abstract

The application of deep neural networks (DNNs) to connect the world with cyber-physical systems (CPSs) has attracted much attention. However, DNNs require a large amount of memory and computation, which hinders their use in the relatively low-end smart devices that are widely deployed in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated on such low-end devices. To this end, we develop a method that reduces the memory requirements of DNNs and increases their inference speed, while keeping performance (for example, accuracy) close to the original level. The parameters of a DNN are decomposed using a hybrid of canonical polyadic (CP) decomposition and singular value decomposition (SVD), approximated using a tensor power method, and fine-tuned with iterative one-shot hybrid fine-tuning to recover the accuracy lost during decomposition. We evaluate our method on frequently used networks and present extensive experiments on the effects of several fine-tuning methods, the importance of iterative fine-tuning, and the choice of decomposition technique. Finally, we demonstrate the effectiveness of the proposed method by deploying the compressed networks on smartphones.
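
To make the decomposition step concrete, the following NumPy sketch greedily extracts rank-1 CP components from a 4-way convolution kernel using the tensor power method, deflating the residual between components. This is a minimal illustration under our own assumptions (the function name, the greedy deflation scheme, and the random initialization are ours); it does not reproduce the paper's hybrid CP-SVD procedure or its iterative fine-tuning schedule.

```python
import numpy as np

def cp_power_method(W, rank, n_iter=50, seed=0):
    """Greedy rank-R CP approximation of a 4-way tensor via the tensor
    power method, deflating the residual after each component.
    Illustrative sketch only; not the paper's exact algorithm."""
    rng = np.random.default_rng(seed)
    residual = W.astype(np.float64)  # work on a copy of the kernel
    factors = []
    for _ in range(rank):
        # Random unit-norm initialization, one factor vector per mode.
        a, b, c, d = [rng.standard_normal(n) for n in residual.shape]
        a, b, c, d = [v / np.linalg.norm(v) for v in (a, b, c, d)]
        for _ in range(n_iter):
            # Power updates: contract the other three modes, then normalize.
            a = np.einsum('ijkl,j,k,l->i', residual, b, c, d)
            a /= np.linalg.norm(a)
            b = np.einsum('ijkl,i,k,l->j', residual, a, c, d)
            b /= np.linalg.norm(b)
            c = np.einsum('ijkl,i,j,l->k', residual, a, b, d)
            c /= np.linalg.norm(c)
            d = np.einsum('ijkl,i,j,k->l', residual, a, b, c)
            d /= np.linalg.norm(d)
        # Component weight, analogous to a singular value.
        lam = np.einsum('ijkl,i,j,k,l->', residual, a, b, c, d)
        # Deflate before extracting the next rank-1 component.
        residual = residual - lam * np.einsum('i,j,k,l->ijkl', a, b, c, d)
        factors.append((lam, a, b, c, d))
    return factors

# Toy example: a 3x3 kernel with 16 output and 8 input channels.
kernel = np.random.default_rng(1).standard_normal((16, 8, 3, 3))
components = cp_power_method(kernel, rank=4)
approx = sum(lam * np.einsum('i,j,k,l->ijkl', a, b, c, d)
             for lam, a, b, c, d in components)
print('relative error:', np.linalg.norm(kernel - approx) / np.linalg.norm(kernel))
```

Each recovered component (lam, a, b, c, d) is a rank-1 kernel; in a network, a rank-R kernel can then be realized as a sequence of much smaller convolutions, which is where the memory and speed savings come from.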
