Parameter-Efficient Neural Networks Using Template Reuse

  • Daeyeon Kim (Dept. of Embedded Systems Engineering, Incheon National University)
  • Woochul Kang (Dept. of Embedded Systems Engineering, Incheon National University)
  • Received : 2020.02.04
  • Accepted : 2020.03.30
  • Published : 2020.05.31

Abstract

Recently, deep neural networks (DNNs) have brought revolutions to many mobile and embedded devices by providing human-level machine intelligence for various applications. However, the high inference accuracy of such DNNs comes at a high computational cost, and hence there have been significant efforts to reduce the computational overhead of DNNs, either by compressing off-the-shelf models or by designing new small-footprint DNN architectures tailored to resource-constrained devices. One notable recent paradigm in designing small-footprint DNN models is sharing parameters across layers. In previous approaches, however, parameter-sharing techniques have been applied to large deep networks, such as ResNet, that are known to have high redundancy. In this paper, we propose a parameter-sharing method for already parameter-efficient small networks such as ShuffleNetV2. In our approach, small templates are combined with small layer-specific parameters to generate weights. Our experimental results on the ImageNet and CIFAR-100 datasets show that our approach can reduce the parameter size of ShuffleNetV2 by 15%-35% while incurring smaller accuracy drops than previous parameter-sharing and pruning approaches. We further show that the proposed approach is efficient in terms of latency and energy consumption on modern embedded devices.
