Low-area DNN Core using data reuse technique

  • Jo, Cheol-Won (Dept. of Electronic and Computer Eng., Seokyeong University)
  • Lee, Kwang-Yeob (Dept. of Electronic and Computer Eng., Seokyeong University)
  • Kim, Chi-Yong (Dept. of Software, Seokyeong University)
  • Received: 2021.01.13
  • Accepted: 2021.03.22
  • Published: 2021.03.31

Abstract

An NPU in an embedded environment runs deep learning algorithms with limited hardware resources. Reusing data that has already been fetched allows these algorithms to be computed efficiently with fewer resources. Previous studies implement data reuse with a shifter in the ScratchPad. However, as the ScratchPad's bandwidth increases, the shifter itself consumes a large amount of resources. We therefore present a data reuse technique based on the Buffer Round Robin method. With the Buffer Round Robin method presented in this paper, the chip area was reduced by about 4.7% compared to the conventional method.
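
The contrast between shifter-based reuse and round-robin reuse can be illustrated with a rough software analogy. The abstract does not describe the hardware in detail, so everything below is an assumption made for illustration: the 1-D sliding window, the window size `k`, and the function names `windows_by_shifting` and `windows_by_round_robin` are hypothetical and are not taken from the paper. The sketch only shows the general idea that a shifter moves every retained value on each step, while a round-robin scheme overwrites the oldest buffer slot and rotates a pointer.

```python
from typing import List


def windows_by_shifting(stream: List[int], k: int) -> List[List[int]]:
    """Shifter-style reuse (assumed model): every step moves all retained values."""
    regs = stream[:k]
    out = [list(regs)]
    for value in stream[k:]:
        # All k-1 retained values change position; cost grows with window width.
        regs = regs[1:] + [value]
        out.append(list(regs))
    return out


def windows_by_round_robin(stream: List[int], k: int) -> List[List[int]]:
    """Round-robin reuse (assumed model): one slot is overwritten per step."""
    buffers = stream[:k]
    head = 0  # index of the oldest value currently held
    out = [list(buffers)]
    for value in stream[k:]:
        buffers[head] = value      # single write; no other value moves
        head = (head + 1) % k      # the next-oldest slot becomes the head
        # Read the window in logical order, starting from the rotating head.
        out.append([buffers[(head + i) % k] for i in range(k)])
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(windows_by_shifting(data, 3))
    print(windows_by_round_robin(data, 3))
```

Both functions produce the same sequence of windows. In this analogy the difference lies only in how many stored values have to move per step, which is the kind of cost the abstract attributes to the shifter as bandwidth grows.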
