CNN Accelerator Architecture using 3D-stacked RRAM Array

  • Won Joo Lee (Dept. of Electrical and Computer Engineering, University of Seoul) ;
  • Yoon Kim (Dept. of Electrical and Computer Engineering, University of Seoul) ;
  • Minsuk Koo (Dept. of Computer Science and Engineering, Incheon National University)
  • 이원주 ;
  • 김윤 ;
  • 구민석
  • Received : 2024.06.25
  • Accepted : 2024.06.28
  • Published : 2024.06.30

Abstract

This paper studies the integration of 3D-stacked dual-tip RRAM into a CNN accelerator architecture, leveraging the device's low drive current and its scalability in a 3D-stacked configuration. Dual-tip cells are connected in parallel within the stack to form a synaptic array with multi-level capability. The array is placed in a network-on-chip-style accelerator together with hardware blocks such as DACs, ADCs, buffers, registers, and shift-and-add circuits, and the resulting CNN accelerator is simulated. Synaptic weights and activation functions are assumed to be quantized to 16 bits. In simulations of CNN operations running through a parallel pipeline on this architecture, the accelerator achieves an operational efficiency of approximately 370 GOPs/W, with quantization-induced accuracy degradation kept within 3%.
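To illustrate why a crossbar accelerator of this kind needs shift-and-add circuits alongside its DACs and ADCs, the following is a minimal sketch of a bit-sliced multiply-accumulate. It is not the authors' implementation. The 16-bit operand width comes from the abstract; the 2-bit multi-level cell (`CELL_BITS`), the 1-bit serial input (`DAC_BITS`), the unsigned operands, and all function names are assumptions made only for illustration.

```python
import numpy as np

WEIGHT_BITS = 16  # weight quantization stated in the abstract
INPUT_BITS = 16   # input quantization, assumed equal to the stated 16 bits
CELL_BITS = 2     # assumption: bits stored per multi-level cell
DAC_BITS = 1      # assumption: inputs are applied one bit per cycle


def slice_unsigned(values, total_bits, slice_bits):
    """Split unsigned integers into slices of slice_bits, least significant first."""
    values = np.asarray(values, dtype=np.int64)
    mask = (1 << slice_bits) - 1
    return [(values >> shift) & mask for shift in range(0, total_bits, slice_bits)]


def crossbar_mac(inputs, weights):
    """Dot product of inputs (n,) with weights (n, m) built from low-precision slices.

    Each x @ w below stands in for one analog column read digitized by an ADC;
    the shifts and the accumulation stand in for the shift-and-add circuit.
    """
    weights = np.asarray(weights, dtype=np.int64)
    input_slices = slice_unsigned(inputs, INPUT_BITS, DAC_BITS)
    weight_slices = slice_unsigned(weights, WEIGHT_BITS, CELL_BITS)
    acc = np.zeros(weights.shape[1], dtype=np.int64)
    for i, x in enumerate(input_slices):
        for j, w in enumerate(weight_slices):
            partial = x @ w  # low-precision partial sum for one slice pair
            acc += partial << (i * DAC_BITS + j * CELL_BITS)
    return acc


rng = np.random.default_rng(0)
inputs = rng.integers(0, 2**INPUT_BITS, size=8)
weights = rng.integers(0, 2**WEIGHT_BITS, size=(8, 4))
result = crossbar_mac(inputs, weights)
```

Under these assumptions, `result` equals the full-precision integer product `inputs @ weights`, because every slice pair is weighted by the power of two that matches its bit position before accumulation.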

Acknowledgement

This work was supported by an Incheon National University Research Grant in 2024.
