http://dx.doi.org/10.17662/ksdim.2022.18.2.017

Power-Efficient DCNN Accelerator Mapping Convolutional Operation with 1-D PE Array  

Lee, Jeonghyeok (Department of Convergence Electronics Engineering, Hanyang University)
Han, Sangwook (Department of Electronics, Computer and Communication Engineering, Hanyang University)
Choi, Seungwon (Department of Electronics, Computer and Communication Engineering, Hanyang University)
Publication Information
Journal of Korea Society of Digital Industry and Information Management, Vol.18, No.2, 2022, pp.17-26
Abstract
In this paper, we propose a novel method of performing convolutional operations on a 2-D Processing Element (PE) array. The conventional method [1] of mapping the convolutional operation onto a 2-D PE array lacks flexibility and yields low PE utilization. By instead mapping the convolutional operation of a 2-D PE array onto a 1-D PE array, the proposed method increases both the number and the utilization of active PEs. Consequently, the throughput of the proposed Deep Convolutional Neural Network (DCNN) accelerator can be increased significantly. Furthermore, the power consumed transmitting weights between PEs can be reduced. Based on the simulation results, the proposed method provides approximately 4.55%, 13.7%, and 2.27% throughput gains for the convolutional layers of AlexNet, VGG16, and ResNet50, respectively, using a DCNN accelerator with a (weights size) x (output data size) 2-D PE array, compared to the conventional method. Additionally, the proposed method provides approximately 63.21%, 52.46%, and 39.23% power savings.
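To make the idea of a 1-D PE array concrete, the following is a minimal illustrative sketch (not the paper's exact dataflow or the authors' implementation): a weight-stationary row of PEs computing a 1-D convolution. Each PE holds one filter weight in place, input activations stream past it, and partial sums accumulate across the PEs. The function name and structure here are assumptions for illustration only.

```python
def pe_array_conv1d(x, w):
    """Valid-mode 1-D convolution on a 1-D array of len(w) PEs.

    Sketch of a weight-stationary dataflow: PE k permanently holds
    weight w[k], so weights are never re-transmitted between PEs.
    """
    K = len(w)                      # one PE per weight
    out_len = len(x) - K + 1
    psum = [0] * out_len            # partial sums accumulated across PEs
    for k, weight in enumerate(w):  # PE k holds weight w[k]
        for i in range(out_len):    # activations streaming through PE k
            psum[i] += weight * x[i + k]
    return psum

print(pe_array_conv1d([1, 2, 3, 4, 5], [1, 0, -1]))  # [-2, -2, -2]
```

Because every weight stays resident in its PE, the only data moving through the array are activations and partial sums, which is the intuition behind the power savings from reduced weight transmission that the abstract reports.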
Keywords
FPGA; Deep Convolutional Neural Network; Accelerator; Processing Element; Data Reuse;
References
1 Tripathi, Milan, "Analysis of convolutional neural network based image classification techniques," Journal of Innovative Image Processing (JIIP), Vol.3, No.2, 2021, pp.100-117.
2 Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "ImageNet classification with deep convolutional neural networks," In Advances in Neural Information Processing Systems, 2012, pp.1097-1105.
3 Chen, Y.-H., Emer, J., and Sze, V., "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks," ACM SIGARCH Computer Architecture News, Vol.44, No.3, 2016, pp.367-379.
4 A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. Emer, S. W. Keckler, and W. J. Dally, SCNN: An accelerator for compressed-sparse convolutional neural networks, in International Symposium on Computer Architecture (ISCA), 2017.
5 Nvidia, NVDLA Open Source Project, 2017. http://nvdla.org/
6 N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers et al., In-datacenter performance analysis of a tensor processing unit, in International Symposium on Computer Architecture (ISCA), 2017.
7 He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition," In Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp.770-778.
8 Kim, Jinho, "Character- and Word-Level English License Plate Recognition Using a Deep Learning Neural Network," Journal of Korea Society of Digital Industry and Information Management, Vol.16, No.4, 2020, pp.19-28.
9 Kim, Jinho, Ahn, Heungseop, and Choi, Seungwon, "CNN-Based IEEE 802.11 WLAN Frame Format Detection," Journal of Korea Society of Digital Industry and Information Management, Vol.16, No.2, 2020, pp.27-33.
10 Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
11 T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning, in Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2014.
12 V. Sze, Y.-H. Chen, T.-J. Yang, and J. S. Emer, "Efficient Processing of Deep Neural Networks," Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2020, pp.44-46.
13 Y. Chen, T. Luo, S. Liu, S. Zhang, L. He, J. Wang, L. Li, T. Chen, Z. Xu, N. Sun, and O. Temam, DaDianNao: A machine-learning supercomputer, in International Symposium on Microarchitecture (MICRO), 2014.