http://dx.doi.org/10.7471/ikeee.2020.24.1.225

Design and Implementation of Accelerator Architecture for Binary Weight Network on FPGA with Limited Resources  

Kim, Jong-Hyun (Telechips Inc.)
Yun, SangKyun (Department of Computer and Telecomm. Engineering, Yonsei University)
Publication Information
Journal of IKEEE, v.24, no.1, 2020, pp. 225-231
Abstract
In this paper, we propose a method to accelerate a binary weight network (BWN) on an FPGA with limited resources for embedded systems. Because the number of available logic elements is limited, a single computing unit capable of handling convolutional and fully connected layers of various sizes must be designed and reused. Also, if the input feature map cannot be processed in parallel at one time, the output must be computed by reading the inputs in several passes. Since the number of available BRAM modules is limited, the number of data bits used in the BWN accelerator must be minimized. The image classification time of the BWN accelerator is far superior to that of an embedded CPU, faster than a desktop PC, and about 50% slower than a GPU system. Since the BWN accelerator runs at a slow 50 MHz clock, it offers an advantage in performance per watt.
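The following is a minimal software sketch (not the authors' RTL; the function name and tile size are illustrative assumptions) of two ideas from the abstract: with binary weights each multiply reduces to an add or subtract, and when the input channels exceed what the compute unit can handle in parallel, the inputs are read in several passes and partial sums are accumulated.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed parallelism of the compute unit: number of input channels
 * it can process in one pass. Larger inputs require several passes. */
#define TILE 16

/* BWN accumulation over one output position: weights are stored as
 * 1-bit signs (0 -> +1, 1 -> -1), so each "multiply" is an add or a
 * subtract. Partial sums from each tile pass accumulate into acc. */
int32_t bwn_dot(const int16_t *in, const uint8_t *w_sign, size_t channels)
{
    int32_t acc = 0;
    for (size_t base = 0; base < channels; base += TILE) {      /* one pass per tile */
        size_t end = (base + TILE < channels) ? base + TILE : channels;
        for (size_t c = base; c < end; ++c)
            acc += w_sign[c] ? -in[c] : in[c];                   /* +1 / -1 weight */
    }
    return acc;  /* scaling by the layer's alpha factor would follow */
}
```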
Keywords
Binary Weight Network; Low precision network; FPGA acceleration; SoC-FPGA; FPGA;