http://dx.doi.org/10.6109/jkiice.2018.22.1.1

Efficient Fixed-Point Representation for ResNet-50 Convolutional Neural Network  

Kang, Hyeong-Ju (School of Computer Science and Engineering, Korea University of Technology and Education)
Abstract
Recently, convolutional neural networks have shown high performance in many computer vision tasks. However, they require an enormous amount of computation, which makes them difficult to adopt in embedded environments. To solve this problem, many studies have been performed on ASIC or FPGA implementations, where an efficient number representation is required. The fixed-point representation is adequate for ASIC or FPGA implementation but causes performance degradation. This paper proposes optimizing the representations of the convolutional layers and the batch normalization layers separately. With the proposed method, the bit width required for the convolutional layers of the ResNet-50 neural network is reduced from 16 bits to 10 bits. Since the convolutional layers account for most of the entire computation, reducing their bit width enables an efficient implementation of convolutional neural networks.
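As a rough illustration of the idea in the abstract, the sketch below quantizes a tensor to a signed fixed-point format with a given total bit width and fractional length, and searches for the fractional length per tensor, so a convolutional weight tensor and a batch normalization parameter vector can each receive their own representation. This is a minimal sketch, not the paper's actual procedure: the function names, the squared-error search, and the example tensors are all illustrative assumptions.

```python
import numpy as np

def to_fixed_point(x, bit_width, frac_bits):
    """Quantize x to signed fixed point: `bit_width` total bits,
    `frac_bits` of them fractional (two's-complement range)."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (bit_width - 1))
    qmax = 2 ** (bit_width - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale  # the real value actually represented

def best_frac_bits(x, bit_width):
    """Pick the fractional length that minimizes the mean squared
    quantization error for this tensor (chosen per layer).
    Illustrative criterion; the paper may optimize differently."""
    errors = {f: np.mean((x - to_fixed_point(x, bit_width, f)) ** 2)
              for f in range(bit_width)}
    return min(errors, key=errors.get)

# Hypothetical example: a 10-bit format chosen separately for a
# convolution weight tensor and a batch-norm scale vector, whose
# value ranges differ and therefore favor different fractional lengths.
rng = np.random.default_rng(0)
conv_w = rng.normal(0.0, 0.05, size=(64, 3, 7, 7))  # small-magnitude weights
bn_scale = rng.normal(1.0, 0.3, size=64)            # values near 1.0

for name, t in [("conv", conv_w), ("bn", bn_scale)]:
    f = best_frac_bits(t, bit_width=10)
    print(name, "fractional bits:", f)
```

Because the weight and batch-normalization tensors occupy different value ranges, the search typically assigns them different fractional lengths, which is the intuition behind optimizing the two layer types separately.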
Keywords
Convolutional neural network; Neural network accelerator; ASIC implementation; Fixed-point representation