An Implementation of a Convolutional Accelerator based on a GPGPU for a Deep Learning

  • Received: 2016.09.13
  • Reviewed: 2016.09.29
  • Published: 2016.09.30

Abstract

In this paper, we propose a method for accelerating a convolutional neural network (CNN) using a GPGPU. A CNN is a type of neural network that learns the features of images and classifies them, which makes it well suited to image processing tasks that require learning from large amounts of data. The convolution layers of a conventional CNN require a large number of multiplications, which makes real-time operation in an embedded environment difficult. To address this, we reduce the number of multiplications by using Winograd convolution and process the convolution in parallel by exploiting the SIMT architecture of the GPGPU. Experiments were conducted using ModelSim and TestDrive, and the results show that processing time improved by about 17% compared to conventional convolution.
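To make the multiplication saving concrete, the following is a minimal sketch of the one-dimensional Winograd minimal filtering case F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of the 6 needed by the direct form. It is an illustration under stated assumptions, not the accelerator described in this paper: the function names (`direct_f23`, `winograd_f23`, `winograd_conv1d`) are hypothetical, the tile size is assumed rather than taken from the paper, and the tile loop runs sequentially on the CPU rather than on SIMT hardware.

```python
def direct_f23(d, g):
    """Direct form: two outputs of a 3-tap filter over 4 inputs (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using 4 data multiplications."""
    # Filter transform. It depends only on the filter, so it can be
    # computed once and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions only) followed by 4 element-wise products.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


def winograd_conv1d(signal, g):
    """Apply F(2,3) tile by tile over a longer signal ('valid' outputs only).

    Each tile reads 4 inputs and writes 2 outputs independently of the
    others, so in a SIMT setting each tile could be assigned to one thread.
    """
    n_out = len(signal) - 2
    if n_out < 2 or n_out % 2 != 0:
        raise ValueError("this sketch needs an even, positive number of outputs")
    out = []
    for start in range(0, n_out, 2):
        out.extend(winograd_f23(signal[start:start + 4], g))
    return out
```

Because the tiles share no intermediate results, the loop in `winograd_conv1d` has no cross-iteration dependency, which is the property a SIMT mapping would rely on. The division by 2 in the filter transform means results can differ from the direct form by floating-point rounding, so comparisons should use a tolerance.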
