Efficient Thread Allocation Method of Convolutional Neural Network based on GPGPU

Kim, Mincheol; Lee, Kwangyeob

doi:10.14257/ajmahs.2017.10.70

Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology (예술인문사회 융합 멀티미디어 논문지)

Volume 7 Issue 10 / Pages 935-943 / 2017 / 2383-5281 (pISSN) / 2383-7268 (eISSN)

The Convergent Research Society Among Humanities, Sociology, Science and Technology (인문사회과학기술융합학회)

GPGPU 기반 Convolutional Neural Network의 효율적인 스레드 할당 기법

Kim, Mincheol (Dept. of Computer Engineering, Seokyeong Univ.) ;
Lee, Kwangyeob (Dept. of Computer Engineering, Seokyeong Univ.)

김민철 ;
이광엽

Received : 2017.07.31
Accepted : 2017.08.22
Published : 2017.10.31

https://doi.org/10.14257/ajmahs.2017.10.70

Abstract

Among neural networks that learn from large amounts of data, the CNN (convolutional neural network), which is used for tasks such as image classification and speech recognition, continues to evolve as a high-performance architecture. It remains difficult, however, to deploy on embedded systems with limited resources. Using pre-trained weights helps, but limitations remain, so the trend is to use GPGPU (General-Purpose computing on Graphics Processing Units), which applies the GPU to general-purpose computation. Because a CNN performs simple, repetitive operations, its computation speed on a SIMT (Single Instruction Multiple Thread) based GPGPU depends heavily on how threads are allocated and utilized. When threads perform convolution and pooling operations, some of them are left idle. To address this, we assign the remaining threads to the computation of the next feature map and kernel, which increases the operation speed.
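The abstract does not give the exact allocation scheme, so the following is only a rough sketch of the general idea, assuming a simple flat-index mapping. It compares a schedule where each pass of a thread block handles one feature map (threads beyond that map's outputs idle) with a schedule where the leftover threads continue into the next feature map. All function names and the example sizes are hypothetical and not taken from the paper.

```python
from math import ceil


def naive_schedule(num_maps, outputs_per_map, threads):
    """One feature map at a time; threads beyond a map's outputs sit idle."""
    passes_per_map = ceil(outputs_per_map / threads)
    passes = num_maps * passes_per_map
    idle = passes * threads - num_maps * outputs_per_map
    return passes, idle


def packed_schedule(num_maps, outputs_per_map, threads):
    """Leftover threads spill over into the next feature map (assumed scheme)."""
    total = num_maps * outputs_per_map
    passes = ceil(total / threads)
    idle = passes * threads - total
    return passes, idle


def packed_assignment(thread_id, pass_index, num_maps, outputs_per_map, threads):
    """Map a thread to (feature map index, output position), or None if idle."""
    flat = pass_index * threads + thread_id
    if flat >= num_maps * outputs_per_map:
        return None  # only the tail of the last pass is left without work
    return divmod(flat, outputs_per_map)


if __name__ == "__main__":
    # Illustrative sizes only: 6 feature maps of 24x24 outputs, 1024 threads.
    print(naive_schedule(6, 24 * 24, 1024))
    print(packed_schedule(6, 24 * 24, 1024))
    print(packed_assignment(600, 0, 6, 24 * 24, 1024))
```

With these illustrative sizes, the one-map-per-pass schedule needs 6 passes and leaves 2688 thread slots idle, while the packed schedule needs 4 passes and leaves 640 idle. Thread 600 in the first pass, which would have idled, instead computes output 24 of feature map 1. A real GPU kernel would also have to account for memory access patterns and warp divergence, which this count ignores.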
