
Fast and Efficient Implementation of Neural Networks using CUDA and OpenMP  

Park, An-Jin (Department of Media, Soongsil University)
Jang, Hong-Hoon (School of Media, Soongsil University)
Jung, Kee-Chul (School of Media, Soongsil University)
Abstract
Many computer vision and pattern recognition algorithms have recently been implemented on the GPU (graphics processing unit) to reduce computation time. However, such implementations have two problems. First, the programmer must master the fundamentals of graphics shading languages, which requires prior knowledge of computer graphics. Second, in a job that needs close cooperation between the CPU and the GPU, which is common in image processing and pattern recognition but not in graphics, the CPU must generate as much raw feature data as possible for the GPU to process in order to use the GPU's performance effectively. This paper proposes a faster and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which is easy to program because of its simple C-like style, instead of graphics shading languages. To solve the second, we use OpenMP (Open Multi-Processing) to process multiple data concurrently with a single instruction on the multi-core CPU, which makes effective use of GPU memory. In the experiments, we implemented a neural network-based text extraction system using the proposed architecture, and it ran about 15 times faster than an implementation on the GPU alone, without OpenMP.
Keywords
CUDA; OpenMP; Neural Network; Text Extraction;
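
The division of work described in the abstract can be illustrated with a minimal sketch in C. This is not the authors' code: the function names (`build_batch`, `layer_forward`, `squash`), the sliding-window layout, and the activation function are all assumptions made for illustration. The first function shows the CPU stage, where an OpenMP `parallel for` lets several cores gather feature vectors into one batch. The second is a plain CPU stand-in for the per-layer computation that the proposed architecture would run as a CUDA kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Squashing activation with outputs in (0, 1). A rational "fast sigmoid" is
   used here only to keep the sketch free of libm; the activation used in the
   paper is not stated in the abstract, so this is an assumption. */
static float squash(float x)
{
    float a = x < 0.0f ? -x : x;
    return 0.5f * (x / (1.0f + a)) + 0.5f;
}

/* CPU stage (hypothetical layout): gather one feature vector per window into
   a dense row-major batch of size n_windows x n_in. Window w starts at
   image[w * step]. Rows are independent, so OpenMP can split the outer loop
   across cores. Without -fopenmp the pragma is ignored and the loop runs
   serially with the same result. */
void build_batch(const float *image, size_t n_windows, size_t n_in,
                 size_t step, float *batch)
{
    assert(image != NULL && batch != NULL);
    #pragma omp parallel for
    for (long w = 0; w < (long)n_windows; ++w) {
        for (size_t i = 0; i < n_in; ++i) {
            batch[(size_t)w * n_in + i] = image[(size_t)w * step + i];
        }
    }
}

/* Layer stage: out = squash(batch * weights^T + bias), with weights stored
   row-major as n_out x n_in. This is a CPU stand-in for the work that would
   be offloaded to the GPU; each (window, output node) pair is independent,
   which is what makes the computation suitable for a GPU kernel. */
void layer_forward(const float *batch, size_t n_windows, size_t n_in,
                   const float *weights, const float *bias, size_t n_out,
                   float *out)
{
    for (size_t w = 0; w < n_windows; ++w) {
        for (size_t o = 0; o < n_out; ++o) {
            float sum = bias[o];
            for (size_t i = 0; i < n_in; ++i) {
                sum += weights[o * n_in + i] * batch[w * n_in + i];
            }
            out[w * n_out + o] = squash(sum);
        }
    }
}
```

Calling `build_batch` once per frame and then `layer_forward` on the resulting batch mirrors the idea of handing the GPU as much feature data as possible in a single transfer, rather than one window at a time. Building with an OpenMP-enabled compiler flag (for example `-fopenmp` in GCC) enables the parallel loop.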