[KSCI] Korea Science Citation Index Service

Fast and Efficient Implementation of Neural Networks using CUDA and OpenMP

Park, An-Jin (Department of Media, Soongsil University)
Jang, Hong-Hoon (School of Media, Soongsil University)
Jung, Kee-Chul (School of Media, Soongsil University)

Publication Information

Journal of KIISE:Software and Applications / v.36, no.4, 2009, pp. 253-260

Abstract

Many algorithms for computer vision and pattern recognition have recently been implemented on the GPU (graphics processing unit) to reduce computation time. However, such implementations have two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs close cooperation between the CPU and the GPU, which is common in image processing and pattern recognition but not in graphics, the CPU should generate as much raw feature data as possible for GPU processing in order to utilize GPU performance effectively. This paper proposes a faster and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which is easy to program because of its simple C-like style, instead of graphics shading languages. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently with a single instruction on the multi-core CPU, which results in effective utilization of GPU memory. In the experiments, we implemented a neural network-based text extraction system using the proposed architecture, and it ran about 15 times faster than an implementation on the GPU alone without OpenMP.
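
The division of labor described in the abstract can be illustrated with a minimal C++ sketch. It is not the authors' implementation: the function names `build_feature_batch` and `forward_layer`, the square pixel window used as the raw feature vector, and the single sigmoid layer are all assumptions made for illustration. The CPU stage uses an OpenMP `parallel for` to fill one contiguous batch of feature vectors, which is the kind of buffer that could then be copied to the GPU in a single transfer. The GPU stage is replaced here by a plain CPU loop so that the example runs without CUDA hardware.

```cpp
// Sketch only: names, window size, and layer shape are illustrative
// assumptions, not details taken from the paper.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// CPU stage: gather one raw feature vector (a win x win pixel window) per
// window position into a single contiguous batch. Rows are independent, so
// OpenMP can split them across cores. Without -fopenmp the pragma is ignored
// and the loop runs serially with the same result.
std::vector<float> build_feature_batch(const std::vector<float>& image,
                                       int width, int height, int win) {
    assert(win > 0 && width >= win && height >= win);
    assert(image.size() == static_cast<std::size_t>(width) * height);
    const int out_w = width - win + 1;
    const int out_h = height - win + 1;
    const int dim = win * win;
    std::vector<float> batch(static_cast<std::size_t>(out_w) * out_h * dim);

    #pragma omp parallel for
    for (int y = 0; y < out_h; ++y) {
        for (int x = 0; x < out_w; ++x) {
            float* dst = &batch[(static_cast<std::size_t>(y) * out_w + x) * dim];
            for (int dy = 0; dy < win; ++dy)
                for (int dx = 0; dx < win; ++dx)
                    dst[dy * win + dx] = image[(y + dy) * width + (x + dx)];
        }
    }
    return batch;
}

// Stand-in for the GPU stage: one fully connected layer with a sigmoid
// activation, applied to every feature vector in the batch. In a CUDA
// version each (sample, unit) pair would map naturally to one thread.
std::vector<float> forward_layer(const std::vector<float>& batch, int dim,
                                 const std::vector<float>& weights,
                                 const std::vector<float>& bias, int units) {
    assert(dim > 0 && units > 0 && batch.size() % dim == 0);
    assert(weights.size() == static_cast<std::size_t>(units) * dim);
    assert(bias.size() == static_cast<std::size_t>(units));
    const std::size_t n = batch.size() / dim;
    std::vector<float> out(n * units);
    for (std::size_t i = 0; i < n; ++i) {
        for (int u = 0; u < units; ++u) {
            float acc = bias[u];
            for (int k = 0; k < dim; ++k)
                acc += weights[static_cast<std::size_t>(u) * dim + k] *
                       batch[i * dim + k];
            out[i * units + u] = 1.0f / (1.0f + std::exp(-acc));
        }
    }
    return out;
}
```

Building the whole batch before the network is evaluated keeps the number of host-to-device copies small, which is one plausible reading of the abstract's remark that the CPU should generate as much raw feature data as possible for the GPU.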

Keywords

CUDA; OpenMP; Neural Network; Text Extraction

Citations & Related Records

Times Cited By KSCI: 1

Reference

1	K.S. Oh and K. Jung, 'GPU Implementation of Neural Network,' Pattern Recognition, Vol.37, Issue 6, pp. 1311-1314, 2004
2	http://www.opengl.org/documentation/glsl/
3	http://graphics.stanford.edu/projects/brookgpu/
4	K. Jung, 'Neural Network-based Text Localization in Color Images,' Pattern Recognition Letters, Vol.22, Issue 4, pp. 1503-1515, 2001
5	J. Mairal, R. Keriven, and A. Chariot, 'Fast and Efficient Dense Variational Stereo on GPU,' Proceedings of International Symposium on 3D Data Processing, Visualization, and Transmission, pp. 697-704, 2006
6	http://developer.nvidia.com/object/cg_toolkit.html/
7	K. Moreland and E. Angel, 'The FFT on a GPU,' Proceedings of SIGGRAPH Conference on Graphics Hardware, pp. 112-119, 2003
8	http://ati.amd.com/developer/
9	I. Geys and L.V. Gool, 'View Synthesis by the Parallel Use of GPU and CPU,' Image and Vision Computing, Vol.25, Issue 7, pp. 1154-1164, 2007
10	http://www.openmp.org/
11	http://www.nvidia.com/object/cuda_home.html/
12	R. Yang and G. Welch, 'Fast Image Segmentation and Smoothing using Commodity Graphics Hardware,' Journal of Graphics Tools, Vol.17, Issue 4, pp. 91-100, 2002

Cited By KSCI

1	CALPUFF Module Acceleration with OpenMP / [Yu, Suk-Hyun; Yang, Jin-Uk; Kim, Kyung-Ho; Yun, Hui-Young; Koo, Youn-Seo; Kwon, Hee-Yong] / Journal of KIISE:Computing Practices and Letters
2	Acceleration of Feature-Based Image Morphing Using GPU / [Kim, Eun-Ji; Yoon, Seung-Hyun; Lee, Jieun] / Journal of the Korea Computer Graphics Society
3	Protein Thermal Stability Analysis Acceleration System based on Parallel Computing / [Kim, Dae-Hee; Kim, Hyo Jung; Kim, Minho; Lim, Myung Eun; Lee, Dae-Hee; Choi, Jaehoon; Jung, Hoyoul] / Journal of KIISE:Software and Applications
