[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7838/jsebs.2018.23.1.037

A Study on GPGPU Performance Improvement Technique on GCN Architecture Using OpenCL API

Woo, DongHee (Graduate School of Computer Science, Sangmyung University)
Kim, YoonHo (Department of Computer Science, Sangmyung University)

Publication Information

The Journal of Society for e-Business Studies / v.23, no.1, 2018, pp. 37-45

Abstract

The systems on which a wide variety of programs run have steadily expanded from conventional single-core and multi-core systems to many-core and heterogeneous systems. However, existing research has focused mostly on parallelizing programs with the CUDA framework and rarely on optimization for AMD GCN-based GPUs. To address this gap, our study focuses on optimization techniques for the GCN architecture in a GPGPU environment and achieves a performance improvement. Specifically, using the performance techniques we propose, we reduce the computation time of the matrix multiplication and convolution algorithms on the GPGPU by more than 30%. We also increase kernel throughput by more than 40%.

Keywords

OpenCL; Optimization; GP-GPU; GCN Architecture; GPU
