[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.9708/jksci.2021.26.01.001

Parallel Algorithm of Conjugate Gradient Solver using OpenGL Compute Shader

Va, Hongly (School of Software Convergence, Soonchunhyang University)
Lee, Do-keyong (School of Software Convergence, Soonchunhyang University)
Hong, Min (Dept. of Software Convergence, Soonchunhyang University)

Publication Information

Journal of the Korea Society of Computer and Information / v.26, no.1, 2021 , pp. 1-9 More about this Journal

Abstract

OpenGL compute shader is a shader stage that operate differently from other shader stage and it can be used for the calculating purpose of any data in parallel. This paper proposes a GPU-based parallel algorithm for computing sparse linear systems through conjugate gradient using an iterative method, which perform calculation on OpenGL compute shader. Basically, this sparse linear solver is used to solve large linear systems such as symmetric positive definite matrix. Four well-known matrix formats (Dense, COO, ELL and CSR) have been used for matrix storage. The performance comparison from our experimental tests using eight sparse matrices shows that GPU-based linear solving system much faster than CPU-based linear solving system with the best average computing time 0.64ms in GPU-based and 15.37ms in CPU-based.

Keywords

Linear Solving; Conjugate Gradient; Sparse Matrix; Parallel GPU; OpenGL Compute Shader;

Citations & Related Records

Reference

1	J. Sim, et al, "A Performance Analysis Framework for Identifying Potential Benefits in GPGPU Applications," Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Vol. 47, No. 8, pp. 11-22, 2012. DOI: https://doi.org/10.1145/2145816.2145819 DOI
2	S. Abbasbandy, A. Jafarian, and R. Ezzati, "Conjugate gradient method for fuzzy symmetric positive definite system of linear equations," Applied Mathematics and Computation, Vol. 171, No. 2, pp. 1184-1191, 2005. DOI: https://doi.org/10.1016/j.amc.2005.01.110 DOI
3	M. Kryshchuk, and J. Lavendels, "Iterative Method for Solving a System of Linear Equations," Procedia Computer Science, Vol. 104, pp. 133-137, 2017. DOI: https://doi.org/10.1016/j.procs.2017.01.085 DOI
4	D. Gerzhoy, X. Sun, M. Zuzak, and D. Yeung, "Nested MIMD-SIMD Parallelization for Heterogeneous Microprocessors," ACM Trans. Archit, Vol. 16, No. 4, 2019. DOI: https://doi.org/10.1145/3368304 DOI
5	W. Shin, K. H. Yoo and N. Baek, "Large-Scale Data Computing Performance Comparisons on SYCL Heterogeneous Parallel Processing Layer Implementations," In Appl. Sci., Vol. 10, No. 5. pp. 1656, 2020. DOI: https://doi.org/10.3390/app10051656 DOI
6	J. S. Kirtzic, "A parallel algorithm design model for the gpu architecture," Ph.D. Dissertation. University of Texas at Dallas, USA. Advisor(s) Ovidiu Daescu., 2012.
7	M. Ament et al, "A Parallel Preconditioned Conjugate Gradient Solver for the Poisson Problem on a Multi-GPU Platform," 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, Pisa, pp. 583-592, 2010. DOI: https://doi.org/10.1109/PDP.2010.51 DOI
8	R. Helfenstein, and J. Koko, "Parallel preconditioned conjugate gradient algorithm on GPU," Journal of Computational and Applied Mathematics, Vol. 236, No. 15. pp. 3584-2590, 2012. DOI: https://doi.org/10.1016/j.cam.2011.04.025 DOI
9	W. Cao, L. Yao, Zo. Li, Y. Wang and Z. Wang, "Implementing Sparse Matrix-Vector multiplication using CUDA based on a hybrid sparse matrix format," 2010 International Conference on Computer Application and System Modeling, pp. V11-161-V11-165, 2010. DOI: https://doi.org/10.1109/ICCASM.2010.5623237 DOI
10	N. Bell and M. Garland, "Efficient Sparse Matrix-Vector Multiplication on CUDA," NVIDIA Technical Report NVR-2008-004, December 2008.
11	D. Shreiner, G. Sellers, J. Kessenich and B. Licea-Kane , "OpenGL programming guide: The Official guide to learning OpenGL," version 4.3. Addison-Wesley, 2013.
12	A. C. Ahamed, and F. Magoules, "Conjugate gradient method with graphics processing unit acceleration: CUDA vs OpenCL," Advances in Engineering Software, Vol. 111, pp. 32-42, 2017. DOI: https://doi.org/10.1016/j.advengsoft.2016.10.002 DOI
13	M. M. Baskaran, and R. Bordawekar, "Optimizing Sparse Matrix-Vector Multiplications on GPUs," Computer Science. Vol. 8, pp. 812-47, 2009.
14	A. Benatia, W. Ji, Y. Wang and F. Shi, "Sparse Matrix Format Selection with Multiclass SVM for SpMV on GPU," 2016 45th International Conference on Parallel Processing (ICPP), Philadelphia, pp. 496-505, 2016. DOI: https://doi.org/10.1109/ICPP.2016.64 DOI
15	G. E. Blelloch, et al, "Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors," Technical Report. Carnegie Mellon University, USA. 1993.
16	M. R. Hestenes, and E. Stiefel, "Methods of conjugate gradients for solving linear systems," Journal of research of the National Bureau of Standards, Vol. 49, pp. 409-436, 1952. DOI: https://doi.org/10.6028/jres.049.044 DOI
17	SuiteSparse Matrix Collection, https://sparse.tamu.edu/
18	N. Bell and M. Garland, "Implementing sparse matrix-vector multiplication on throughput-oriented processors," Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, Portland, pp. 1-11, 2009. DOI: https://doi.org/10.1145/1654059.1654078 DOI
19	D. Mukunoki, K. Ozaki, T. Ogita, and R. Iakymchuk "Conjugate Gradient Solvers with High Accuracy and Bit-wise Reproducibility between CPU and GPU using Ozaki scheme," In The International Conference on High Performance Computing in Asia-Pacific Region, pp. 100-109, 2021. DOI: https://doi.org/10.1145/3432261.3432270 DOI
20	Pikle, N.K., Sathe, S.R. and Vyavhare, A.Y. "GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review," Sadhana, Vol.43, No. 111 2018. DOI: https://doi.org/10.1007/s12046-018-0892-0 DOI
21	A. Bunse-Gerstner, and R. Stover, "On a conjugate gradient-type method for solving complex symmetric linear systems," Linear Algebra and its Applications, Vol. 287, No. 1-3, pp. 105-123, 1999. DOI: https://doi.org/10.1016/S0024-3795(98)10091-5 DOI
22	A. Cano, "A survey on graphic processing unit computing for large-scale data mining," WIREs Data Mining and Knowledge Discovery, Vol, 8, No. 1, 2018. DOI: https://doi.org/10.1002/widm.1232 DOI
23	N. J. Sung, Y. J. Choi and M. Hong, "Parallel Structure Design Method for Mass Spring Simulation," Journal of the Korea Computer Graphics Society. Vol. 25, pp. 55-63, 2019. DOI: https://doi.org/10.15701/kcgs.2019.25.3.55 DOI