http://dx.doi.org/10.22710/JICT.2019.9.1.001

Parallel Algorithm for Matrix-Matrix Multiplication on the GPU  

Park, Sangkun (Department of Mechanical Engineering, Korea National University of Transportation)
Publication Information
Journal of Institute of Convergence Technology / v.9, no.1, 2019, pp. 1-6
Abstract
Matrix multiplication is a fundamental mathematical operation with applications across most scientific fields. In this paper, we present a parallel GPU algorithm for dense matrix-matrix multiplication using an OpenGL compute shader, which can serve as a fundamental building block for many high-performance computing applications. Experimental results on an NVIDIA Quadro 4000 show that the proposed algorithm runs about 208 times faster than a previous CPU algorithm and achieves 75 GFLOPS in single precision for dense matrices with matrix size 4,096. This performance indicates that the algorithm is practical for real applications.
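To put the reported figures in context, the following minimal sketch works through the arithmetic they imply. It rests on assumptions not stated in the abstract: the conventional count of 2n^3 floating-point operations for a dense n x n product, and that the 208x speedup and the 75 GFLOPS figure refer to the same run at n = 4,096. The function and variable names are illustrative only.

```python
def matmul_flops(n: int) -> int:
    """Conventional FLOP count for a dense n x n times n x n product:
    n^3 multiplications plus (approximately) n^3 additions."""
    return 2 * n ** 3


def implied_seconds(n: int, gflops: float) -> float:
    """Run time implied by a sustained rate given in GFLOPS (1e9 FLOP/s)."""
    return matmul_flops(n) / (gflops * 1e9)


# Figures taken from the abstract.
n = 4096
gpu_gflops = 75.0
speedup = 208.0

# Derived values (assumption: both figures describe the same n = 4096 run).
gpu_time = implied_seconds(n, gpu_gflops)
cpu_time = gpu_time * speedup
cpu_gflops = gpu_gflops / speedup

print(f"total work:       {matmul_flops(n):,} FLOP")
print(f"implied GPU time: {gpu_time:.2f} s")
print(f"implied CPU time: {cpu_time:.1f} s")
print(f"implied CPU rate: {cpu_gflops:.3f} GFLOPS")
```

Under these assumptions the product involves about 137 billion floating-point operations, so the reported rate would correspond to a GPU run time of a little under two seconds and a CPU baseline of a few hundred seconds. These are derived estimates, not measurements from the paper.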
Keywords
GPU algorithm; Matrix-matrix multiplication; OpenGL compute shader
References
1 http://www.netlib.org/blas.
2 https://software.intel.com/en-us/mkl.
3 http://developer.amd.com/tools-and-sdks/archive/acml-product-features/.
4 https://developer.nvidia.com/cublas.
5 https://docs.nvidia.com/cuda/nvblas/index.html.
6 G. Sellers, R. S. Wright, and N. Haemel, OpenGL SuperBible (7th ed.), Addison-Wesley, 2015.