Matrix Addition & Scalar Multiplication on the GPU

GPU-Based Algorithm for Matrix Addition and Scalar Multiplication

  • Park, Sangkun (박상근) (Department of Mechanical Engineering, Korea National University of Transportation)
  • Received : 2018.07.20
  • Accepted : 2018.10.29
  • Published : 2018.11.30

Abstract

GPUs have recently become programmable enough to perform general-purpose computation quickly by running thousands of threads concurrently. This paper presents a parallel GPU algorithm for dense matrix-matrix addition and scalar multiplication using an OpenGL compute shader. These operations can serve as fundamental building blocks for many high-performance computing applications. Experimental results on an NVIDIA Quadro 4000 show that the proposed algorithm runs 21 times faster than a CPU implementation and achieves 16 GFLOPS in single precision for dense matrices of size 4,096. This performance indicates that the algorithm is practical for real applications.
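
The data-parallel idea behind the two operations can be illustrated with a small CPU emulation. The sketch below is not the paper's shader; it assumes a one-invocation-per-element mapping, a row-major layout, and a 16 x 16 work group, and it folds both operations into the single expression c = alpha * a + b (alpha = 1 gives the addition, b = 0 gives the scalar multiplication). The names LOCAL_SIZE, invocation, and dispatch are hypothetical and chosen only for illustration.

```python
# CPU emulation of a one-invocation-per-element dispatch (illustrative only).
# LOCAL_SIZE, invocation() and dispatch() are hypothetical names, not from the paper.

LOCAL_SIZE = 16  # assumed work-group edge length (16 x 16 invocations per group)


def invocation(gx, gy, n, alpha, a, b, c):
    """Work of a single invocation: update one element of the n x n result."""
    if gx >= n or gy >= n:   # padding invocations outside the matrix do nothing
        return
    i = gy * n + gx          # row-major linear index of element (gy, gx)
    c[i] = alpha * a[i] + b[i]


def dispatch(n, alpha, a, b):
    """Run all invocations one after another; a GPU would run them concurrently."""
    groups = (n + LOCAL_SIZE - 1) // LOCAL_SIZE  # ceil(n / LOCAL_SIZE) groups per axis
    c = [0.0] * (n * n)
    for wy in range(groups):                 # work-group index, y
        for wx in range(groups):             # work-group index, x
            for ly in range(LOCAL_SIZE):     # local index inside the group, y
                for lx in range(LOCAL_SIZE): # local index inside the group, x
                    invocation(wx * LOCAL_SIZE + lx, wy * LOCAL_SIZE + ly,
                               n, alpha, a, b, c)
    return c
```

For example, with n = 3, a holding 0.0 through 8.0, b filled with 1.0, and alpha = 2.0, dispatch returns the nine values 1.0, 3.0, 5.0, up to 17.0. Because no invocation reads or writes an element owned by another, the order of the loops is irrelevant, which is what allows the same work to be spread across thousands of GPU threads.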
