Performance Comparison of Two Parallel LU Decomposition Algorithms on MasPar Machines

  • Kim, Yong-Tae (김영태) (Dept. of Computer Science, Kangnung National Univ.)
  • Published: 1998.12.01

Abstract

This paper presents a performance study of two LU decomposition algorithms, one blocked and one nonblocked, on two massively parallel SIMD machines: the 16K-processor MasPar MP-1 and the 4K-processor MasPar MP-2. It reports experimental results and analyzes the algorithms to explain those results. While the blocked and nonblocked algorithms for LU decomposition have each been studied individually by others, we compare the two directly and identify the tradeoffs between them. Our analysis of the blocked algorithm shows how the block size affects the interprocessor communication cost and the memory read/write overhead, and can therefore be used to determine an optimum block size for the blocked algorithm.
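For readers unfamiliar with the two variants, the following is a minimal serial sketch of nonblocked and blocked LU decomposition. It is not the paper's MasPar implementation: it assumes no pivoting (all pivots nonzero), stores L and U in place, and the names `lu_nonblocked`, `lu_blocked`, `reconstruct`, and the block-size parameter `b` are hypothetical, chosen only for illustration.

```python
def lu_nonblocked(a):
    """In-place LU without pivoting (assumption: every pivot is nonzero).

    On return the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle, including the diagonal, holds U.
    """
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        for i in range(k + 1, n):
            a[i][k] /= pivot              # multiplier l_ik
            m = a[i][k]
            for j in range(k + 1, n):     # update the whole trailing row
                a[i][j] -= m * a[k][j]
    return a


def lu_blocked(a, b):
    """Same factorization, organized in column panels of width b."""
    n = len(a)
    for k0 in range(0, n, b):
        k1 = min(k0 + b, n)
        # Step 1: factor the panel (columns k0..k1-1) with the nonblocked scheme.
        for k in range(k0, k1):
            pivot = a[k][k]
            for i in range(k + 1, n):
                a[i][k] /= pivot
                m = a[i][k]
                for j in range(k + 1, k1):   # updates stay inside the panel
                    a[i][j] -= m * a[k][j]
        # Step 2: block row of U, i.e. forward substitution with the panel's L11.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                m = a[i][k]
                for j in range(k1, n):
                    a[i][j] -= m * a[k][j]
        # Step 3: trailing submatrix update, A22 -= L21 * U12.
        for i in range(k1, n):
            for k in range(k0, k1):
                m = a[i][k]
                for j in range(k1, n):
                    a[i][j] -= m * a[k][j]
    return a


def reconstruct(lu):
    """Multiply the packed factors back together to check L * U == A."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]   # unit diagonal of L
                s += l_ik * lu[k][j]
            out[i][j] = s
    return out
```

In this sketch the block size only changes how the same arithmetic is grouped: the trailing matrix is touched once per panel rather than once per column. On a parallel machine that grouping is what determines how often data must be communicated and how often memory is read and written, which is the tradeoff the abstract refers to.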
