Improving Performance of Large Sparse Linear System Solvers on Distributed Memory Systems by Asynchronous Algorithms

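  • Pil-Seong Park (Department of Computer Science, The University of Suwon);
  • Soon-Cheol Shin (Researcher, CSE Center, Samsung Advanced Institute of Technology)
  • Published: 2001.12.01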

Abstract

Most parallel algorithms in use today are synchronous: correct computation requires processor synchronization, and good performance requires a balanced workload. When the workload cannot be balanced, or when processors differ in speed as in a heterogeneous cluster, overall performance is determined by the slowest processor. Asynchronous iteration is one way to mitigate this problem, but most work so far has targeted shared memory systems, where implementation is comparatively easy. In this paper, we propose an asynchronous parallel algorithm for solving large sparse linear systems on distributed memory systems, which improves overall performance by reducing the idle time of fast processors as much as possible, and we implement it on a cluster.
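
The idea behind asynchronous iteration can be illustrated with a small sequential simulation. The sketch below is not the algorithm proposed in the paper. It is a minimal block Jacobi example in which two simulated processors of different speed each update their own unknowns without waiting for the other. The function name `async_block_jacobi`, the `periods` parameter, and the test matrix are all assumptions made for illustration. The matrix is strictly diagonally dominant, a standard sufficient condition for this kind of iteration to converge even when some values are stale.

```python
def async_block_jacobi(A, b, blocks, periods, ticks=200):
    """Simulate asynchronous block Jacobi for A x = b (illustrative only).

    blocks[p]  : indices of the unknowns owned by simulated processor p
    periods[p] : processor p publishes an update every periods[p] ticks
                 (1 = fast processor, larger values = slower processor)
    """
    n = len(b)
    x = [0.0] * n  # latest values published by each processor
    for tick in range(1, ticks + 1):
        snapshot = x[:]  # values visible to everyone at the start of this tick
        for p, owned in enumerate(blocks):
            if tick % periods[p] != 0:
                continue  # this processor is still busy; the others do not wait
            for i in owned:
                s = sum(A[i][j] * snapshot[j] for j in range(n) if j != i)
                x[i] = (b[i] - s) / A[i][i]
    return x


# Hypothetical 4x4 system whose exact solution is [1, 2, 3, 4].
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 3.0]]
b = [2.0, 4.0, 6.0, 9.0]

# Processor 0 updates every tick; processor 1 only every third tick.
x = async_block_jacobi(A, b, blocks=[[0, 1], [2, 3]], periods=[1, 3])
print(x)
```

In this simulation the fast processor keeps iterating with the last values it received from the slow one instead of idling at a barrier. On a real distributed memory system the shared list would be replaced by message passing, which this sketch does not attempt to model.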
