http://dx.doi.org/10.3745/KIPSTA.2005.12A.5.421

An Application-Level Fault Tolerant Linear System Solver Using an MPMD Type Asynchronous Iteration  

Park, Pil-Seong (Department of Computer Science, College of IT, The University of Suwon)
Abstract
In large-scale parallel computation, the failure of a processor or communication link can waste a huge number of CPU hours. However, MPI in its current specification gives the user no way to handle such failures. In this paper, we propose an application-level fault-tolerant linear system solver that uses an MPMD-type asynchronous iteration and relies purely on the MPI standard, without any non-standard fault-tolerant MPI library.
Keywords
Fault Tolerant; Multiple Program Multiple Data; Linear System; Asynchronous Iteration; Message Passing Interface;