Scalable Race Visualization for Debugging Message-Passing Programs  

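Park Mi-Young (Department of Computer Science, Gyeongsang National University)
Jun Yong-Kee (Department of Computer Science, Gyeongsang National University)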
Abstract
Detecting unaffected race conditions is important for debugging message-passing programs effectively, because such races can cause other races to occur or not to occur. The previous technique for detecting unaffected races detects a race by halting the execution of a process at the receive event of the race that occurs first in that process. However, this technique does not guarantee that all of the detected races are unaffected, because halting the execution of processes disconnects some chains of affects-relations among those races. In this paper, we improve the second-pass algorithm of the previous technique by producing information about the affects-relations of the races that occur first in each process. We then visualize the affects-relations among the races detected in each process. This visualization helps in identifying unaffected races visually by simplifying the affects-relations among the races that occur first in each process.
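As a rough illustration of the idea only, the sketch below treats the first race of each process as a node and each affects-relation as a directed edge, then reports the races that no other race affects. The abstract does not specify the paper's data structures or algorithm, so the function name `unaffected_races`, the race labels, and the edge-list representation are all assumptions made for this example.

```python
def unaffected_races(races, affects):
    """Return the races that are not affected by any other race.

    races   -- list of race labels (hypothetical: one first race per process)
    affects -- list of (src, dst) pairs meaning "race src affects race dst"
               (hypothetical representation of the affects-relation)
    """
    # A race is affected if at least one *other* race has an edge into it.
    # Checking direct edges is enough: any race reached through a longer
    # chain also has a direct predecessor on that chain.
    affected = {dst for src, dst in affects if src != dst}
    return [r for r in races if r not in affected]


# Hypothetical example: r1 affects r2, r2 affects r3, r4 is independent.
races = ["r1", "r2", "r3", "r4"]
affects = [("r1", "r2"), ("r2", "r3")]
result = unaffected_races(races, affects)
print(result)
```

In this toy input, `r1` and `r4` have no incoming affects-relation, so they are the candidates a programmer would inspect first. If halting a process had dropped the edge from `r1` to `r2`, then `r2` would wrongly appear unaffected, which is the problem the abstract attributes to the previous technique.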
Keywords
message-passing programs; message race; debugging; unaffected races; visualization;