Search | Korea Science

MPIRace-Check V 1.0: A Tool for Detecting Message Races in MPI Parallel Programs (MPIRace-Check V 1.0: MPI 병렬 프로그램의 메시지경합 탐지를 위한 도구)

Park, Mi-Young;Chung, Sang-Hwa
- The KIPS Transactions: Part A / v.15A no.2 / pp.87-94 / 2008
Message races should be detected to debug message-passing programs effectively, because they can cause non-deterministic executions of a program. Previous tools for detecting message races report that a message race occurs in every receive operation that can receive any message. However, a message race might not occur in such a receive operation if each message is transmitted through a different logical communication channel, so these incorrect reports make it difficult for programmers to debug programs. In this paper we suggest a tool, MPIRace-Check, which can detect message races exactly by checking the concurrency between send/receive operations and by inspecting the logical communication channels of the messages. To detect message races, the tool uses vector timestamps to check whether send and receive operations are concurrent during an execution of a program, and it uses the message envelope to inspect whether the logical communication channels of the transmitted messages are the same. In our experiment, we show that our tool can detect message races exactly and efficiently using MPI_RTED and a benchmark program. By detecting message races exactly, our tool therefore enables programmers to develop reliable parallel programs while reducing the burden of debugging.
https://doi.org/10.3745/KIPSTA.2008.15-A.2.087
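
The two checks named in the abstract above, concurrency via vector timestamps and channel equality via the message envelope, can be illustrated with a minimal sketch. This is not the MPIRace-Check implementation: the names `happens_before`, `concurrent`, `Envelope`, `matches`, and `may_race` are hypothetical, the envelope is assumed to hold only a communicator, source, and tag, and wildcards (such as `MPI_ANY_SOURCE`) are modelled as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def happens_before(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if vector timestamp a causally precedes b (a <= b componentwise, a != b)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """Two events are concurrent when neither causally precedes the other."""
    return not happens_before(a, b) and not happens_before(b, a)


@dataclass(frozen=True)
class Envelope:
    """Simplified message envelope (assumption: communicator, source, tag only)."""
    comm: str
    source: Optional[int]  # None models a wildcard source on the receive side
    tag: Optional[int]     # None models a wildcard tag on the receive side


def matches(recv: Envelope, send: Envelope) -> bool:
    """True if the send travels on a logical channel the receive accepts."""
    return (recv.comm == send.comm
            and recv.source in (None, send.source)
            and recv.tag in (None, send.tag))


def may_race(recv: Envelope,
             send1: Envelope, ts1: Sequence[int],
             send2: Envelope, ts2: Sequence[int]) -> bool:
    """Two sends may race at a receive only if they are concurrent
    and both use a channel that the receive accepts."""
    return concurrent(ts1, ts2) and matches(recv, send1) and matches(recv, send2)
```

Under these assumptions, a wildcard receive is flagged only when both conditions hold; sends that are causally ordered, or that use a tag the receive does not accept, are not reported.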

Efficient Checkpoint Algorithm for Message-Passing Parallel Applications on Cloud Computing (클라우드컴퓨팅에서 메시지패싱방식 응용프로그램의 효율적인 체크포인트 알고리즘)

Le, Duc Tai;Dao, Manh Thuong Quan;Ahn, Min-Joon;Choo, Hyun-Seung
- Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.156-157 / 2011
In this work, we study the checkpoint/restart problem for message-passing parallel applications running in a cloud computing environment. This is a new direction that arises from the trend of enabling applications to run in the cloud. The main objective is to propose an efficient checkpoint algorithm for message-passing parallel applications that considers communications with external systems. We further implement the novel algorithm by modifying gSOAP and OpenMPI, open source libraries that support service calls and checkpointing of message-passing parallel programs, respectively. The simulation showed that the additional costs the algorithm imposes on executing and checkpointing the application are negligible. Ultimately, the algorithm efficiently supports the checkpoint/restart service for message-passing parallel applications that send requests to external services.
https://doi.org/10.3745/PKIPS.y2011m04a.156

On-the-fly Detection of Race Conditions in Message-Passing Programs (메시지 전달 프로그램에서의 수행 중 경합탐지)

Park, Mi-Young;Kang, Moon-Hye;Jun, Yong-Kee;Park, Hyuk-Ro
- Journal of KIISE: Computer Systems and Theory / v.34 no.7 / pp.267-275 / 2007
Message races should be detected when debugging message-passing parallel programs because they can cause non-deterministic executions. In particular, it is important to detect the first race in each process because the first race can cause other races to occur in the same process. Previous techniques for detecting the first races require more than two monitored runs of a program or analyze a trace file whose size is proportional to the number of messages. In this paper we introduce an on-the-fly technique that detects the first race in each process without generating any trace file. In the experiment we test the accuracy of our technique with several benchmark programs, and the results show that it detects the first race in each process in all of them.

Race State Transition for Detecting Unaffected Race Conditions in Message-Passing Programs (메시지전달 프로그램의 영향받지 않은 경합조건 탐지를 위한 경합상태 전이기법)

Park Mi-Young;Kang Hyun-Syug;Jun Yong-Kee
- Journal of KIISE: Computer Systems and Theory / v.33 no.8 / pp.495-504 / 2006
Detecting unaffected race conditions is important for debugging message-passing programs effectively, because such a message race can affect whether other races occur. The previous technique for efficiently detecting unaffected races detects racing messages by halting at the receive event of the first race to occur in each process. However, this technique does not guarantee that all of the detected races are unaffected, because halting such processes disconnects some chains of affects-relations among those races. In this paper, we present a novel technique that manages the state of each detected race by examining whether every received message is affected until the execution terminates. Our technique therefore guarantees efficient detection of unaffected races, because it maintains the affects-relations of the races throughout the execution of the program.

Implementation and Performance Analysis of High Performance Computing Library for Parallel Processing (병렬처리를 위한 고성능 라이브러리의 구현과 성능 평가)

김영태;이용권
- Journal of KIISE: Computer Systems and Theory / v.31 no.7 / pp.379-386 / 2004
We designed a portable parallel library, HPCL (High Performance Computing Library), with the following objectives: (1) to keep the parallel code closely related to the original sequential code, which will help with future versions of the sequential code, and (2) to enhance the performance of the parallel code. The library is an interface, written in C and Fortran, between MPI (Message Passing Interface) and parallel programs in Fortran. Performance results were obtained on clusters of PCs and an IBM SP4.

Scalable Race Visualization for Debugging Message-Passing Programs (메시지전달 프로그램의 디버깅을 위한 경합의 확장적 시각화)

Park Mi-Young;Jun Yong-Kee
- Journal of KIISE: Computer Systems and Theory / v.32 no.7 / pp.341-348 / 2005
Detecting unaffected race conditions is important for debugging message-passing programs effectively, because such races can influence whether other races occur. The previous technique for detecting unaffected races detects a race by halting the execution of a process at the receive event of the race that occurs first in the process. However, this technique does not guarantee that all of the detected races are unaffected, because halting the execution of processes disconnects some chains of affects-relations among those races. In this paper, we improve the second-pass algorithm of the previous technique by producing information about the affects-relations of the races that occur first in each process. We then visualize the affects-relations among the races detected in each process. This visualization helps in visually detecting unaffected races by simplifying the affects-relations among the races that occur first in each process.

An Implementation of Fault-Tolerant Message Passing Interface on Parallel Computers (병렬 컴퓨터에서의 결함 허용 메시지 전달 인터페이스 구현)

Song, Dae-Ki;Lee, Cheol-Hoon
- Journal of KIISE: Computing Practices and Letters / v.6 no.3 / pp.319-328 / 2000
The Message-Passing Interface (MPI) is a standard interface for parallel programming environments, on which application programs run on the processors of a parallel computer. Processor nodes execute the processes that make up the program by passing messages to one another. During execution, however, if a fault occurs on a processor node or in a process, it results in an inconsistent state, and consequently the whole program has to be stopped. To solve this problem, in this paper we propose a fault-tolerant message passing interface (FT-MPI) by adding a fault manager module to MPI. The proposed FT-MPI does not need any hardware support, and each application program based on MPI can run on FT-MPI without any modification. The proposed fault tolerance scheme uses the so-called hot-spare process duplication method, and simulations verified that application programs run despite any fault with less than 5% overhead in execution time.

Large Eddy Simulation of Turbulent Flow around a Ship Model Using Message Passing Interface (병렬계산기법을 이용한 선체주위 점성유동장의 LES해석)

Choi, Hee-Jong;Yoon, Hyun-Sik;Chun, Ho-Hwan;Kang, Dae-Hwan;Park, Jong-Chun
- Journal of Ocean Engineering and Technology / v.20 no.4 s.71 / pp.76-82 / 2006
The large-eddy simulation (LES) technique, based on a message passing interface (MPI) method, was applied to investigate the turbulent flow phenomena around a ship. The Smagorinsky model was used in the present LES to simulate the turbulent flow around a ship. The SPMD (single program multiple data) technique was used to parallelize the program using MPI. All computations were performed on a 24-node PC cluster parallel machine, composed of 2.6 GHz CPUs, which had been installed in the Advanced Ship Engineering Research Center (ASERC). Numerical simulations were performed for the Wigley hull and the Series 60 hull (CB=0.6) using 1/4-, 1/2-, 1- and 2-million grid systems, and the computational results were compared to the experimental ones.

A Dynamic Co-scheduling Scheme for MPI-based Parallel Programs on Linux Clusters (리눅스 클러스터에서 MPI 기반 병렬 프로그램의 동적 동시 스케줄링 기법)

Kim, Hyuk;Rhee, Yun-Seok
- Journal of the Korea Society of Computer and Information / v.13 no.1 / pp.29-35 / 2008
For efficient message passing in parallel programs, the two processes involved, which run on different nodes, must be scheduled at the same time; this is called 'co-scheduling'. However, each node of a cluster system is built on top of a general-purpose multitasking OS, which autonomously manages local processes. Thus it is not easy to co-schedule two (or more) processes in such a computing environment. Our work proposes a co-scheduling scheme for MPI-based parallel programs that exploits message exchange information between the two parties. We implement the scheme on a Linux cluster, which requires slight kernel hacking and MPI library modification. The experiment with the NPB parallel suite shows that our scheme yields a 33-56% reduction in execution time compared to the typical scheduling case, and especially better performance in more communication-bound applications.

Communication Schedule for GEN_BLOCK Redistribution (GEN_BLOCK간 재분산을 위한 통신 스케줄)

Yook, Hyun-Gyoo;Park, Myong-Soon
- Journal of KIISE: Computer Systems and Theory / v.27 no.5 / pp.450-463 / 2000
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. GEN_BLOCK redistribution, which is redistribution between different GEN_BLOCKs, is essential for load balancing. However, prior research on redistribution has focused on regular redistribution, such as redistribution between different CYCLIC(N)s. GEN_BLOCK redistribution is very different from regular redistribution. Message passing in regular redistribution involves repetitions of basic message passing patterns, while message passing for GEN_BLOCK redistribution shows locality. This paper proves that two optimality conditions, reducing the number of communication steps and minimizing redistribution size, are essential in GEN_BLOCK redistribution. Additionally, by adding a relocation phase to list scheduling, we obtain an optimal scheduling algorithm for GEN_BLOCK redistribution. To evaluate the performance of the algorithm, we performed experiments on a CRAY T3E. The experiments show that the scheduling algorithm performs better and that the conditions are critical in enhancing the communication speed of GEN_BLOCK redistribution.
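
The locality of GEN_BLOCK redistribution mentioned in the abstract above can be seen by computing which messages a redistribution needs. The following is a minimal sketch, not the paper's scheduling algorithm: it only enumerates the messages, and it assumes a one-dimensional array, that processor `p` is the same physical processor before and after redistribution, and that local copies need no message. The function names `block_bounds` and `redistribution_messages` are hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def block_bounds(sizes: Sequence[int]) -> List[Tuple[int, int]]:
    """Half-open index range [start, end) owned by each processor."""
    ends = list(accumulate(sizes))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


def redistribution_messages(src_sizes: Sequence[int],
                            dst_sizes: Sequence[int]) -> List[Tuple[int, int, int]]:
    """List (sender, receiver, element_count) for every non-local transfer."""
    if sum(src_sizes) != sum(dst_sizes):
        raise ValueError("source and destination must cover the same array")
    messages = []
    for s, (s_lo, s_hi) in enumerate(block_bounds(src_sizes)):
        for d, (d_lo, d_hi) in enumerate(block_bounds(dst_sizes)):
            # Elements in both the source block of s and the destination block of d.
            overlap = min(s_hi, d_hi) - max(s_lo, d_lo)
            if overlap > 0 and s != d:  # skip data that stays on the same processor
                messages.append((s, d, overlap))
    return messages


# Redistribute 16 elements from blocks of (4, 4, 4, 4) to blocks of (2, 6, 5, 3).
print(redistribution_messages([4, 4, 4, 4], [2, 6, 5, 3]))
# prints [(0, 1, 2), (3, 2, 1)]
```

Because every block is a contiguous index range, each source block overlaps only a contiguous run of destination blocks, so the message set stays small and irregular rather than repeating a fixed pattern.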

Search Result 27, Processing Time 0.022 seconds
