• Title/Summary/Keyword: message passing interface (MPI)

Search Result 115, Processing Time 0.03 seconds

An Application-Level Fault Tolerant System For Synchronous Parallel Computation (동기 병렬연산을 위한 응용수준의 결함 내성 연산시스템)

  • Park, Pil-Seong
    • Journal of Internet Computing and Services
    • /
    • v.9 no.5
    • /
    • pp.185-193
    • /
    • 2008
  • An MTBF(mean time between failures) of large scale parallel systems is known to be only an order of several hours, and large computations sometimes result in a waste of huge amount of CPU time, However. the MPI(Message Passing Interface), a de facto standard for message passing parallel programming, suggests no possibility to handle such a problem. In this paper, we propose an application-level fault tolerant computation system, purely on the basis of the current MPI standard without using any non-standard fault tolerant MPI library, that can be used for general scientific synchronous parallel computation.

  • PDF

The Design of Hardware MPI Units for MPSoC (MPSoC를 위한 저비용 하드웨어 MPI 유닛 설계)

  • Jeong, Ha-Young;Chung, Won-Young;Lee, Yong-Surk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.1B
    • /
    • pp.86-92
    • /
    • 2011
  • In this paper, we propose a novel hardware MPI(Message Passing Interface) unit which supports message passing in multiprocessor system which use distributed memory architecture. MPI Hardware unit processes data synchronization, transmission and completion, and it supports processor non-blocking operation so it reduces overhead according to synchronization. Additionally, MPI hardware unit combines ready entry, request entry, reserve entry which save and manage the synchronized messages and performs the multiple outstanding issue and out of order completion. According to BFM(Bus Functional Model) simulation result, the performance is increased by 25% on many to many communication. After we designed MPI unit using HDL, with synopsys design compiler we synthesized, and for synthesis library we used MagnaChip $0.18{\mu}m$. And then we making prototype chip. The proposed message transmission interface hardware shows high performance for its increase in size. Thus, as we consider low-cost design and scalability, MPI hardware unit is useful in increasing overall performance of embedded MPSoC(Multi-Processor System-on-Chip).

An Implementation of Fault-Tolerant Message Passing Interface on Parallel Computers (병렬 컴퓨터에서의 결함 허용 메시지 전달 인터페이스 구현)

  • Song, Dae-Ki;Lee, Cheol-Hoon
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.3
    • /
    • pp.319-328
    • /
    • 2000
  • The Message-Passing Interface(MPI) is a standard interface for parallel programming environment, based on that application programs run on the processors of a parallel computer. Processor nodes execute processes consisting the program by passing messages to one another. During executing, however, if a fault occurs on a processor node or a process, this will result an inconsistent state, and consequently, the whole program will have to be stopped. To solve this problem, in this paper, we propose a fault-tolerant message passing interface(FT-MPI) by adding a fault manager module to MPI. The proposed FT-MPI does not need any hardware support, and each application program based on MPI can run on the FT-MPI without any modification. The proposed fault tolerance scheme uses the so-called hot-spare process duplication method, and verified by simulations that application programs run despite of any fault with less than 5% overhead on execution time.

  • PDF

Implementation and Performance Evaluation of Socket and RMI based Java Message Passing Systems (소켓 및 RMI 기반 자바 메시지 전달 시스템의 구현 및 성능평가)

  • Bang, Seung-Jun;Ahn, Jin-Ho
    • Journal of Internet Computing and Services
    • /
    • v.8 no.5
    • /
    • pp.11-20
    • /
    • 2007
  • This paper designs and implements a message passing library called JMPI (Java Message Passing Interface) which complies with MPJ (Message Passing in Java), the MPI standard Specification for Java language, This library provides some graphic user interface tools to enable parallel computing environments to be configured very simply by their administrators and JMPI applications to be executed very conveniently. Also in this paper, we implement two versions of systems using Socket and RPC which are both typical distributed system communication mechanisms and with three benchmark applications, compare performance of these systems with that of an existing system JPVM depending on the increasing number of the computers. Experimental results show that our systems outperform JPVM system in terms of various aspects and that the most efficient processing speedup can be obtained by increasing the number of the computers in consideration of network traffic through processing evaluation. Finally, we can see that, as the number of computers increases, using RMI to transmit a message is more effective than using object streams attached to sockets to transmit a message.

  • PDF

Processing-Node Status-based Message Scattering and Gathering for Multi-processor Systems on Chip

  • Park, Jongsu
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.4
    • /
    • pp.279-284
    • /
    • 2019
  • This paper presents processing-node status-based message scattering and gathering algorithms for multi-processor systems on chip to reduce the communication time between processors. In the message-scattering part of the message-passing interface (MPI) scatter function, data transmissions are ordered according to the proposed linear algorithm, based on the processor status. The MPI hardware unit in the root processing node checks whether each processing node's status is 'free' or 'busy' when an MPI scatter message is received. Then, it first transfers the data to a 'free' processing node, thereby reducing the scattering completion time. In the message-gathering part of the MPI gather function, the data transmissions are ordered according to the proposed linear algorithm, and the gathering is performed. The root node receives data from the processing node that wants to transfer first, and reduces the completion time during the gathering. The experimental results show that the performance of the proposed algorithm increases at a greater rate as the number of processing nodes increases.

Parallel FFT and Quick-Merge Sort on the Reflective Memory Networked Computers and a Cluster of Work-stations

  • Lee, Changhun;Kwon, Wook-Hyun
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2002.10a
    • /
    • pp.94.1-94
    • /
    • 2002
  • This paper is concerned with parallel FFT and Quick-Merge Sort. They are implemented on computers interconnected by VMIC 5579 reflective memory and a cluster of workstations (PCs) interconnected via Fast Ethernet. Message passing interface (MPI) parallel library is used for communication in a cluster of workstations. An improved parallel FFT is also presented to decrease an execution time in the case of a small number of hosts. Distributed shared memory (DSM), VMIC 5579 reflective memory (RM), a cluster of workstations (COW) and message passing interface (MPI) parallel library are described.

  • PDF

Comparison of Message Passing Interface and Hybrid Programming Models to Solve Pressure Equation in Distributed Memory System (분산 메모리 시스템에서 압력방정식의 해법을 위한 MPI와 Hybrid 병렬 기법의 비교)

  • Jeon, Byoung Jin;Choi, Hyoung Gwon
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.39 no.2
    • /
    • pp.191-197
    • /
    • 2015
  • The message passing interface (MPI) and hybrid programming models for the parallel computation of a pressure equation were compared in a distributed memory system. Both models were based on domain decomposition, and two numbers of the sub-domain were selected by considering the efficiency of the hybrid model. The parallel performances for various problem sizes were measured using up to 96 threads. It was found that in addition to the cache-memory size, the overhead of the MPI communication/OpenMP directives affected the parallel performance. For small problems, the parallel performance was low because the percentage of the overhead of the MPI communication/OpenMP directives increased as the number of threads increased, and MPI was better than the hybrid model because it had a smaller communication overhead. For large problems, the parallel performance was high because, in addition to the cache effect, the percentage of the communication overhead was relatively low compared to that for small problems, and the hybrid model was better than MPI because the communication overhead of MPI was more dominant than that of the OpenMP directives in the hybrid model.

Implementation and Performance Analysis of High Performance Computing Library for Parallel Processing (병렬처리를 위한 고성능 라이브러리의 구현과 성능 평가)

  • 김영태;이용권
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.7
    • /
    • pp.379-386
    • /
    • 2004
  • We designed a portable parallel library HPCL(High Performance Computing Library) with following objectives: (1) to provide a close relationship between the parallel code and the original sequential code that will help future versions of the sequential code and (2) to enhance performance of the parallel code. The library is an interface written in C and Fortran programming languages between MPI(Message Passing Interface) and parallel programs in Fortran. Performance results were determined on clusters of PC's and IBM SP4.

Parallelization of Raster GIS Operations Using PC Clusters (PC 클러스터를 이용한 래스터 GIS 연산의 병렬화)

  • 신윤호;박수홍
    • Spatial Information Research
    • /
    • v.11 no.3
    • /
    • pp.213-226
    • /
    • 2003
  • With the increasing demand of processing massive geographic data, conventional GISs based on the single processor architecture appear to be problematic. Especially, performing complex GIS operations on the massive geographic data is very time consuming and even impossible. This is due to the processor speed development does not keep up with the data volume to be processed. In the field of GIS, this PC clustering is one of the emerging technology for handling massive geographic data effectively. In this study, a MPI(Message Passing Interface)-based parallel processing approach was conducted to implement the existing raster GIS operations that typically requires massive geographic data sets in order to improve the processing capabilities and performance. Specially for this research, four types of raster CIS operations that Tomlin(1990) has introduced for systematic analysis of raster GIS operation. A data decomposition method was designed and implemented for selected raster GIS operations.

  • PDF

Application of a Parallel Asynchronous Algorithm to Some Grid Problems on Workstation Clusters

  • Park, Pil-Seong
    • Ocean and Polar Research
    • /
    • v.23 no.2
    • /
    • pp.173-179
    • /
    • 2001
  • Parallel supercomputing is now a must for oceanographic numerical modelers. Most of today's parallel numerical schemes use synchronous algorithms, where some processors that have finished their tasks earlier than others must wait at synchronization points for correct computation. Hence, the load balancing is a crucial factor, however, it is, in general, difficult to achieve on heterogeneous workstation clusters. We devise an asynchronous algorithm that reduces the idle times of faster processors, and discuss application of the algorithm to some grid problems and implementation on a workstation cluster using Message Passing Interface (MPI).

  • PDF