• Title/Summary/Keyword: MPI (Message Passing Interface)

Construction of a CPU Cluster and Implementation of a 3-D Domain Decomposition Parallel FDTD Algorithm (CPU 클러스터 구축 및 3차원 공간분할 병렬 FDTD 알고리즘 구현)

  • Park, Sungmin;Chu, Kwang-Uk;Ju, Saehoon;Park, Yoon-Mi;Kim, Ki-Baek;Jung, Kyung-Young
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.25 no.3 / pp.357-364 / 2014
  • In this work, we construct a CPU cluster to implement a parallel finite-difference time-domain (FDTD) algorithm for fast electromagnetic analyses. Compared to a single-processor FDTD counterpart, the parallel FDTD algorithm can reduce the computational time significantly and can also analyze electrically larger structures. The parallel FDTD algorithm needs communication between neighboring processors, which is performed with the MPI (Message Passing Interface) library, and a 3-D domain decomposition is employed to decrease the communication time between neighboring processors. We investigate the speedup factor of the CPU-cluster-based parallel FDTD algorithm relative to a single-processor FDTD for the normal mode and the hypermode, and finally analyze an electrically large concrete structure with the developed parallel algorithm.
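
The benefit of a 3-D decomposition over a 1-D (slab) decomposition can be illustrated with a rough count of the cells one subdomain exchanges per time step. The sketch below is not taken from the paper: the function name `halo_cells`, the one-cell-thick halo, the 512-cubed grid, and the 64-process layouts are all assumptions chosen for illustration, and the count is for an interior subdomain that has two neighbors along every axis that is split.

```python
def halo_cells(grid, procs):
    """Cells an interior subdomain exchanges per step (one-cell-thick halo).

    grid  -- global grid size (nx, ny, nz); hypothetical values, not from the paper
    procs -- number of processes along each axis (px, py, pz)
    """
    # Local block size, assuming the grid divides evenly among the processes.
    lx, ly, lz = (g // p for g, p in zip(grid, procs))
    px, py, pz = procs
    cells = 0
    if px > 1:
        cells += 2 * ly * lz  # two faces normal to x
    if py > 1:
        cells += 2 * lx * lz  # two faces normal to y
    if pz > 1:
        cells += 2 * lx * ly  # two faces normal to z
    return cells


grid = (512, 512, 512)
slab = halo_cells(grid, (64, 1, 1))   # 1-D decomposition: 524288 cells
block = halo_cells(grid, (4, 4, 4))   # 3-D decomposition: 98304 cells
print(slab, block, slab / block)
```

With the same 64 processes, the cubic blocks in this example exchange roughly one fifth of the data that the slabs do, which is the kind of reduction in communication that motivates a 3-D decomposition. Actual communication time also depends on message count and network latency, which this count ignores.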

Comparison of Parallel Computation Performances for 3D Wave Propagation Modeling using a Xeon Phi x200 Processor (제온 파이 x200 프로세서를 이용한 3차원 음향 파동 전파 모델링 병렬 연산 성능 비교)

  • Lee, Jongwoo;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.21 no.4 / pp.213-219 / 2018
  • In this study, we performed 3D wave propagation modeling using a Xeon Phi x200 processor and compared its parallel computation performance with that of a Xeon CPU. Unlike the 1st generation Xeon Phi coprocessor codenamed Knights Corner, the 2nd generation Xeon Phi x200 processor requires no additional communication between the internal memory and the main memory since it can run an operating system directly. The Xeon Phi x200 processor can run large-scale computation independently, using its large main memory and high-bandwidth memory. For comparison of parallel computation, we performed the modeling using MPI (Message Passing Interface) and OpenMP (Open Multi-Processing). Numerical examples using the SEG/EAGE salt model demonstrated that modeling on the Xeon Phi, with its large number of computational cores and high-bandwidth memory, is 2.69 to 3.24 times faster than on the 12-core CPU.

Prestack Depth Migration for Gas Hydrate Seismic Data of the East Sea (동해 가스 하이드레이트 탄성파자료의 중합전 심도 구조보정)

  • Jang, Seong-Hyung;Suh, Sang-Yong;Go, Gin-Seok
    • Economic and Environmental Geology / v.39 no.6 s.181 / pp.711-717 / 2006
  • In order to study gas hydrate, a potential future energy resource, the Korea Institute of Geoscience and Mineral Resources has conducted seismic reflection surveys in the East Sea since 1997. One piece of evidence for the presence of gas hydrate in seismic reflection data is a bottom simulating reflector (BSR). The BSR occurs at the interface between overlying higher-velocity, hydrate-bearing sediment and underlying lower-velocity, free-gas-bearing sediment. It is often characterized by a large reflection coefficient and a reflection polarity opposite to that of the seafloor reflection. Applying depth migration to seismic reflection data requires high-performance computers and a parallelizing technique because of the huge data volume and computation. Phase shift plus interpolation (PSPI) is a useful migration method because of its short computing time and computational efficiency, and it is intrinsically parallelizable in the frequency domain. We conducted conventional data processing for the gas hydrate data of the East Sea and then applied prestack depth migration using message-passing-interface PSPI (MPI_PSPI), which was parallelized with MPI local-area-multi-computer (MPI_LAM). The velocity model was built from the stack velocities after we picked horizons on the stack image with the in-house processing tool Geobit. On the migrated stack section, we found BSRs at about SP 3555-4162 and a two-way travel time of around 2,950 ms in the time domain. In the depth domain, these BSRs appear at a distance of 6-17 km and a depth of 2.1 km from the seafloor. Because the subsurface regions where energy was concentrated were well imaged, acquisition parameters should be chosen so that seismic energy is transmitted to the target area.

Parallel Distributed Implementation of GHT on MPI-based PC Cluster (MPI 기반 PC 클러스터에서 GHT의 병렬 분산 구현)

  • Kim, Yeong-Soo;Kim, Jeong-Sahm;Choi, Heung-Moon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.44 no.3 / pp.81-89 / 2007
  • This paper presents a parallel distributed implementation of the GHT (generalized Hough transform) for fast processing on an MPI-based PC cluster. We seek higher speedup mainly by alleviating the communication overhead through a pipelined broadcast and an accumulator array partitioning strategy, and by overlapping communication with computation over the entire process. Experimental results show that nearly linear speedup is reachable with the proposed method on MPI-based PC clusters connected through a 100 Mbps Ethernet switch.

A Simple and Fast Web Alignment Tool for Large Amount of Sequence Data

  • Lee, Yong-Seok;Oh, Jeong-Su
    • Genomics & Informatics / v.6 no.3 / pp.157-159 / 2008
  • Multiple sequence alignment (MSA) is the most important step in many biological sequence analyses, homology searches, and protein structural assignments. However, large amounts of data make it difficult for biologists to perform MSA analyses, and aligning many sequences requires much computational time. Here, we have developed a simple and fast web alignment tool for aligning, editing, and visualizing large amounts of sequence data. We used a cluster server with ClustalW-MPI installed, together with web services and the message passing interface (MPI). The tool also enables users to edit multiple sequence alignments manually and to download the input data and results such as alignments and phylogenetic trees.

Parallelization of a Two-Dimensional Navier-Stokes Solver Using Hybrid Meshes (혼합격자를 이용한 2차원 난류 유동장 해석 프로그램의 병렬화)

  • Ok Honam;Park Seung-O
    • 한국전산유체공학회:학술대회논문집 / 1999.11a / pp.115-126 / 1999
  • A two-dimensional Navier-Stokes solver using hybrid meshes is parallelized with a domain decomposition method. The focus of this paper is on minimizing the effort needed to parallelize the serial version of the solver, which is achieved by adding an additional layer of cells to each decomposed domain. Most subroutines of the serial solver are used without modification, and the information exchange between neighboring domains is achieved using the MPI (Message Passing Interface) library. Load balancing among the processors and scheduling of the message passing are implemented to reduce the overhead of parallelization, and the speedup achieved by parallelization is measured on transonic inviscid and turbulent flow problems. The parallelization efficiencies of the explicit Runge-Kutta scheme and the implicit point-SGS scheme are compared, and the effects of various factors on the results are also studied.
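
The idea of adding an extra layer of cells so that serial routines run unmodified can be shown on a toy 1-D problem. The sketch below is an illustration under stated assumptions, not the solver from the paper: the three-point averaging kernel `smooth`, the two-domain split, and the helper names are hypothetical, and `exchange_ghosts` uses a plain list copy as a stand-in for the MPI send/receive between neighboring domains.

```python
def smooth(u):
    """Serial kernel: average each interior cell with its two neighbors.

    The first and last cells are left unchanged (boundary or ghost cells).
    """
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def exchange_ghosts(left, right):
    """Stand-in for an MPI send/receive pair between two neighboring domains."""
    left[-1] = right[1]   # left's ghost cell gets right's first owned cell
    right[0] = left[-2]   # right's ghost cell gets left's last owned cell


def run_serial(u, steps):
    """Apply the kernel to the whole domain."""
    for _ in range(steps):
        u = smooth(u)
    return u


def run_decomposed(u, steps):
    """Split the domain in two, add one ghost cell per domain, reuse smooth()."""
    mid = len(u) // 2
    left = u[:mid] + [0.0]    # owned cells plus one ghost cell on the right
    right = [0.0] + u[mid:]   # one ghost cell on the left plus owned cells
    for _ in range(steps):
        exchange_ghosts(left, right)               # refresh ghost cells first
        left, right = smooth(left), smooth(right)  # unmodified serial kernel
    return left[:-1] + right[1:]                   # drop ghosts, reassemble


u0 = [0.0, 0.0, 0.0, 0.0, 9.0, 0.0, 0.0, 0.0]
print(run_decomposed(u0, 5) == run_serial(u0, 5))
```

Because each domain sees its neighbor's edge value through the ghost cell, the serial kernel needs no knowledge of the decomposition, and the decomposed run reproduces the serial result. In a real MPI code the copy in `exchange_ghosts` would be replaced by message passing between processes.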

Design and Implementation of a Grid System META for Executing CFD Analysis Programs on Distributed Environment (분산 환경에서 CFD 분석 프로그램 수행을 위한 그리드 시스템 META 설계 및 구현)

  • Kang, Kyung-Woo;Woo, Gyun
    • The KIPS Transactions:PartA / v.13A no.6 s.103 / pp.533-540 / 2006
  • This paper describes the design and implementation of a grid system, META (Metacomputing Environment using Test-run of Application), which facilitates the execution of a CFD (Computational Fluid Dynamics) analysis program in a distributed environment. META allows CFD program developers to access computing resources distributed over the network as if they were one computer system. The research issues involved in grid computing include fault tolerance, computing resource selection, and user-interface design. In this paper, we exploit an automatic resource selection scheme for executing parallel SPMD (Single Program Multiple Data) applications written in MPI (Message Passing Interface). The proposed resource selection scheme is informed by the network latency and by the elapsed time of the kernel loop obtained from a test run. The network latency strongly influences execution performance when a parallel program is distributed and executed over several systems. The elapsed time of the kernel loop can be used as an estimator of the whole execution time of a CFD program because of a common characteristic of CFD programs: the kernel loop consumes over 90% of the whole execution time.

A Distributed Stock Cutting using Mean Field Annealing and Genetic Algorithm

  • Hong, Chul-Eui
    • Journal of information and communication convergence engineering / v.8 no.1 / pp.13-18 / 2010
  • The composite stock cutting problem is defined as allocating rectangular and irregular patterns onto a large composite stock sheet of finite dimensions in such a way that the resulting scrap is minimized. In this paper, we introduce a novel hybrid optimization algorithm called MGA for MPI (Message Passing Interface) environments. The proposed MGA combines the rapid convergence of Mean Field Annealing with effective genetic operations. This paper also proposes efficient data structures for pattern-related information.

Realtime Air Diffusion Prediction System

  • Kim Youngtae;Kim Tae Kook;Oh Jai-Ho
    • 한국전산유체공학회:학술대회논문집 / 2003.10a / pp.88-90 / 2003
  • We implement the Realtime Air Diffusion Prediction System, which is designed for air diffusion simulations with four-dimensional data assimilation. For realtime running, we parallelize the system using MPI (Message Passing Interface) on distributed-memory parallel computers and build a cluster computer that links high-performance PCs with high-speed interconnection networks. We use 162-CPU nodes and a Myrinet network for the cluster.

Implementation of MPI-based WiMAX Base Station for SDR System (SDR 시스템을 위한 MPI 기반 WiMAX 기지국의 구현)

  • Ahn, Chi Young;Kim, Hyo Han;Choi, Seung Won
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.4 / pp.59-67 / 2013
  • Compared to conventional hardware-oriented base stations, a Software Defined Radio (SDR)-based base station provides various advantages, especially in flexibility and expandability. It enables the multimode capability required in 4th-generation (4G) environments, which aim at a convergence network of various kinds of communication standards. However, since a single base station processes all the data required by multiple waveforms, the SDR base station faces a problem of data processing speed. In this paper, we propose a new concept of SDR base station system that adopts parallel processing technology in a cluster environment. We implemented a WiMAX system based on the SDR concept that adopts Message Passing Interface (MPI) technology to speed up operations. In order to maximize the efficiency of parallel processing in signal processing, we analyze how the algorithm in each module is related to the data to be processed. With the implemented system, we show a drastic improvement in operation time due to parallel processing using the proposed MPI technology. In addition, we demonstrate the feasibility of the SDR system for 4G and even beyond-4G.