• Title/Summary/Keyword: message passing interface (MPI)

The Mixed Finite Element Analysis for Porous Media using Domain Decomposition Method (영역 분할기법을 이용한 포화 다공질매체의 혼합유한요소해석)

  • Lee, Kyung-Jae;Tak, Moon-Ho;Kang, Yoon-Sik;Park, Tae-Hyo
    • Journal of the Computational Structural Engineering Institute of Korea / v.23 no.4 / pp.369-378 / 2010
  • Mixed finite element analysis is the most widely used method for saturated porous media. Both direct and iterative methods have been proposed for obtaining the unknown variables; the iterative method is recommended because it provides numerical stability and accuracy when the material properties of the solid and the fluid differ. In this paper, we introduce a staggered method, which has strong numerical stability, and apply FETI (Finite Element Tearing and Interconnecting), a domain decomposition method, to it to gain numerical efficiency. Lagrange multipliers and the conjugate gradient method are used to solve the decomposed domains, and the numerical efficiency of the proposed method is verified using the point-to-point MPI (Message Passing Interface) library.

Development of Real time Air Quality Prediction System

  • Oh, Jai-Ho;Kim, Tae-Kook;Park, Hung-Mok;Kim, Young-Tae
    • Proceedings of the Korean Environmental Sciences Society Conference / 2003.11a / pp.73-78 / 2003
  • In this research, we implement a Realtime Air Diffusion Prediction System, a parallel Fortran model running on distributed-memory parallel computers. The system is designed for air diffusion simulations with four-dimensional data assimilation. For regional air quality forecasting, a series of dynamic downscaling steps is adopted using the NCAR/Penn State MM5 atmospheric model. The realtime initial data are provided daily from the global spectral model output of the KMA (Korea Meteorological Administration). Producing a 24-hour air quality forecast with this four-step dynamic downscaling (27 km, 9 km, 3 km, and 1 km) requires huge computational resources. Parallel implementation of the realtime system is imperative for increased throughput, since the realtime system must run with correct timing behavior and the sequential code requires a large amount of CPU time for typical simulations. The parallel system uses MPI (Message Passing Interface), a standard library that provides high-level routines for message passing. We validate the parallel model by comparing it with the sequential model. For realtime running, we build a cluster computer, a distributed-memory parallel computer that links high-performance PCs with a high-speed interconnection network. We use 32 2-CPU nodes and a Myrinet network for the cluster. Since cluster computers are more cost effective than conventional distributed parallel computers, we can build a dedicated realtime computer. The system also includes a web-based GUI (Graphical User Interface) for convenient system management and performance monitoring, so that end users can easily restart the system when it faults. Performance of the parallel model is analyzed by comparing its execution time with that of the sequential model and by calculating communication overhead and load imbalance, which are common problems in parallel processing. Performance analysis is carried out on our cluster of 32 2-CPU nodes.

Distributed Mean Field Genetic Algorithm for Channel Routing (채널배선 문제에 대한 분산 평균장 유전자 알고리즘)

  • Hong, Chul-Eui
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.2 / pp.287-295 / 2010
  • In this paper, we introduce a novel optimization algorithm, a distributed Mean field Genetic Algorithm (MGA) implemented in MPI (Message Passing Interface) environments. Distributed MGA is a hybrid of Mean Field Annealing (MFA) and a Simulated annealing-like Genetic Algorithm (SGA). It combines the rapid convergence of MFA with the effective genetic operations of SGA. The proposed distributed MGA is applied to the channel routing problem, which is an important issue in the automatic layout design of VLSI circuits. Our experimental results show that combining the heuristic methods improves performance over GA alone in terms of mean execution time. The results also show that the proposed distributed algorithm maintains the convergence properties of the sequential algorithm while achieving almost linear speedup as the problem size increases.

Acceleration of Anisotropic Elastic Reverse-time Migration with GPUs (GPU를 이용한 이방성 탄성 거꿀 참반사 보정의 계산가속)

  • Choi, Hyungwook;Seol, Soon Jee;Byun, Joongmoo
    • Geophysics and Geophysical Exploration / v.18 no.2 / pp.74-84 / 2015
  • To yield physically meaningful images through elastic reverse-time migration, wavefield separation, which extracts P- and S-waves from vector wavefields reconstructed with the elastic wave equation, is a prerequisite. To extend elastic reverse-time migration to anisotropic media, both an anisotropic modelling algorithm and anisotropic wavefield separation are essential. Anisotropic wavefield separation, which uses pseudo-derivative filters determined by the vertical velocities and anisotropic parameters of the elastic media, differs from the Helmholtz decomposition conventionally used for isotropic wavefield separation. Since applying these pseudo-derivative filters is computationally expensive, we have developed an efficient anisotropic wavefield separation algorithm that runs in parallel on GPUs (Graphics Processing Units). In addition, we have developed a highly efficient anisotropic elastic reverse-time migration algorithm that uses MPI (Message Passing Interface) and incorporates the GPU-based anisotropic wavefield separation algorithm. To verify the efficiency and validity of the developed migration algorithm, a VTI elastic model based on Marmousi-II was built, and a synthetic multicomponent seismic data set was created using this model. The computational speed of migration was dramatically enhanced by using GPUs and MPI, and the accuracy of the image was also improved by adopting anisotropic wavefield separation.

Parallel Flood Inundation Analysis using MPI Technique (MPI 기법을 이용한 병렬 홍수침수해석)

  • Park, Jae Hong
    • Journal of Korea Water Resources Association / v.47 no.11 / pp.1051-1060 / 2014
  • This study aims to improve computational performance by combining MPI (Message Passing Interface), a standard model for parallel programming in distributed memory environments, with the DHM (Diffusion Hydrodynamic Model), an inundation analysis model. The parallelized inundation model is compared with the existing calculation method on complicated problems that require long computing times. In addition, the study seeks to demonstrate the model's capability to estimate the extent and depth of inundation caused by flooding in protected lowlands and to speed up computation, and to validate the applicability of the parallel model to actual flood analysis by simulating various inundation scenarios. To verify the model, it was applied to a hypothetical two-dimensional protected land and to a real flooding case. The results show that, at the same accuracy, computation time improves by up to about 41% to 48% when using multiple cores instead of a single core. The parallel flood analysis model can be used to calculate flood water depth, flooded area, propagation speed of flood waves, etc. with a shorter runtime on multiple cores, and is expected to be used for prompt real-time flood forecasting and for drawing flood risk maps.

Numerical Prediction of Incompressible Flows Using a Multi-Block Finite Volume Method on a Parallel Computer (병렬 컴퓨터에서 다중블록 유한체적법을 이용한 비압축성 유동해석)

  • Kang, Dong-Jin;Sohn, Jeong-Lak
    • The KSFM Journal of Fluid Machinery / v.1 no.1 s.1 / pp.72-80 / 1998
  • Computational analysis of incompressible flows is conducted on a parallel computing system by numerically solving the Navier-Stokes equations with a multi-block finite volume method. The numerical algorithms adopted in this study include (1) the QUICK upwinding scheme for convective terms, (2) central differencing for other terms, and (3) second-order Euler differencing for the time-marching procedure. Structured grids are used on body-fitted coordinates with a multi-block concept that uses overlaid grids on the block-interfacing boundaries. The computational code is parallelized in the MPI environment. The numerical accuracy of the method is verified by solving a benchmark test case of the flow inside a two-dimensional rectangular cavity. Computation of an axial compressor cascade is conducted using 4 PEs and, as a result, no numerical instabilities are observed; the present computational method is expected to be applicable to turbomachinery flow problems without major difficulties.

Hybrid Parallelization for High Performance of CFD_NIMR Model (기상 모델 CFD_NIMR의 최적 성능을 위한 혼합형 병렬 프로그램 구현)

  • Kim, Min-Wook;Choi, Young-Jean;Kim, Young-Tae
    • Atmosphere / v.22 no.1 / pp.109-115 / 2012
  • We parallelized the CFD_NIMR model, a numerical meteorological model, for best performance on both distributed and shared memory parallel computers. This hybrid parallelization uses MPI (Message Passing Interface) to divide the 3-dimensional computing domain into horizontal 2-dimensional sub-domains for distributed memory systems, and OpenMP (Open Multi-Processing) to divide the vertical dimension into 1-dimensional sub-domains to take advantage of the shared memory structure. We validated the parallel model against the original sequential model, and the parallel CFD_NIMR model shows efficient speedup on the distributed and shared memory system.
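
As a rough illustration of the horizontal decomposition this abstract describes, the following Python sketch splits an nx-by-ny horizontal grid across a px-by-py process grid and returns the index ranges a given rank would own. The function names, the row-major rank ordering, and the near-even split rule are assumptions made for illustration; the abstract does not describe the actual CFD_NIMR implementation, and no MPI library is called here.

```python
def split_range(n, parts, index):
    """Return the half-open range [start, stop) of the index-th of `parts`
    nearly equal chunks of n cells (hypothetical helper)."""
    base, extra = divmod(n, parts)
    # The first `extra` chunks each get one additional cell.
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def horizontal_subdomain(rank, nx, ny, px, py):
    """Map a rank to its (x-range, y-range) on a px-by-py process grid.

    Assumes row-major rank ordering. The vertical dimension is left whole,
    as it would be handled by shared-memory threads in a hybrid scheme.
    """
    if not 0 <= rank < px * py:
        raise ValueError("rank outside process grid")
    ix, iy = divmod(rank, py)
    return split_range(nx, px, ix), split_range(ny, py, iy)


if __name__ == "__main__":
    # Show the ownership of a 10 x 6 horizontal grid on a 2 x 2 process grid.
    for rank in range(4):
        print(rank, horizontal_subdomain(rank, nx=10, ny=6, px=2, py=2))
```

In an actual MPI program each process would call such a mapping with its own rank and exchange boundary cells with its neighbours; that communication step is omitted from this sketch.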

Large Eddy Simulation of the Vortex Breakdown in Swirling Flow Using MPI Parallel Technique

  • Sung, Hong-Gye;Yang, Vigor
    • Proceedings of the Korean Society for Computational Fluids Engineering Conference (한국전산유체공학회:학술대회논문집) / 2000.05a / pp.107-112 / 2000
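  • The vortex breakdown mechanism of swirling flow injected into a combustion chamber was studied. A three-dimensional finite volume method and Runge-Kutta time integration were applied, with dynamic large eddy simulation (DLES) as the turbulence model. The message passing interface (MPI) parallel computing technique was applied to improve computation-time efficiency and to use memory effectively. The vortex breakdown behavior in the swirling turbulent flow was captured visually, an important result that supports the experimental evidence for increased turbulent stress, increased turbulence production and dissipation rates, and increased mixing rate due to swirl. The computed mean velocity and Reynolds stresses were also compared with experimental results.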

Numerical discrepancy between serial and MPI parallel computations

  • Lee, Sang Bong
    • International Journal of Naval Architecture and Ocean Engineering / v.8 no.5 / pp.434-441 / 2016
  • Numerical simulations of the 1D Burgers equation and a 2D sloshing problem were carried out to study the numerical discrepancy between serial and parallel computations. The numerical domain was decomposed into 2 and 4 subdomains for parallel computations with the message passing interface. The numerical solution of the Burgers equation showed that the fully explicit boundary conditions used on the subdomains in parallel computation were responsible for the discrepancy in the transient solution between serial and parallel computations. Two-dimensional sloshing problems in a rectangular domain were solved using OpenFOAM. After the initial transient period, the sloshing patterns of water differed significantly between serial and parallel computations even though the same numerical conditions were given. Histograms of the pressure measured at two points near the wall showed that the statistical characteristics of the numerical solution were much less affected by the number of subdomains than the transient solution was.

Performance Optimization of Parallel Algorithms

  • Hudik, Martin;Hodon, Michal
    • Journal of Communications and Networks / v.16 no.4 / pp.436-446 / 2014
  • The high intensity of research and modeling in mathematics, physics, biology and chemistry requires new computing resources. Because of the high computational complexity of such tasks, computing time is long and costly. The most efficient way to increase efficiency is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing, with emphasis on the analysis of parallel systems and the impact of communication delays on their efficiency and on overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation (Gauss elimination method, GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), distributed memory (message passing interface, MPI), and their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.
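
For readers unfamiliar with the Gauss elimination method named in this abstract, here is a minimal sequential sketch in Python. It is not the authors' parallel algorithm: the function name `gauss_solve`, the use of partial pivoting, and the singularity tolerance are assumptions chosen for illustration.

```python
def gauss_solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination.

    Uses partial pivoting (an assumption; the abstract does not say
    which pivoting strategy, if any, the authors used).
    """
    n = len(a)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: reduce the matrix to upper-triangular form.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution: solve for the unknowns from the last row upward.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


if __name__ == "__main__":
    print(gauss_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3]))
```

The row updates inside the forward-elimination loop are independent of one another for a fixed pivot column, which is the part a shared-memory or distributed-memory version would typically divide among threads or processes.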