• Title/Summary/Keyword: Domain decomposition and MPI

Parallel Stratified and Rotating Turbulence Simulation based on MPI (MPI 기반의 병렬 성층·회전 난류 시뮬레이션)

  • Kim, Byung-Uck;Yang, Sung-Bong
    • The Transactions of the Korea Information Processing Society / v.7 no.1 / pp.57-64 / 2000
  • We describe an MPI-based parallel implementation of the large-eddy simulation (LES) of stratified and rotating turbulence. The parallelization strategy eliminates the tridiagonal solver by using an explicit method and applies domain decomposition to solve the Poisson equation. The simulation was run on a CRAY-T3E under the MPI message-passing platform with various domain decompositions, and the scalability of the parallel LES code is also presented. The results show a speedup of up to 16 times on 64 processors with xyz-directional domain decomposition, and the code scales up to a $128{\times}128{\times}$ grid, whose processing time is almost the same as that of a $40{\times}40{\times}40$ grid on a single-processor machine running sequential code.

The Mixed Finite Element Analysis for Saturated Porous Media using FETI Method

  • Lee, Kyung-Jae;Tak, Moon-Ho;Park, Tae-Hyo
    • Journal of the Computational Structural Engineering Institute of Korea / v.23 no.6 / pp.693-702 / 2010
  • In this paper, the FETI (Finite Element Tearing and Interconnecting) method is introduced to improve the numerical efficiency of the staggered method. The porous media theory, the staggered method, and the FETI method are briefly reviewed. In addition, we describe the MPI (Message Passing Interface) library used for parallel analysis and the proposed combination of the staggered method with the FETI method. Finally, Lagrange multipliers and the CG (Conjugate Gradient) algorithm are proposed for solving the decomposed domain, and the proposed method is verified to be numerically efficient using the MPI library.
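
The abstract above names the conjugate gradient (CG) algorithm as the iterative solver for the decomposed problem. As a rough illustration only, here is a minimal serial CG sketch for a small symmetric positive-definite system. It is not the paper's FETI interface solver: the function names `dot` and `conjugate_gradient`, the matrix-free `matvec` argument, and the 2x2 test matrix are all assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A (hypothetical helper).

    matvec(p) must return the product A p as a list.
    """
    x = [0.0] * len(b)          # initial guess x0 = 0
    r = list(b)                 # residual r0 = b - A x0 = b
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # stop once the residual norm is small
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Illustrative 2x2 SPD system (not taken from the paper).
A = [[4.0, 1.0], [1.0, 3.0]]
solution = conjugate_gradient(
    lambda p: [dot(row, p) for row in A], [1.0, 2.0]
)
print(solution)
```

Passing the operator as a function rather than a matrix mirrors how interface problems are usually handled in domain decomposition, where the product is assembled from per-subdomain contributions instead of an explicit global matrix.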

Construction of a CPU Cluster and Implementation of a 3-D Domain Decomposition Parallel FDTD Algorithm (CPU 클러스터 구축 및 3차원 공간분할 병렬 FDTD 알고리즘 구현)

  • Park, Sungmin;Chu, Kwang-Uk;Ju, Saehoon;Park, Yoon-Mi;Kim, Ki-Baek;Jung, Kyung-Young
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.25 no.3 / pp.357-364 / 2014
  • In this work, we construct a CPU cluster to implement a parallel finite-difference time-domain (FDTD) algorithm for fast electromagnetic analyses. Compared to a single-processor FDTD, the parallel FDTD algorithm can significantly reduce the computation time and can also analyze electrically larger structures. The parallel FDTD algorithm requires communication between neighboring processors, which is performed with the MPI (Message Passing Interface) library, and a 3-D domain decomposition is employed to decrease the communication time between neighboring processors. We investigate the speedup factor of the CPU-cluster-based parallel FDTD algorithm relative to a single-processor FDTD for the normal mode and the hypermode, and finally analyze an electrically large concrete structure with the developed parallel algorithm.
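
To make the idea of a 3-D domain decomposition concrete, the sketch below splits a structured grid into blocks, one per process, and returns the index range each process would own. It is an illustration under stated assumptions, not the authors' implementation: the helper names `split_axis` and `decompose_3d`, the 100x100x100 grid, and the 2x2x2 process grid are all hypothetical, and no MPI calls are made.

```python
from itertools import product


def split_axis(n_cells, n_parts):
    """Split n_cells into n_parts contiguous (start, stop) ranges, as evenly as possible."""
    base, extra = divmod(n_cells, n_parts)
    ranges = []
    start = 0
    for part in range(n_parts):
        size = base + (1 if part < extra else 0)  # first `extra` parts get one more cell
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose_3d(grid_shape, proc_grid):
    """Map each process coordinate (px, py, pz) to its ((x0, x1), (y0, y1), (z0, z1)) block."""
    axes = [split_axis(n, p) for n, p in zip(grid_shape, proc_grid)]
    return {
        coords: tuple(axes[dim][c] for dim, c in enumerate(coords))
        for coords in product(*(range(p) for p in proc_grid))
    }


# Hypothetical sizes: a 100^3 grid on a 2 x 2 x 2 process grid.
blocks = decompose_3d((100, 100, 100), (2, 2, 2))
for coords, block in sorted(blocks.items()):
    print(coords, block)
```

In an MPI code, each rank would look up its own block from its Cartesian coordinates and exchange only the faces it shares with neighboring blocks.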

Computation of dilute polymer solution flows using BCF-RBFN based method and domain decomposition technique

  • Tran, Canh-Dung;Phillips, David G.;Tran-Cong, Thanh
    • Korea-Australia Rheology Journal / v.21 no.1 / pp.1-12 / 2009
  • This paper reports the suitability of a domain decomposition technique for the hybrid simulation of dilute polymer solution flows using Eulerian Brownian dynamics and Radial Basis Function Network (RBFN) based methods. The Brownian Configuration Fields (BCF) and RBFN method combines the features of the BCF scheme (which renders both closed-form constitutive equations and a particle tracking process unnecessary) with those of a mesh-less method (which eliminates element-based discretisation of domains). However, large-scale problems raise two difficulties: the high computational time associated with the Stochastic Simulation Technique (SST), and the ill-conditioning of the system matrix associated with the RBFN. One way to overcome these disadvantages is to use parallel domain decomposition (DD) techniques. This approach makes the BCF-RBFN method more suitable for large-scale problems.

A Study on Effect of Domain-Decomposition Method on Parallel Efficiency in 2-D Flow Computations (2차원 유동장 해석에서 영역분할법에 따른 병렬효율성 검토)

  • Lee Sangyeul;Hur Nahmkeon
    • Korean Society of Computational Fluids Engineering: Conference Proceedings / 1998.11a / pp.147-152 / 1998
  • 2-D flow fields are studied on a shared-memory parallel computer using a parallel flow analysis program that employs a domain decomposition method and the MPI library for data exchange at overlapped interfaces. In particular, the effects of directional domain decomposition on parallel efficiency are studied for 2-D lid-driven cavity flow and flow through a square cavity. The study shows that, in 1-D partitioning, domain decomposition along the main flow direction gives better parallel efficiency than decomposition along the other direction. 2-D partitioning, however, is less sensitive to flow direction and gives good parallel efficiency for most of the cases considered.
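
The difference between 1-D and 2-D partitioning can be illustrated with a simple count of interface cells. The sketch below is a geometric illustration only and is separate from the flow-direction effect reported in the abstract. It assumes a square n x n grid, a one-cell-wide overlap, an interior subdomain with neighbors on every side, and (for the 2-D case) a square number of processes; the function names and the sizes 256 and 16 are hypothetical.

```python
import math


def halo_cells_strips(n):
    """Cells an interior strip exchanges per step in 1-D partitioning.

    Each strip spans the full grid width, so it shares two edges of length n.
    """
    return 2 * n


def halo_cells_blocks(n, procs):
    """Cells an interior block exchanges per step in 2-D partitioning.

    Assumes procs is a perfect square and divides the grid evenly.
    """
    side = math.isqrt(procs)
    if side * side != procs or n % side != 0:
        raise ValueError("this sketch needs a square process count that divides n")
    return 4 * (n // side)  # four edges, each n / sqrt(procs) cells long


n, procs = 256, 16  # hypothetical grid size and process count
print(halo_cells_strips(n))         # 512
print(halo_cells_blocks(n, procs))  # 256
```

Under these assumptions the strip interface size stays fixed as processes are added, while the block interface shrinks with the square root of the process count.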

Implementation and Performance Analysis of a Parallel SIMPLER Model Based on Domain Decomposition (영역 분할에 의한 SIMPLER 모델의 병렬화와 성능 분석)

  • Kwak Ho Sang;Lee Sangsan
    • Journal of computational fluids engineering / v.3 no.1 / pp.22-29 / 1998
  • A SIMPLER finite volume model is parallelized. The parallelism is based on domain decomposition and explicit message passing using MPI and SHMEM. Two parallel solvers for the tridiagonal matrix equation are employed. The implementation is verified on the Cray T3E system for a benchmark problem of natural convection in a sidewall-heated cavity. The test results illustrate good scalability of the parallel models. Performance issues are discussed in terms of convergence as well as conventional parallel overheads and single-processor performance. The effectiveness of a localized matrix solution algorithm is demonstrated.
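
For readers unfamiliar with the tridiagonal systems mentioned above, the following sketch shows the standard serial Thomas algorithm, which is the usual sequential baseline that parallel tridiagonal solvers are compared against. It is not either of the two parallel solvers used in the paper, whose details the abstract does not give; the function name `solve_tridiagonal`, the argument layout, and the 3x3 test system are assumptions for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (serial sketch).

    lower[i] multiplies x[i-1] (lower[0] is unused),
    diag[i] multiplies x[i],
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    Assumes no zero pivots arise, e.g. a diagonally dominant matrix.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Illustrative system with rows [2, -1, 0], [-1, 2, -1], [0, -1, 2].
x = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 4.0],
)
print(x)
```

The forward sweep and back substitution are each inherently sequential along the line, which is why a decomposition that cuts across the solve direction needs a dedicated parallel or localized solver.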

Three-dimensional Numerical Analysis of Detonation Wave Structures in a Square Tube (정사각관 내 데토네이션 파 구조의 삼차원 수치 해석)

  • Cho, Deok-Rae;Won, Su-Hee;Shin, Jae-Ryul;Lee, Soo-Han;Choi, Jeong-Yeol
    • Journal of the Korean Society of Propulsion Engineers / v.11 no.1 / pp.1-10 / 2007
  • Three-dimensional structures of detonation waves propagating in a square tube were investigated using a high-resolution CFD code coupled with a conservation equation for the reaction progress variable and a one-step irreversible reaction. The code was parallelized using a domain decomposition technique with the MPI library. The computations were carried out on an in-house Windows cluster with AMD processors. The three-dimensional unsteady analysis yields smoked-foil records caused by the instabilities of the detonation waves. These records show rectangular and diagonal modes of detonation instability, depending on the initial disturbance conditions, and spinning detonation in the case of a small reaction constant.

Comparison of Message Passing Interface and Hybrid Programming Models to Solve Pressure Equation in Distributed Memory System (분산 메모리 시스템에서 압력방정식의 해법을 위한 MPI와 Hybrid 병렬 기법의 비교)

  • Jeon, Byoung Jin;Choi, Hyoung Gwon
    • Transactions of the Korean Society of Mechanical Engineers B / v.39 no.2 / pp.191-197 / 2015
  • The message passing interface (MPI) and hybrid programming models for the parallel computation of a pressure equation were compared on a distributed memory system. Both models were based on domain decomposition, and two sub-domain counts were selected by considering the efficiency of the hybrid model. The parallel performance for various problem sizes was measured using up to 96 threads. It was found that, in addition to the cache-memory size, the overhead of the MPI communication/OpenMP directives affected the parallel performance. For small problems, the parallel performance was low because the share of the MPI communication/OpenMP directive overhead increased with the number of threads, and MPI was better than the hybrid model because it had a smaller communication overhead. For large problems, the parallel performance was high because, in addition to the cache effect, the share of the communication overhead was lower than for small problems, and the hybrid model was better than MPI because the communication overhead of MPI was more dominant than that of the OpenMP directives in the hybrid model.

Optimization of Parallel Code for Noise Prediction in an Axial Fan Using MPI One-Sided Communication (MPI 일방향통신을 이용한 축류 팬 주위 소음해석 병렬프로그램 최적화)

  • Kwon, Oh-Kyoung;Park, Keuntae;Choi, Haecheon
    • KIPS Transactions on Computer and Communication Systems / v.7 no.3 / pp.67-72 / 2018
  • Recently, noise reduction in axial fans, a type of turbomachine that produces a small pressure rise and a large flow rate, has been recognized as essential. This study describes the design and optimization of an MPI parallel program that simulates flow-induced noise in an axial fan. To run the simulation with 100 million grid points for the flow and 70,000 points for the noise sources, we parallelize the code using 2D domain decomposition. However, when many computing cores are involved, the code slows down because of MPI communication overhead among nodes, especially in the noise simulation. We therefore adopt one-sided communication to reduce the MPI communication overhead. Moreover, the memory allocation and the communication between cores are optimized, yielding a 2.97x improvement over the original code. Finally, the flow simulation runs 12x faster on 6,144 computing cores of KISTI Tachyon2 than on 256 cores, and the noise simulation runs 6x faster on 128 cores than on 16 cores.
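
The scaling figures quoted in this abstract can be turned into relative parallel efficiencies with a short calculation. The sketch below uses only the core counts and speedups stated above; the function name `relative_efficiency` and the definition (measured speedup divided by the increase in core count) are a common convention assumed here, not something the abstract itself reports.

```python
def relative_efficiency(speedup, base_cores, scaled_cores):
    """Measured speedup divided by the ideal speedup (the ratio of core counts)."""
    return speedup / (scaled_cores / base_cores)


# Flow simulation: 12x faster going from 256 to 6,144 cores (24x more cores).
flow = relative_efficiency(12, 256, 6144)

# Noise simulation: 6x faster going from 16 to 128 cores (8x more cores).
noise = relative_efficiency(6, 16, 128)

print(f"flow:  {flow:.0%}")
print(f"noise: {noise:.0%}")
```

Under this definition the flow simulation retains about half of the ideal scaling and the noise simulation about three quarters.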