• Title/Abstract/Keywords: Parallel PCG Method

10 search results (processing time: 0.022 s)

PARALLEL PERFORMANCE OF THE Gℓ-PCG METHOD FOR IMAGE DEBLURRING PROBLEMS

  • YUN, JAE HEON
    • Journal of applied mathematics & informatics
    • /
    • Vol. 36, No. 3_4
    • /
    • pp.317-330
    • /
    • 2018
  • We first show how to apply the global preconditioned conjugate gradient (G$\ell$-PCG) method with Kronecker product preconditioners to image deblurring problems with nearly separable point spread functions. We then present a coarse-grained parallel image deblurring algorithm based on G$\ell$-PCG. Lastly, we report numerical experiments on image deblurring problems that evaluate the effectiveness of G$\ell$-PCG with a Kronecker product preconditioner by comparing its performance with that of the G$\ell$-CG, CGLS, and preconditioned CGLS (PCGLS) methods.
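
A Kronecker product preconditioner is attractive because it never has to be formed explicitly. The sketch below is a minimal NumPy illustration of that identity, not the paper's algorithm: it assumes a separable operator A ⊗ B acting on a column-major vectorized image, and the function names `kron_apply` and `kron_solve` are hypothetical.

```python
import numpy as np

def kron_apply(A, B, X):
    """Return Y such that vec(Y) = (A kron B) vec(X), with column-major vec."""
    return B @ X @ A.T

def kron_solve(A, B, R):
    """Return Z such that (A kron B) vec(Z) = vec(R), without forming A kron B."""
    # B Z A^T = R  implies  Z = B^{-1} R A^{-T}
    W = np.linalg.solve(B, R)            # W = B^{-1} R
    return np.linalg.solve(A, W.T).T     # Z = W A^{-T}

# Small check against the explicitly assembled Kronecker product.
rng = np.random.default_rng(0)
A = rng.random((3, 3)) + 3.0 * np.eye(3)   # 3x3, diagonally dominant
B = rng.random((4, 4)) + 4.0 * np.eye(4)   # 4x4, diagonally dominant
X = rng.random((4, 3))                     # "image" with 4 rows, 3 columns

Y = kron_apply(A, B, X)
explicit = np.kron(A, B) @ X.flatten(order="F")
recovered = kron_solve(A, B, Y)
```

For an m-by-n image, the two small solves work with an m-by-m and an n-by-n matrix instead of an mn-by-mn one, which is what makes this kind of preconditioner cheap to apply inside each iteration.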

PCG Algorithms for Development of PC level Parallel Structural Analysis Method

  • 박효선;박성무;권윤한
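    • Computational Structural Engineering Institute of Korea: Conference Proceedings
    • /
    • Proceedings of the 1998 Fall Conference of the Computational Structural Engineering Institute of Korea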
    • /
    • pp.362-369
    • /
    • 1998
  • The computational environment in which engineers perform their designs has rapidly evolved from coarse serial machines to massively parallel machines. Although high-performance computers have been available for a number of years, only a limited number of successful applications of the new computational environments have been reported in computational structural engineering, owing to the limited availability and high cost of high-performance computing. As a new computational model for high-performance engineering computing that avoids these cost and availability problems, this paper presents parallel structural analysis models for large-scale structures on a network of personal computers (PCs). In structural analysis, the solution routine for the linear system of equations is the most time-consuming part. Thus, the focus is on the development of efficient preconditioned conjugate gradient (PCG) solvers for the proposed computational model. Two parallel PCG solvers, PPCG-I and PPCG-II, are developed and applied to the analysis of large-scale space truss structures.

A VARIANT OF BLOCK INCOMPLETE FACTORIZATION PRECONDITIONERS FOR A SYMMETRIC H-MATRIX

  • Yun, Jae-Heon;Kim, Sang-Wook
    • Journal of applied mathematics & informatics
    • /
    • Vol. 8, No. 3
    • /
    • pp.705-720
    • /
    • 2001
  • We propose a variant of parallel block incomplete factorization preconditioners for a symmetric block-tridiagonal H-matrix. Theoretical properties of these block preconditioners are compared with those of block incomplete factorization preconditioners for the corresponding comparison matrix. Numerical results of the preconditioned CG (PCG) method using these block preconditioners are compared with those of PCG using other types of block incomplete factorization preconditioners. Lastly, parallel computations of the block incomplete factorization preconditioners are carried out on the Cray C90.

Parallel Processing of 3D Rigid-Plastic FEM on a Cluster System

  • 최영;서용위
    • Journal of the Korean Society for Precision Engineering
    • /
    • Vol. 22, No. 1
    • /
    • pp.122-129
    • /
    • 2005
  • A parallel rigid-plastic FEM code has been developed for a cluster system. The cluster system, Simforge, has 15 processors and a total memory of 4.5 GB. In the developed parallel code, the distributed data of the column-wise partitioned stiffness matrix are stored in compressed row storage format, and a diagonally preconditioned conjugate gradient solver is applied. The analysis of block upsetting is performed with the parallel code on the Simforge cluster system. The analysis results are compared and discussed in this paper.
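
The combination of compressed row storage and a diagonal (Jacobi) preconditioner can be sketched in a few lines. The following is a serial, pure-Python illustration under stated assumptions: it is not the distributed Simforge code, the function names are hypothetical, and the matrix is assumed symmetric positive definite with a nonzero diagonal.

```python
def csr_matvec(data, indices, indptr, x):
    """Compute y = A x for A stored in compressed row storage."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y[i] = s
    return y

def csr_diagonal(data, indices, indptr):
    """Extract the main diagonal of a CSR matrix."""
    n = len(indptr) - 1
    diag = [0.0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            if indices[k] == i:
                diag[i] = data[k]
    return diag

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobi_pcg(data, indices, indptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with CG preconditioned by the inverse diagonal of A."""
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(data, indices, indptr)]
    x = [0.0] * n
    r = list(b)                                   # residual for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x

# A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]] in CSR form; b = A @ [1, 2, 3].
data = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
indices = [0, 1, 0, 1, 2, 1, 2]
indptr = [0, 2, 5, 7]
b = [6.0, 10.0, 8.0]
x = jacobi_pcg(data, indices, indptr, b)
```

In a distributed setting, the matrix-vector product and the dot products are the steps that require communication between processors; the diagonal preconditioner is applied locally, which is one reason it is a common first choice on clusters.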

Distributed Structural Analysis Algorithms for Large-Scale Structures based on PCG Algorithms

  • 권윤한;박효선
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • Vol. 12, No. 3
    • /
    • pp.385-396
    • /
    • 1999
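  • The scale of problems addressed in engineering has recently been growing, and the structural design of such large-scale structures requires many structural analyses for member strength design and nodal displacement control. Structural analysis of a large-scale structure on a single personal computer demands large memory and long computation times, so it is difficult to use efficiently in the design of large-scale structures that require repeated analyses. As an alternative, this paper builds a high-performance parallel computing system by connecting multiple personal computers over a network and develops two types of distributed structural equation solvers suited to this system, based on the iterative PCG algorithm. The distributed structural analysis methods for large-scale structures were developed to minimize the number and volume of communications among the computers during the analysis. Their performance was evaluated by applying them to the structural analysis of a large-scale three-dimensional truss structure and a 144-story braced tube structure.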

A Parallel Algorithm for Large DOF Structural Analysis Problems

  • 김민석;이지호
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • Vol. 23, No. 5
    • /
    • pp.475-482
    • /
    • 2010
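  • This paper proposes a two-level parallel algorithm based on the domain decomposition method (DDM) for the parallel processing of systems with a large number of degrees of freedom (DOFs). The interior and exterior boundaries of the decomposed subdomains are defined as the upper-level problem, and each local problem is formulated as a Dirichlet problem on a subdomain in which all displacement boundary conditions are prescribed. In the upper-level domain, displacements are obtained by an iterative method that does not require assembling the stiffness matrix of the entire upper-level domain; based on these, displacements in the local domains are computed using a Multi-Frontal Sparse Solver (MFSS). An algorithm based on the parallel PCG (Preconditioned Conjugate Gradient) method was developed that maintains computational efficiency by minimizing data exchange between processors in the upper-level computation while increasing the number of DOFs that can be analyzed. Numerical analyses with the proposed algorithm showed linear scalability, in which the number of analyzable DOFs increases in proportion to the number of processors without loss of computational performance, confirming that the algorithm can be used effectively for large-DOF problems.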

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete
    • /
    • Vol. 18, No. 5
    • /
    • pp.1019-1039
    • /
    • 2016
  • In this paper, a parallel algorithm for the nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on a graphics processing unit (GPU) platform is proposed. Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradients (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program is developed using Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single precision and double precision, was verified by comparison with ABAQUS. The numerical results demonstrated that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speed up the computation compared with the CPU.

Parallel Topology Optimization on Distributed Memory System

  • 이기명;조선호
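    • Computational Structural Engineering Institute of Korea: Conference Proceedings
    • /
    • Proceedings of the 2006 Annual Conference of the Computational Structural Engineering Institute of Korea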
    • /
    • pp.291-298
    • /
    • 2006
  • A parallelized topology design optimization method is developed on a distributed memory system. The parallelization is based on a domain decomposition method and a boundary communication scheme. For the finite element analysis of structural responses and design sensitivities, the PCG method based on a Krylov iterative scheme is employed. In addition, a parallelized optimality criteria method is used to solve large-scale topology optimization problems. Through several numerical examples, the developed method is shown to yield efficient and acceptable topology optimization results for large-scale problems.

Parallelized Topology Design Optimization of the Frame of Human Powered Vessel

  • 김현석;이기명;김민근;조선호
    • Journal of the Society of Naval Architects of Korea
    • /
    • Vol. 47, No. 1
    • /
    • pp.58-66
    • /
    • 2010
  • Topology design optimization is a method for determining the optimal distribution of material that yields the minimal compliance of a structure while satisfying a constraint on the allowable material volume. The method is easy to implement and widely used, which makes it a powerful design tool in various disciplines. In this paper, a large-scale topology design optimization method is developed using the efficient adjoint sensitivity and optimality criteria methods. A parallel computing technique is required for efficient topology optimization as well as for the precise analysis of large-scale problems. The parallelized finite element analysis consists of domain decomposition and boundary communication. The preconditioned conjugate gradient method is employed for the analysis of the decomposed sub-domains. The developed parallel computing method for topology optimization is used to determine the optimal structural layout of a human-powered vessel.

Assessment of computational performance for a vector parallel implementation: 3D probabilistic model discrete cracking in concrete

  • Paz, Carmen N.M.;Alves, Jose L.D.;Ebecken, Nelson F.F.
    • Computers and Concrete
    • /
    • Vol. 2, No. 5
    • /
    • pp.345-366
    • /
    • 2005
  • This work presents an assessment of the computational performance of a vector-parallel implementation of a probabilistic model for concrete cracking in 3D. The paper continues the code optimization efforts reported in earlier works (Paz et al. 2002a,b, 2003). The probabilistic crack approach is based on the direct Monte Carlo method. Cracking is accounted for by means of 3D interface elements. This approach assumes that all nonlinearities are restricted to the interface elements that model cracks. The heterogeneity governs the overall cracking behavior and the related size effects on concrete fracture. The computational kernels in the implementation are the inexact Newton iterative driver, which solves the nonlinear problem, and a preconditioned conjugate gradient (PCG) driver, which solves the linearized equations using an element-by-element (EBE) strategy to compute matrix-vector products. In particular, the paper analyzes code behavior using OpenMP directives on parallel vector processors (PVP) such as the CRAY SV1 and CRAY T94. The impact of the memory architecture on code performance, and some strategies devised to circumvent this issue, are addressed through numerical experiments.
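
The element-by-element (EBE) strategy mentioned in this abstract computes the matrix-vector product needed by PCG without ever assembling the global stiffness matrix. The sketch below is a minimal serial illustration under assumed data, not the authors' vectorized implementation: the function name `ebe_matvec` and the chain of unit-stiffness bar elements are hypothetical.

```python
def ebe_matvec(element_matrices, connectivity, x):
    """Compute y = K x as a sum of element contributions, with K never assembled."""
    y = [0.0] * len(x)
    for Ke, dofs in zip(element_matrices, connectivity):
        xe = [x[d] for d in dofs]                      # gather element DOFs
        for a, global_dof in enumerate(dofs):
            # local product followed by scatter-add into the global vector
            y[global_dof] += sum(Ke[a][c] * xe[c] for c in range(len(dofs)))
    return y

# Three unit-stiffness bar elements connecting four nodes in a chain.
bar = [[1.0, -1.0], [-1.0, 1.0]]
element_matrices = [bar, bar, bar]
connectivity = [(0, 1), (1, 2), (2, 3)]

x = [0.0, 1.0, 2.0, 3.0]        # linear displacement field
y = ebe_matvec(element_matrices, connectivity, x)
```

For the linear field above, the interior entries of `y` cancel and only the two end nodes carry a nonzero force, as expected for a uniformly stretched chain. The loop over elements is the natural unit of parallel work, provided the scatter-add into shared entries is handled safely (for example by element coloring), which is an implementation choice not described in the abstract.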