Search | Korea Science

Finite Element Analysis of Shape Rolling Process using Destributive Parallel Algorithms on Cray T3E (병렬 컴퓨터를 이용한 형상 압연공정 유한요소 해석의 분산병렬처리에 관한 연구)

Gwon, Gi-Chan; Yun, Seong-Gi
- Transactions of the Korean Society of Mechanical Engineers A / v.24 no.5 s.176 / pp.1215-1230 / 2000
Parallel approaches on the Cray T3E, an MPP (Massively Parallel Processors) machine, are presented for the efficient finite element analysis of 3-D shape rolling processes. A domain decomposition method coupled with a parallel linear equation solver is used. Domain decomposition is applied to obtain the element tangent stiffness matrices and residual vectors. Direct and iterative parallel algorithms are used to solve the linear equations. The direct algorithm is a parallel version of a direct banded matrix solver. For the iterative algorithms, the well-known preconditioned conjugate gradient solver with a Jacobi preconditioner is employed. In addition, a new iterative scheme with a block inverse matrix preconditioner, so named by the present authors, is presented, and its results are compared with those obtained using the Jacobi preconditioner. PVM and MPI are used for message passing and synchronization between processors. The performance and efficiency of each algorithm are discussed, and the algorithms are compared with one another.
https://doi.org/10.22634/KSME-A.2000.24.5.1215
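
The abstract above names a preconditioned conjugate gradient solver with a Jacobi preconditioner. As a rough illustration only, the following is a minimal serial sketch in pure Python, assuming a small dense symmetric positive definite matrix stored as a list of rows. The function name `jacobi_pcg` and its parameters are hypothetical; the paper's parallel implementation with PVM/MPI and its block inverse preconditioner are not reproduced here.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of rows).

    Serial illustration only; names and defaults are assumptions.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    # Jacobi preconditioner: M^{-1} is the inverse of the diagonal of A.
    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # stop on small residual norm
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

In a domain-decomposed setting, the matrix-vector product and the dot products are the steps that would need inter-processor communication; the diagonal scaling is purely local, which is one reason the Jacobi preconditioner is a common parallel baseline.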

AN ASYNCHRONOUS PARALLEL SOLVER FOR SOME MATRIX PROBLEMS

Park, Pil-Seong
- Journal of applied mathematics & informatics / v.7 no.3 / pp.1045-1058 / 2000
In conventional synchronous parallel computing, workload balance is crucial for reducing the idle time of processors that finish their jobs earlier than others. However, such balance is difficult to achieve on heterogeneous workstation clusters, where the available computing power of each processor is unpredictable. To overcome this problem, asynchronous methods have emerged and are being increasingly used and studied, but none exists yet for eigenvalue problems. In this paper, we suggest a new asynchronous method for solving some singular matrix problems, which can also be used to find a certain eigenvector of some matrices.

A Fast Poisson Solver of Second-Order Accuracy for Isolated Systems in Three-Dimensional Cartesian and Cylindrical Coordinates

Moon, Sanghyuk; Kim, Woong-Tae; Ostriker, Eve C.
- The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.46.1-46.1 / 2019
We present an accurate and efficient method to calculate the gravitational potential of an isolated system in three-dimensional Cartesian and cylindrical coordinates subject to vacuum (open) boundary conditions. Our method consists of two parts: an interior solver and a boundary solver. The interior solver adopts an eigenfunction expansion method together with a tridiagonal matrix solver to solve the Poisson equation subject to the zero boundary condition. The boundary solver employs James's method to calculate the boundary potential due to the screening charges required to keep the zero boundary condition for the interior solver. A full computation of the gravitational potential requires running the interior solver twice and the boundary solver once. We develop a method to compute the discrete Green's function in cylindrical coordinates, which is an integral part of the James algorithm to maintain second-order accuracy. We implement our method in the Athena++ magnetohydrodynamics code and perform various tests to check that our solver is second-order accurate and exhibits good parallel performance.

Implementation and Performance Analysis of a Parallel SIMPLER Model Based on Domain Decomposition (영역 분할에 의한 SIMPLER 모델의 병렬화와 성능 분석)

Kwak Ho Sang; Lee Sangsan
- Journal of computational fluids engineering / v.3 no.1 / pp.22-29 / 1998
A SIMPLER finite volume model is parallelized. The parallelism is based on domain decomposition and explicit message passing using MPI and SHMEM. Two parallel solvers for the tridiagonal matrix equations are employed. The implementation is verified on the Cray T3E system for a benchmark problem of natural convection in a sidewall-heated cavity. The test results illustrate good scalability of the parallel models. Performance issues are examined in terms of convergence as well as conventional parallel overheads and single-processor performance. The effectiveness of a localized matrix solution algorithm is demonstrated.
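
For orientation, the tridiagonal systems mentioned in this abstract are conventionally solved in serial with the Thomas algorithm. The sketch below is that serial baseline in pure Python, not either of the paper's two parallel solvers, which the abstract does not describe. The function name `thomas_solve` and the array convention (`lower[i]` and `upper[i]` are the sub- and super-diagonal entries of row `i`, with `lower[0]` and `upper[-1]` unused) are assumptions made for this example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i], diag[i], upper[i] are the entries of row i;
    lower[0] and upper[-1] are ignored. No pivoting is done, so the
    matrix is assumed to be diagonally dominant or otherwise safe.
    """
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side

    # Forward elimination
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The forward and backward sweeps are inherently sequential along a line, which is why domain-decomposed codes need a modified or localized tridiagonal solver when a line crosses subdomain boundaries.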

3-D Traveltime and Amplitude Calculation using High-performance Parallel Finite-element Solver (고성능 병렬 유한요소 솔버를 이용한 3차원 주시와 진폭계산)

Yang, Dong-Woo; Kim, Jung-Ho
- Geophysics and Geophysical Exploration / v.7 no.4 / pp.234-244 / 2004
To calculate a 3-dimensional wavefield with the finite-element method in the frequency domain, a very large sparse impedance matrix must be factored. Because this matrix is difficult to handle, 3-dimensional wave equation modeling is conducted mainly in the time domain. In this study, we simulate the 3-D wavefield with the finite-element method in the Laplace domain by combining a high-performance parallel finite-element solver with the SWEET (Suppressed Wave Equation Estimation of Traveltime) algorithm, which can calculate the traveltime and the amplitude. To verify this combination, we applied it to the SEG/EAGE 3D salt model in serial and parallel computing environments.

Parallel Algorithm of Conjugate Gradient Solver using OpenGL Compute Shader

Va, Hongly; Lee, Do-keyong; Hong, Min
- Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.1-9 / 2021
The OpenGL compute shader is a shader stage that operates differently from other shader stages and can be used to process arbitrary data in parallel. This paper proposes a GPU-based parallel algorithm for solving sparse linear systems with the iterative conjugate gradient method, with the calculation performed on the OpenGL compute shader. This sparse linear solver is intended for large linear systems with symmetric positive definite matrices. Four well-known matrix formats (Dense, COO, ELL and CSR) are used for matrix storage. A performance comparison on eight sparse matrices shows that the GPU-based solver is much faster than the CPU-based solver, with best average computing times of 0.64 ms on the GPU and 15.37 ms on the CPU.
https://doi.org/10.9708/jksci.2021.26.01.001
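
Of the four storage formats listed in this abstract, CSR (compressed sparse row) is the one most often paired with conjugate gradient, because the solver's dominant cost is the sparse matrix-vector product. The following is a minimal CPU-side sketch in pure Python of building CSR arrays and multiplying by a vector. The helper names `dense_to_csr` and `csr_matvec` are hypothetical, and nothing here reflects the paper's compute shader code.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_idx, row_ptr), where row i occupies the slice
    row_ptr[i]:row_ptr[i + 1] of values and col_idx.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y
```

Each output row depends only on its own slice of `values` and `col_idx`, so on a GPU one thread or one work group can be assigned per row.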

PERFORMANCE ENHANCEMENT OF PARALLEL MULTIFRONTAL SOLVER ON BLOCK LANCZOS METHOD

Byun, Wan-Il; Kim, Seung-Jo
- Journal of the Korean Society for Industrial and Applied Mathematics / v.13 no.1 / pp.13-20 / 2009
IPSAP, a finite element analysis program, has been developed for high-performance parallel computing. The program consists of various analysis modules, including stress, vibration, and thermal analysis modules. The M-orthogonal block Lanczos algorithm with shift-invert transformation is used to solve eigenvalue problems in the vibration module. The multifrontal algorithm, one of the most efficient direct linear equation solvers, is applied to the factorization and triangular system solving phases within this block Lanczos iteration. In this study, the performance of IPSAP is enhanced in two stages: 1) minimizing the communication volume of the factorization phase by modifying the parallel matrix subroutines, and 2) minimizing idle time in the triangular system solving phase by using a partial inverse of the frontal matrix and the LCM (least common multiple) concept.

A Scalable Parallel Preconditioner on the CRAY-T3E for Large Nonsymmetric Spares Linear Systems (대형비대칭 이산행렬의 CRAY-T3E에서의 해법을 위한 확장가능한 병렬준비행렬)

Ma, Sang-Baek
- The KIPS Transactions:PartA / v.8A no.3 / pp.227-234 / 2001
In this paper we propose a block-type parallel preconditioner for solving large sparse nonsymmetric linear systems, which we expect to be scalable. It is a Multi-Color Block SOR preconditioner combined with a direct sparse matrix solver. For the Laplacian matrix, the SOR method is known to have a nondeteriorating rate of convergence when used with Multi-Color ordering. Since most of the time is spent on the diagonal inversion, which is done on each processor, we expect it to be a good scalable preconditioner. We compared it with four other preconditioners: ILU(0) with wavefront ordering, ILU(0) with Multi-Color ordering, SPAI (SParse Approximate Inverse), and SSOR. Experiments were conducted on finite difference discretizations of two problems with mesh sizes of up to $1025 \times 1024$. A CRAY-T3E with 128 nodes was used, and the MPI library was used for interprocess communication. The results show that Multi-Color Block SOR is scalable and gives the best performance.
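
The simplest instance of multi-color ordering is the two-color (red-black) case. The sketch below applies red-black SOR as a stand-alone solver to the 1-D Laplacian system tridiag(-1, 2, -1) x = b, in pure Python. It is an illustration of the ordering idea only, under assumptions chosen for this example: the function name `red_black_sor`, the relaxation factor, and the 1-D model problem are not from the paper, which uses a block variant as a preconditioner for nonsymmetric systems.

```python
def red_black_sor(b, omega=1.2, sweeps=200):
    """Red-black SOR for tridiag(-1, 2, -1) x = b with zero boundaries.

    Points with even index form one color and odd index the other.
    Within a color, updates do not depend on each other, so each color
    could be updated in parallel.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for color in (0, 1):
            for i in range(color, n, 2):
                left = x[i - 1] if i > 0 else 0.0
                right = x[i + 1] if i < n - 1 else 0.0
                gauss_seidel = (b[i] + left + right) / 2.0
                # Relax between the old value and the Gauss-Seidel update.
                x[i] += omega * (gauss_seidel - x[i])
    return x
```

Because every point of one color has only neighbors of the other color, a sweep splits into two fully parallel half-sweeps, which is the property the abstract relies on for scalability.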

Parallel Processing of 3D Rigid-Plastic FEM on a Cluster System (클러스터 시스템에서 3차원 강소성 유한요소법의 병렬처리)

Choi Young; Seo Yongwie
- Journal of the Korean Society for Precision Engineering / v.22 no.1 / pp.122-129 / 2005
A parallel rigid-plastic FEM code has been developed for a cluster system. The cluster, Simforge, has 15 processors and 4.5 GB of total memory. In the parallel code, the stiffness matrix is partitioned column-wise, the distributed data are stored in compressed row storage format, and a diagonally preconditioned conjugate gradient solver is applied. Block upsetting is analyzed with the parallel code on the Simforge cluster. In this paper, the analysis results are compared and discussed.

CUDA-based Parallel Bi-Conjugate Gradient Matrix Solver for BioFET Simulation (BioFET 시뮬레이션을 위한 CUDA 기반 병렬 Bi-CG 행렬 해법)

Park, Tae-Jung; Woo, Jun-Myung; Kim, Chang-Hun
- Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.90-100 / 2011
We present a parallel bi-conjugate gradient (Bi-CG) matrix solver for large-scale Bio-FET simulations based on recent graphics processing units (GPUs), which enable large-scale parallel processing at very low cost. The proposed method focuses on solving the Poisson equation in parallel, a task that requires massive computational resources not only in semiconductor simulation but also in other fields, including computational fluid dynamics and heat transfer simulation. Our solver is around 30 times faster than traditional methods on single-core CPU systems when solving the Poisson equation in a 3D FDM (Finite Difference Method) scheme. The proposed method is implemented and tested in NVIDIA's CUDA (Compute Unified Device Architecture) environment, which enables general-purpose parallel processing on GPUs. Unlike other similar GPU-based approaches, which usually use 32-bit single-precision floating-point arithmetic, we use 64-bit double-precision operations for better convergence. Applications on the CUDA platform are fairly easy to implement but very hard to optimize. In this regard, we also discuss the optimization strategy of the proposed method.
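
As a point of reference for the Bi-CG method named in this abstract, the following is a minimal serial sketch of unpreconditioned Bi-CG in pure Python for a small dense, possibly nonsymmetric, matrix. The function name `bicg` and its parameters are assumptions for this example. It does not reflect the paper's CUDA kernels or optimizations, and it does not guard against the breakdown cases (zero denominators) that Bi-CG can encounter.

```python
def bicg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with unpreconditioned Bi-CG (A given as list of rows).

    Serial illustration only; no breakdown handling.
    Python floats are 64-bit doubles.
    """
    n = len(b)
    At = [[A[j][i] for j in range(n)] for i in range(n)]  # transpose of A

    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)        # residual of A x = b with x = 0
    rt = list(r)       # shadow residual for the transposed system
    p, pt = list(r), list(rt)
    rho = dot(rt, r)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        Atpt = matvec(At, pt)
        alpha = rho / dot(pt, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * v for ri, v in zip(r, Ap)]
        rt = [ri - alpha * v for ri, v in zip(rt, Atpt)]
        rho_new = dot(rt, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta * pi for ri, pi in zip(rt, pt)]
        rho = rho_new
    return x
```

Each iteration needs one product with A and one with its transpose, plus a handful of dot products and vector updates; these are the operations a GPU implementation would map to kernels.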
