• Title/Summary/Keyword: Parallel computing

Numerical Simulation of Natural Convection in Annuli with Internal Fins

  • Ha, Man-Yeong;Kim, Joo-Goo
    • Journal of Mechanical Science and Technology / v.18 no.4 / pp.718-730 / 2004
  • The natural convection in internally finned horizontal annuli is solved by numerical simulation of the time-dependent, two-dimensional governing equations. The fins in the annuli influence the flow pattern, temperature distribution and heat transfer rate. Variations of the fin configuration suppress or accelerate the free convective effects compared with those of smooth tubes. The effects of fin configuration, number of fins and ratio of annulus gap width to inner cylinder radius on the fluid flow and heat transfer in the annuli are demonstrated through the distributions of velocity vectors, isotherms and streamlines. The governing equations are solved efficiently with a parallel implementation, adopted to reduce the computation cost. The parallelization uses the domain decomposition technique with message passing between sub-domains based on the MPI library. The results of the parallel computation are consistent with those of the sequential program. Moreover, the speed-up ratio increases linearly with the number of processors.
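The domain decomposition idea in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a 1-D Jacobi relaxation as a stand-in for the real solver, and the function names `jacobi_sequential` and `jacobi_decomposed` are hypothetical. Copying ghost cells between lists emulates the message passing that an MPI program would perform between neighbouring sub-domains.

```python
def jacobi_sequential(u, steps):
    """Jacobi relaxation on a 1-D grid with fixed end values."""
    u = list(u)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def jacobi_decomposed(u, steps, parts):
    """Same relaxation on `parts` sub-domains with one ghost cell per side.

    Assumes parts <= len(u) - 2, so every sub-domain owns at least one point.
    The ghost-cell copy stands in for send/receive calls between ranks.
    """
    interior = len(u) - 2                  # points that are actually updated
    bounds = [1 + interior * p // parts for p in range(parts + 1)]
    # Each sub-domain stores [left ghost, owned points..., right ghost].
    subs = [list(u[bounds[p] - 1: bounds[p + 1] + 1]) for p in range(parts)]
    for _ in range(steps):
        # "Halo exchange": refresh ghost cells from the neighbours' edge values.
        for p in range(parts):
            if p > 0:
                subs[p][0] = subs[p - 1][-2]
            if p < parts - 1:
                subs[p][-1] = subs[p + 1][1]
        # Independent local updates; this is the work that would run in parallel.
        for p in range(parts):
            s = subs[p]
            new = s[:]
            for i in range(1, len(s) - 1):
                new[i] = 0.5 * (s[i - 1] + s[i + 1])
            subs[p] = new
    # Gather the owned points back into one global array.
    result = [u[0]]
    for s in subs:
        result.extend(s[1:-1])
    result.append(u[-1])
    return result
```

Because each sub-domain only needs its neighbours' edge values once per step, the decomposed run reproduces the sequential result, which is the consistency check the abstract reports.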

On The Parallel Implementation of a Static/Explicit FEM Program for Sheet Metal Forming (판금형 해석을 위한 정적/외연적 유한요소 프로그램의 병령화에 관한 연구)

  • ;;G.P.Nikishikov
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.625-628 / 1995
  • A static/explicit finite element code for sheet forming (ITAS3D) is parallelized on an IBM SP 6000 multi-processor computer. A computing-load-balanced domain decomposition method and a direct solution method for each subdomain (and interface) equation are developed. The system of equations for each subdomain is constructed by condensation and solved on each processor. Approximate operation counts for setting up the nonlinear equation system are calculated in order to balance the compute load across subdomains. Square cup tests with several numbers of elements are used to demonstrate the performance of this parallel implementation. The procedure proves efficient for a moderate number of processors, especially for a large number of elements.

Design and Performance Analysis of the H/V-bus Parallel Computer (H/V-버스 병렬컴퓨터의 설계 및 성능 분석)

  • 김종현
    • Journal of the Korea Society for Simulation / v.3 no.1 / pp.29-42 / 1994
  • The architecture of a MIMD-type parallel computer system is specified, a simulator is developed to support the design and evaluation of systems based on the architecture, and experiments are conducted with the simulator to evaluate system performance. The horizontal/vertical-bus (H/V-bus) system architecture provides an N×N array of processing elements (PEs) that communicate with each other through a network of N horizontal buses and N vertical buses. The simulator, written in SLAM II and FORTRAN, is designed to provide high resolution in simulating the IPC mechanism. Parameters give the user independent control of system size, PE speed and IPC mechanism speed. Results generated by the simulator include execution times, PE utilizations, queue lengths, and other data. The simulator is used to study system performance when a partial differential equation is solved by a parallel Gauss-Seidel method. For comparison, the benchmark is also executed on a single-bus system simulator derived from the H/V-bus system simulator, and on a single PE to obtain data for computing speedups. An extensive analysis of the results is presented.
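The abstract does not say which parallel Gauss-Seidel variant the benchmark uses. As one common possibility, the sketch below assumes red-black ordering applied to Laplace's equation on a square grid. The function name `red_black_gauss_seidel` and the choice of equation are illustrative assumptions, not details from the paper.

```python
def red_black_gauss_seidel(grid, sweeps):
    """Relax Laplace's equation on a square grid with fixed boundary values.

    Points are coloured like a checkerboard. Every point of one colour
    depends only on points of the other colour, so each half-sweep could be
    divided among processing elements without changing the result.
    """
    n = len(grid)
    u = [row[:] for row in grid]           # work on a copy of the input
    for _ in range(sweeps):
        for colour in (0, 1):              # 0 = "red", 1 = "black"
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 == colour:
                        u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                          + u[i][j - 1] + u[i][j + 1])
    return u
```

On an N×N array of processing elements, each element would own a block of grid points and exchange only block edges over the buses after each half-sweep.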

Accelerating Fingerprint Enhancement Algorithm on GPGPU using OpenCL (OpenCL을 이용한 GPGPU 기반 지문개선 알고리즘 가속화)

  • Kim, Daehee;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.4 / pp.666-672 / 2016
  • Fingerprints have recently become widely used as a biometric to improve the security of financial mobile applications because of their user convenience and high recognition rate. However, to apply fingerprint algorithms to finance and security applications, their recognition rate and processing speed have to be improved further. In this paper, we propose a parallel fingerprint enhancement algorithm for general-purpose computing on graphics processing units (GPGPU) using OpenCL. We analyze the parallelism in the fingerprint algorithm and explore the optimization parameters of the parallel algorithm to improve performance. The experimental results show that the parallel fingerprint enhancement algorithm on GPGPUs runs 29.4 to 69.2 times faster than the original algorithm on the host CPUs.

Power System State Estimation Using Parallel PSO Algorithm based on PC cluster (PC 클러스터 기반 병렬 PSO 알고리즘을 이용한 전력계통의 상태추정)

  • Jeong, Hee-Myung;Park, June-Ho;Lee, Hwa-Seok
    • Proceedings of the KIEE Conference / 2008.07a / pp.303-304 / 2008
  • For the state estimation problem, the weighted least squares (WLS) method and the fast decoupled method are widely used at present. However, these algorithms can converge to local optimal solutions. Recently, modern heuristic optimization methods such as Particle Swarm Optimization (PSO) have been introduced to overcome this disadvantage of classical optimization methods. However, population-based heuristic optimization methods require a lengthy computing time to find an optimal solution. In this paper, we use PSO to search for the optimal solution of state estimation in power systems. To overcome the long computing time of heuristic optimization methods, we propose parallel processing of the PSO algorithm on a PC cluster system. The proposed approach was tested on the IEEE 118-bus system. The simulation results show that the parallel PSO based on the PC cluster system is applicable to power system state estimation.
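The structure of a parallel PSO can be sketched as follows, under clearly stated assumptions: the cost is a toy weighted least-squares function over a two-variable linear measurement model (not a power system model), the inertia and acceleration coefficients (0.7, 1.5, 1.5) are conventional textbook values, and a thread pool stands in for the nodes of a PC cluster. The names `wls_cost` and `parallel_pso` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def wls_cost(x, measurements):
    """Weighted least-squares cost for a toy linear measurement model.

    Each measurement is (a, b, z, w): z is modelled as a*x[0] + b*x[1],
    and w is the weight of that measurement.
    """
    return sum(w * (z - (a * x[0] + b * x[1])) ** 2
               for (a, b, z, w) in measurements)


def parallel_pso(cost, dim, n_particles=30, iters=200, workers=4, seed=0):
    """Global-best PSO whose fitness evaluations are farmed out to workers."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [float("inf")] * n_particles
    g_pos, g_val = pos[0][:], float("inf")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(iters):
            # Particles are independent here, so evaluation can run in parallel.
            values = list(pool.map(cost, pos))
            for i, v in enumerate(values):
                if v < best_val[i]:
                    best_val[i], best_pos[i] = v, pos[i][:]
                if v < g_val:
                    g_val, g_pos = v, pos[i][:]
            # Velocity and position update (done serially by the "master").
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (0.7 * vel[i][d]
                                 + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                                 + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                    pos[i][d] += vel[i][d]
    return g_pos, g_val
```

Python threads do not speed up CPU-bound work, so this only shows where the parallelism lies; on a real cluster the evaluation step would be distributed across processes or machines.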

A STUDY OF THE APPLICATION OF DELAUNAY GRID GENERATION ON GPU USING CUDA LIBRARY (GPU Library CUDA를 이용한 효율적인 Delaunay 격자 생성에 관한 연구)

  • Song, J.H.;Kang, S.H.;Kim, G.M.;Kim, B.S.
    • 한국전산유체공학회:학술대회논문집 / 2011.05a / pp.194-198 / 2011
  • In this study, an efficient algorithm for Delaunay triangulation of a set of points that can be used in GPU-based parallel computation is studied. The developed algorithm is programmed using the CUDA library, and the program takes full advantage of parallel computations performed concurrently on each of the GPU threads. The partitioned triangulation results collected from the GPU computation require proper stitching between neighboring partitions and calculation of connectivities among triangular cells on the CPU. The effect of the number of threads on the efficiency and total duration of Delaunay grid generation is studied. It is also shown that GPU computing using CUDA is feasible for Delaunay grid generation and that it reduces the total time required to triangulate a large number of points compared with sequential CPU-based triangulation programs.

A Numerical Study for the Three-Dimensional Fluid Flow Past Tube Banks and Comparison with PIV Experimental Data

  • Ha, Man-Yeong;Kim, Seung-Hyeon;Kim, Kyung-Chun;Son, Young-Chul
    • Journal of Mechanical Science and Technology / v.18 no.12 / pp.2236-2249 / 2004
  • The three-dimensional fluid flow past tube banks arranged in equilateral-triangular form at Re$_{max}$ = 4,000 is analyzed using a large eddy simulation technique. The governing equations for mass and momentum conservation are discretized using the finite volume method. Parallel computational techniques using MPI (Message Passing Interface) are implemented in the present computer code. In the present parallel computation, the computation time decreases in proportion to the number of CPUs used. We obtained the time-averaged streamwise and cross-streamwise velocities and turbulent intensities. The present numerical results are compared with the PIV experimental data and generally agree well with them.

Parallel Hybrid Particle-Continuum (DSMC-NS) Flow Simulations Using 3-D Unstructured Mesh

  • Wu J.S.;Lian Y.Y.;Cheng G.;Chen Y.S.
    • 한국전산유체공학회:학술대회논문집 / 2006.05a / pp.27-34 / 2006
  • In this paper, a recently proposed parallel hybrid particle-continuum (DSMC-NS) scheme employing a 3D unstructured grid for solving steady-state gas flows involving continuum and rarefied regions is described [1]. The replacement of the density-based NS solver with a pressure-based one, which greatly enhances the capability of the proposed hybrid scheme, and several practical lessons learned from the development and verification are highlighted. Finally, we present simulation results for a realistic RCS nozzle plume, which is considered very challenging for either a continuum or a particle solver alone, to demonstrate the capability of the proposed hybrid DSMC-NS method.

High Performance Hybrid Direct-Iterative Solution Method for Large Scale Structural Analysis Problems

  • Kim, Min-Ki;Kim, Seung-Jo
    • International Journal of Aeronautical and Space Sciences / v.9 no.2 / pp.79-86 / 2008
  • A high-performance direct-iterative hybrid linear solver for large-scale finite element problems is developed. Direct solution methods are robust but difficult to parallelize, whereas iterative solution methods have the opposite characteristics. Combining the two is therefore desirable to obtain both high parallel efficiency and numerical robustness for large-scale structural analysis problems. The hybrid method described in this paper is based on FETI-DP (Finite Element Tearing and Interconnecting-Dual Primal method), which has good parallel scalability and efficiency. It is suitable for second- and fourth-order elliptic finite element problems, including structural analysis problems. The hybrid concept combines these two categories of solution method by incorporating a multifrontal solver into the FETI-DP-based iterative solver. The hybrid solver is implemented in our general structural analysis code, IPSAP.

Parallel Genetic Algorithm for Structural Optimization on a Cluster of Personal Computers (구조최적화를 위한 병렬유전자 알고리즘)

  • 이준호;박효선
    • Proceedings of the Computational Structural Engineering Institute Conference / 2000.10a / pp.40-47 / 2000
  • One of the drawbacks of GA-based structural optimization is that evaluating the fitness of a population of hundreds of individuals, which requires hundreds of structural analyses at each GA generation, is computationally too expensive. Therefore, a parallel genetic algorithm for structural optimization on a cluster of personal computers is developed in this paper. In the parallel genetic algorithm, the population at every generation is partitioned into a number of sub-populations equal to the number of slave computers. Parallelism is exploited at the sub-population level by allocating each sub-population to a slave computer. Thus, the fitness of a population at each generation can be evaluated concurrently on a cluster of personal computers. The algorithm is implemented on a virtual distributed computing system consisting of a collection of personal computers connected via a 100 Mb/s Ethernet LAN. The algorithm is applied to the minimum weight design of a steel structure. The results show that the computational time required for the serial GA-based structural optimization process is drastically reduced.
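The partitioning step described in this abstract can be sketched as below. This is an illustration under assumptions, not the paper's implementation: worker threads stand in for the slave computers, the fitness function is supplied by the caller (in the paper it would be a structural analysis), and the names `partition` and `evaluate_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(population, n_slaves):
    """Split the population into n_slaves contiguous sub-populations.

    Sizes differ by at most one individual when the division is uneven.
    """
    size, extra = divmod(len(population), n_slaves)
    subs, start = [], 0
    for k in range(n_slaves):
        end = start + size + (1 if k < extra else 0)
        subs.append(population[start:end])
        start = end
    return subs


def evaluate_parallel(population, fitness, n_slaves):
    """Master gives one sub-population to each worker and gathers the results."""
    subs = partition(population, n_slaves)
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        # Each worker evaluates every individual in its own sub-population.
        chunks = pool.map(lambda sub: [fitness(ind) for ind in sub], subs)
        # Results come back in sub-population order, so indices still match.
        return [f for chunk in chunks for f in chunk]
```

Selection, crossover and mutation would still run on the master. Only the fitness evaluation, which dominates the cost, is distributed.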
