• Title/Summary/Keyword: Parallel computing model

Integer-Pel Motion Estimation for HEVC on Compute Unified Device Architecture (CUDA)

  • Lee, Dongkyu;Sim, Donggyu;Oh, Seoung-Jun
    • IEIE Transactions on Smart Processing and Computing / v.3 no.6 / pp.397-403 / 2014
  • A new video compression standard called High Efficiency Video Coding (HEVC) has recently been released onto the market. HEVC provides higher coding performance than previous standards, but at the cost of a significant increase in encoding complexity, particularly in motion estimation (ME). At the same time, the computing capabilities of Graphics Processing Units (GPUs) have become more powerful. This paper proposes a parallel integer-pel ME (IME) algorithm for HEVC on a GPU using the Compute Unified Device Architecture (CUDA). The proposed IME introduces concurrent parallel reduction (CPR). CPR performs several parallel reduction (PR) operations concurrently to solve two problems in conventional PR: low thread utilization and high thread synchronization latency. The proposed encoder reduces the portion of IME in the encoder to almost zero, with a 2.3% increase in bitrate. The proposed IME is up to 172.6 times faster than the IME in the HEVC reference model.
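The abstract above does not give the details of CPR, so the following is only a minimal Python sketch of the general idea it names: a tree-style parallel reduction, and a variant in which several reductions share the same passes. The function names are hypothetical, the inner loops stand in for GPU threads, and the input lists stand in for per-candidate cost values; none of this is taken from the paper's implementation.

```python
def parallel_reduce_min(values):
    """Tree-style minimum reduction over a non-empty list.

    Each pass of the outer loop corresponds to one synchronized step on a GPU;
    each iteration of the inner loop corresponds to one active thread.
    """
    data = list(values)
    n = len(data)
    stride = 1
    while stride < n:
        # The number of active "threads" roughly halves on every pass,
        # which is the low-utilization problem of a single reduction.
        for i in range(0, n - stride, 2 * stride):
            data[i] = min(data[i], data[i + stride])
        stride *= 2
    return data[0]


def concurrent_reduce_min(rows):
    """Reduce several equally long lists in the same passes (illustrative only).

    Running several reductions per pass keeps more "threads" busy between
    synchronization points than reducing one list at a time.
    """
    data = [list(row) for row in rows]
    n = len(data[0])
    stride = 1
    while stride < n:
        for row in data:  # all rows advance within the same pass
            for i in range(0, n - stride, 2 * stride):
                row[i] = min(row[i], row[i + stride])
        stride *= 2
    return [row[0] for row in data]
```

Passing `(cost, index)` tuples instead of plain numbers would return the position of the minimum as well, since Python compares tuples element by element.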

A dynamic analysis algorithm for RC frames using parallel GPU strategies

  • Li, Hongyu;Li, Zuohua;Teng, Jun
    • Computers and Concrete / v.18 no.5 / pp.1019-1039 / 2016
  • This paper proposes a parallel algorithm for the nonlinear dynamic analysis of three-dimensional (3D) reinforced concrete (RC) frame structures on a graphics processing unit (GPU). Time integration is performed using the Newmark method for nonlinear implicit dynamic analysis, and parallelization strategies are presented. Correspondingly, a parallel Preconditioned Conjugate Gradients (PCG) solver on the GPU is introduced for the repeated solution of the equilibrium equations at each time step. The RC frames were simulated using a fiber beam model to capture the nonlinear behavior of concrete and reinforcing bars. The parallel finite element program was developed using the Compute Unified Device Architecture (CUDA). The accuracy of the GPU-based parallel program, in both single and double precision, was verified by comparison with ABAQUS. The numerical results demonstrated that the proposed algorithm can take full advantage of the parallel architecture of the GPU and speeds up the computation compared with the CPU.
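For readers unfamiliar with PCG, here is a minimal serial sketch in Python of a preconditioned conjugate gradient solve for a symmetric positive definite system. The abstract does not say which preconditioner the authors use; the Jacobi (diagonal) preconditioner below is an assumption chosen for brevity, and the function name `pcg_jacobi` is hypothetical. On a GPU, the matrix-vector product and the dot products are the parts that would be parallelized.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite matrix A (list of lists).

    Uses conjugate gradients with a Jacobi (diagonal) preconditioner,
    which is an assumption of this sketch, not the paper's stated choice.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x with x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x
```

In an implicit Newmark scheme, a solve of this kind is repeated at every time step with an updated effective stiffness matrix and load vector, which is why the solver dominates the run time.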

Evaluation of DES key search stability using Parallel Computing (병렬 컴퓨팅을 이용한 DES 키 탐색 안정성 분석)

  • Yoon, JunWeon;Choi, JangWon;Park, ChanYeol;Kong, Ki-Sik
    • Journal of Digital Contents Society / v.14 no.1 / pp.65-72 / 2013
  • Current and future parallel computing models have been proposed for running and solving large-scale application problems in fields such as climate, biology, cryptology, and astronomy. Parallel computing is a form of computation in which many calculations are carried out simultaneously. It shortens the execution time of a program and extends the scale of the problems that can be solved. In this paper, we run actual cryptographic algorithms through parallel processing and evaluate their efficiency. The key length, which is the criterion for the stability of a cryptographic algorithm, is judged by the amount of computation required for complete enumeration. We therefore present a detailed procedure for a DES key search that executes the enumeration in a parallel processing environment. We then ran the simulation on a cluster system. As a result, we can measure the safety and solidity of a cryptographic algorithm.
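The enumeration described above parallelizes naturally because each candidate key can be tested independently. The Python sketch below shows one way to split a key space into contiguous ranges, one per worker. It is not the authors' procedure: the function names are hypothetical, `toy_encrypt` is a 16-bit XOR stand-in and not DES (whose key space has 2^56 keys), and the ranges are searched one after another here, whereas on a cluster each range would be handed to a different node.

```python
def split_keyspace(total_keys, workers):
    """Return contiguous [start, stop) ranges that cover 0..total_keys exactly."""
    base, extra = divmod(total_keys, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` workers take one additional key each.
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def toy_encrypt(key, block):
    """Stand-in cipher for illustration only; this is NOT DES."""
    return (block ^ key) & 0xFFFF


def search_range(key_range, plaintext, ciphertext):
    """Try every key in one range; this is the work a single node would do."""
    start, stop = key_range
    for key in range(start, stop):
        if toy_encrypt(key, plaintext) == ciphertext:
            return key
    return None


def exhaustive_search(plaintext, ciphertext, key_bits=16, workers=4):
    """Search the whole key space range by range (sequentially in this sketch)."""
    for key_range in split_keyspace(2 ** key_bits, workers):
        found = search_range(key_range, plaintext, ciphertext)
        if found is not None:
            return found
    return None
```

Because the ranges do not overlap and need no communication until a key is found, the expected search time falls roughly in proportion to the number of workers.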

Developing a Simulator of the Capture Process in Towed Fishing Gears by Chaotic Fish Behavior Model and Parallel Computing

  • Kim Yong-Hae;Ha Seok-Wun;Jun Yong-Kee
    • Fisheries and Aquatic Sciences / v.7 no.3 / pp.163-170 / 2004
  • A fishing simulator for towed fishing gear was investigated in order to mimic fish behavior in the capture process and to investigate fishing selectivity. A fish behavior model using a psycho-hydraulic wheel activated by stimuli is established to introduce Lorenz chaos equations and a neural network system and to generate the components of realistic fish capture processes. The fish positions within the specified gear geometry are calculated from the normalized intensities of the stimuli from the fishing gear components or neighboring fish, and these are then related to the sensitivities and abilities of the fish. The model is applied to four different towed gears, namely a bottom trawl, a midwater trawl, a two-boat seine, and an anchovy boat seine, and to the 17 fish species mainly caught. The Alpha cluster computer system and Fortran MPI (Message-Passing Interface) parallel programming were used for rapid calculation and mass data processing in this chaotic behavior model. The results of the simulation can be presented as an animation of fish movements in relation to the fishing gear, using OpenGL and C graphics programming, as well as catch data and selectivity analysis. The results of this simulator closely mimicked field studies of the same gears and can therefore be used in further studies of fishing gear design, selectivity prediction, and indoor training systems.

Numerical procedures for extreme impulsive loading on high strength concrete structures

  • Danielson, Kent T.;Adley, Mark D.;O'Daniel, James L.
    • Computers and Concrete / v.7 no.2 / pp.159-167 / 2010
  • This paper demonstrates numerical techniques for complex large-scale modeling with microplane constitutive theories for reinforced high strength concrete, which, for these applications, is defined as having a strength of around 7000 psi (48 MPa), as frequently found in protective structural design. Applications involve highly impulsive loads, such as an explosive detonation or impact-penetration event. These capabilities were implemented in the authors' finite element code, ParaAble, and in the PRONTO 3D code from Sandia National Laboratories. All materials are explicitly modeled with eight-noded hexahedral elements. The concrete is modeled with a microplane constitutive theory, the reinforcing steel is modeled with the Johnson-Cook model, and the high explosive material is modeled with a JWL equation of state and a programmed burn model. Damage evolution, which can be used for erosion of elements and/or for post-analysis examination of damage, is extracted from the microplane predictions and computed by a modified Holmquist-Johnson-Cook approach that relates damage to levels of inelastic strain increment and pressure. Computation is performed with MPI on parallel processors. Several practical analyses demonstrate that large-scale analyses of this type can be reasonably run on large parallel computing systems.

Hybrid Parallelization for High Performance of CFD_NIMR Model (기상 모델 CFD_NIMR의 최적 성능을 위한 혼합형 병렬 프로그램 구현)

  • Kim, Min-Wook;Choi, Young-Jean;Kim, Young-Tae
    • Atmosphere / v.22 no.1 / pp.109-115 / 2012
  • We parallelized CFD_NIMR, a numerical meteorological model, for best performance on both distributed and shared memory parallel computers. This hybrid parallelization uses MPI (Message Passing Interface) to split the 3-dimensional computing domain into horizontal 2-dimensional sub-domains for distributed memory systems, and uses OpenMP (Open Multi-Processing) to split each sub-domain vertically into 1-dimensional sub-domains to take advantage of the shared memory structure. We validated the parallel model against the original sequential model, and the parallel CFD_NIMR model shows efficient speedup on distributed and shared memory systems.
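The two-level split described above can be illustrated with simple index arithmetic. The Python sketch below computes which grid cells a given process and thread would own when the horizontal plane is divided among processes and the vertical axis among threads. The function names, the row-major numbering of ranks, and the block distribution are all assumptions of this sketch; the abstract does not specify how CFD_NIMR lays out its sub-domains.

```python
def block_bounds(n, parts, index):
    """Return [start, stop) of block `index` when n cells are split into `parts` blocks."""
    base, extra = divmod(n, parts)
    # The first `extra` blocks are one cell longer than the rest.
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def hybrid_subdomain(shape, proc_grid, rank, threads, thread_id):
    """Cells owned by one thread of one process in a two-level decomposition.

    shape      -- (nx, ny, nz) size of the full 3-D grid
    proc_grid  -- (px, py) layout of processes over the horizontal plane
    rank       -- process number, assumed row-major over proc_grid
    threads    -- number of threads per process (vertical split)
    thread_id  -- thread number within the process
    """
    nx, ny, nz = shape
    px, py = proc_grid
    ix, iy = divmod(rank, py)                     # position in the process grid
    x_range = block_bounds(nx, px, ix)            # distributed-memory split
    y_range = block_bounds(ny, py, iy)            # distributed-memory split
    z_range = block_bounds(nz, threads, thread_id)  # shared-memory split
    return x_range, y_range, z_range
```

For example, on a 100 x 80 x 40 grid with a 2 x 2 process grid and 4 threads per process, each thread works on a 50 x 40 x 10 block, and only the horizontal block boundaries require message passing between processes.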

High-speed simulation for fossil power plants using a parallel DSP system (병렬 DSP 시스템을 이용한 화력발전소 고속 시뮬레이션)

  • 박희준;김병국
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.4 / pp.38-49 / 1998
  • A fossil power plant can be modeled by a large number of algebraic and differential equations. When a large, complicated fossil power plant is simulated on a computer such as a workstation or PC, it takes a long time to compute all of the equations. Therefore, new processing systems with high computing speed are needed for real-time or high-speed (faster than real-time) simulators. This paper presents an enhanced strategy in which high computing power is provided by parallel processing on DSP processors connected by communication links. The DSP system is designed for general-purpose use, and the parallel DSP system can be expanded easily by connecting new DSP modules to it. General-purpose DSP modules and a VME interface module were developed. A new model and techniques for task allocation are also presented, which take into account the special characteristics of parallel I/O and computation. As a realistic cost function for task allocation, we suggest the 'simulation period', which represents the interval between simulation outputs. Based on the parallel DSP system and the realistic task allocation techniques, we achieved good parallel processing efficiency and a simulation speed faster than real time.

Parallel processing in structural reliability

  • Pellissetti, M.F.
    • Structural Engineering and Mechanics / v.32 no.1 / pp.95-126 / 2009
  • The present contribution addresses the parallelization of advanced simulation methods for structural reliability analysis, which have recently been developed for large-scale structures with a high number of uncertain parameters. In particular, the Line Sampling method and the Subset Simulation method are considered. The proposed parallel algorithms exploit the parallelism that arises from the possibility of performing independent FE analyses simultaneously. For the Line Sampling method, a parallelization scheme is proposed both for the actual sampling process and for the statistical gradient estimation method used to identify the so-called important direction of the Line Sampling scheme. Two parallelization strategies are investigated for the Subset Simulation method. The first consists of the embarrassingly parallel advancement of distinct Markov chains; in this case the speedup is bounded by the number of chains advanced simultaneously. The second parallel Subset Simulation algorithm uses the concept of speculative computing. Speedup measurements with the FE model of a multistory building (24,000 DOFs) show that the wall-clock time is reduced to a very viable amount (<10 minutes for Line Sampling and ≈ 1 hour for Subset Simulation). The measurements, conducted on clusters of multi-core nodes, also indicate a strong sensitivity of the parallel performance to the load level of the nodes, in terms of the number of simultaneously used cores. This performance degradation is related to memory bottlenecks during the modal analysis required in each FE analysis.

Analytical fragility curves of a structure subject to tsunami waves using smooth particle hydrodynamics

  • Sihombing, Fritz;Torbol, Marco
    • Smart Structures and Systems / v.18 no.6 / pp.1145-1167 / 2016
  • This study presents a new method to compute analytical fragility curves of a structure subject to tsunami waves. The method uses dynamic analysis at each stage of the computation. First, the smooth particle hydrodynamics (SPH) model simulates the propagation of the tsunami waves from shallow water to their impact on the target structure. The advantage of SPH over mesh-based methods is its capability to model wave-surface interaction when large deformations are involved, such as the impact of water on a structure. Although SPH is computationally more expensive than mesh-based methods, the advent of parallel computing on general-purpose graphics processing units overcomes this limitation. Then, the impact force is applied to a finite element model of the structure and its dynamic nonlinear response is computed. When a data set of tsunami waves is used, analytical fragility curves can be computed. This study shows that it is possible to obtain the response of a structure to a tsunami wave using state-of-the-art dynamic models in every stage of the computation at an affordable cost.

Realtime Tide and Storm-Surge Computations for the Yellow Sea Using the Parallel Finite Element Model (병렬 유한요소 모형을 이용한 황해의 실시간 조석 및 태풍해일 산정)

  • Byun, Sang-Shin;Choi, Byung-Ho;Kim, Kyeong-Ok
    • Journal of the Korea Institute of Military Science and Technology / v.12 no.1 / pp.29-36 / 2009
  • Realtime tide and storm-surge computations for the Yellow Sea were conducted using the Parallel Finite Element Model. For these computations, a high-resolution grid system was constructed with a minimum node interval of 100 m in Gyeonggi Bay. In the modeling, eight main tidal constituents were analyzed, and the results agreed well with the observed data. The realtime tide computation with the eight main tidal constituents and the storm-surge simulation for Typhoon Sarah (1959) were also conducted using a parallel computing system of MPI-based Linux clusters. The results showed good performance in simulating Typhoon Sarah and in reducing the computation time.