• Title/Summary/Keyword: Parallel computation

Hologram Generation Acceleration Method Using GPGPU (GPGPU를 이용한 홀로그램 생성 가속화 방법)

  • Lee, Yoon-Hyuk;Kim, Dong-Wook;Seo, Young-Ho
    • Journal of Broadcast Engineering / v.22 no.6 / pp.800-807 / 2017
  • Generating a hologram by computer requires a large amount of computation. To accelerate it, many parallel-programming methods based on GPGPU (general-purpose computing on graphics processing units) have been studied. In this paper, we propose a method that reduces the bottleneck caused by hologram-pixel-based parallel processing and makes use of shareable variables. We also describe how to tune thread execution using the Visual Profiler provided with NVIDIA's CUDA. The experimental results show that the proposed method reduces the calculation time by up to 40% compared with existing research.
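
The abstract above does not give the hologram formula, so the following is only a minimal sketch under the assumption of the classical point-source model, in which each hologram pixel sums a cosine fringe term over all object points. The function names `hologram_pixel` and `hologram` and the `(x, y, z, amplitude)` point layout are hypothetical. In a pixel-based GPU scheme, the body of `hologram_pixel` is the work one thread would do, and the object-point array is the data every thread reads, which makes it a natural candidate for sharing.

```python
import numpy as np

def hologram_pixel(px, py, points, wavelength):
    """Fringe value at one hologram pixel located at (px, py, 0).

    points: array of shape (N, 4) holding x, y, z, amplitude per object point
    (an assumed layout, not taken from the paper).
    """
    x, y, z, amp = points.T
    # Distance from every object point to this pixel.
    r = np.sqrt((px - x) ** 2 + (py - y) ** 2 + z ** 2)
    # Sum the contribution of all object points.
    return np.sum(amp * np.cos(2.0 * np.pi * r / wavelength))

def hologram(width, height, pitch, points, wavelength):
    """Serial reference: one independent task per hologram pixel."""
    out = np.empty((height, width))
    for row in range(height):
        for col in range(width):
            out[row, col] = hologram_pixel(col * pitch, row * pitch,
                                           points, wavelength)
    return out
```

Because no pixel depends on any other, the two loops in `hologram` can be mapped directly onto a grid of GPU threads; the cost that remains is the repeated read of `points` by every pixel.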

Efficient Parallel Block-layered Nonbinary Quasi-cyclic Low-density Parity-check Decoding on a GPU

  • Thi, Huyen Pham;Lee, Hanho
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.210-219 / 2017
  • This paper proposes a modified min-max algorithm (MMMA) for nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) codes and an efficient parallel block-layered decoder architecture corresponding to the algorithm on a graphics processing unit (GPU) platform. The algorithm removes multiplications over the Galois field (GF) in the merger step to reduce decoding latency without any performance loss. The decoding implementation on a GPU for NB-QC-LDPC codes achieves improvements in both flexibility and scalability. To perform the decoding on the GPU, data and memory structures suitable for parallel computing are designed. The implementation results for NB-QC-LDPC codes over GF(32) and GF(64) demonstrate that the parallel block-layered decoding on a GPU provides a faster decoding runtime and obtains a higher coding gain at a low bit error rate of $10^{-10}$ and a low frame error rate of $10^{-7}$, compared to existing methods.

Parallel Implementation Strategy for Content Based Video Copy Detection Using a Multi-core Processor

  • Liao, Kaiyang;Zhao, Fan;Zhang, Mingzhu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.10 / pp.3520-3537 / 2014
  • Video copy detection methods have emerged in recent years for a variety of applications. However, the lack of efficiency in the usual retrieval systems restricts their use. In this paper, we propose a parallel implementation strategy for content-based video copy detection (CBCD) using a multi-core processor. This strategy supports video copy detection effectively, and the processing time tends to decrease linearly as the number of processors increases. Experiments show that our approach succeeds in speeding up computation while maintaining performance.

Efficient m-step Generalization of Iterative Methods

  • Kim, Sun-Kyung
    • Journal of Korea Society of Industrial Information Systems / v.11 no.5 / pp.163-169 / 2006
  • In order to use parallel computers in specific applications, algorithms need to be developed and mapped onto parallel computer architectures. Main memory access in shared-memory systems and global communication in message-passing systems degrade computation speed. In this paper, it is found that the m-step generalization of the block Lanczos method enhances parallel properties by forming blocks of search direction vectors simultaneously. QR factorization, which slows execution on parallel computers, is not necessary in the m-step block Lanczos method. The m-step method minimizes the number of synchronization points, which in turn reduces global communication and main memory access compared to the standard methods.

Parallel Procedure and Evaluation of Parallel Performance of Impact Simulation Based on Two-Step Eulerian Scheme (Two-Step Eulerian 기법에 기반 한 충돌 해석의 병렬처리 및 병렬효율 평가)

  • Kim Seung-Jo;Lee Min-Hyung;Paik Seung-Hoon
    • Transactions of the Korean Society of Mechanical Engineers A / v.30 no.10 s.253 / pp.1320-1327 / 2006
  • The parallel procedure and performance of the two-step Eulerian code have not been sufficiently reported, even though the code has been developed and widely used in impact simulation. In this study, a parallel strategy for the two-step Eulerian code is proposed and described in detail. The performance was evaluated on a self-built Linux cluster. Compared with a commercial code, relatively good performance is achieved. A performance evaluation of each computation stage shows that remap is the most time-consuming part, compared with the other parts such as FE processing, communication, and time marching.

Three-Dimensional Numerical Computation and Experiment on Periodic Flows under a Background Rotation (배경회전하에서 형성되는 주기적 유동의 3차원 수치해석과 실험)

  • Suh, Yong-Kweon;Park, Jae-Hyun
    • Transactions of the Korean Society of Mechanical Engineers B / v.27 no.5 / pp.628-634 / 2003
  • We present numerical and experimental results of periodic flows inside a rectangular container under a background rotation. The periodic flows are generated by changing the speed of rotation periodically so that time-periodic body forces produce the unsteady flows. In the numerical computation, a parallel-computation technique with MPI is implemented. Flow visualization and PIV measurement are also performed to obtain velocity fields at the free surface. Through a series of numerical and experimental works, we aim to clarify the fundamental reasons, if any, for the discrepancy between the two-dimensional computation and the experimental measurement, which was detected in the previous study for the same flow model. Specifically, we check whether the various assumptions required for the validity of the classical Ekman pumping law are satisfied for periodic flows under a background rotation.

Dynamic Model of PEM Fuel Cell Using Real-time Simulation Techniques

  • Jung, Jee-Hoon;Ahmed, Shehab
    • Journal of Power Electronics / v.10 no.6 / pp.739-748 / 2010
  • The increased integration of fuel cells with power electronics, critical loads, and control systems has prompted recent interest in accurate electrical terminal models of the polymer electrolyte membrane (PEM) fuel cell. Advancements in computing technologies, particularly parallel computation techniques and various real-time simulation tools, have allowed the prototyping of novel apparatus to be investigated in a virtual system under a wide range of realistic conditions repeatedly, safely, and economically. This paper builds upon both advancements and provides a means of optimized model construction that boosts computation speeds for a fuel cell model on a real-time simulator, which can be used in a power hardware-in-the-loop (PHIL) application. Significant improvement in computation time has been achieved. The effectiveness of the proposed model, developed on Opal RT's RT-Lab Matlab/Simulink based real-time engineering simulator, is verified using experimental results from a Ballard Nexa fuel cell system.

Efficient Parallel Visualization of Large-scale Finite Element Analysis Data in Distributed Parallel Computing Environment (분산 병렬 계산환경에 적합한 초대형 유한요소 해석 결과의 효율적 병렬 가시화)

  • Kim, Chang-Sik;Song, You-Me;Kim, Ki-Ook;Cho, Jin-Yeon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.32 no.10 / pp.38-45 / 2004
  • In this paper, a parallel visualization algorithm is proposed for efficient visualization of the massive data generated by large-scale parallel finite element analysis, based on an investigation of the characteristics of parallel rendering methods. The proposed algorithm is designed to be highly compatible with the domain-wise computation of parallel finite element analysis by using the sort-last-sparse approach. In the proposed algorithm, a binary tree communication pattern is used to reduce the network communication time in the image composition routine. Several benchmarking tests are carried out using the developed in-house software, and the performance of the proposed algorithm is investigated.
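
To illustrate the binary tree communication pattern mentioned in the abstract above, the following is a minimal single-process simulation, not the authors' implementation. It assumes each process holds a partial image as a `(color, depth)` pair of equal-shaped arrays and that fragments are merged by a plain depth test. The names `composite` and `binary_tree_composite` are hypothetical, and the sparse encoding implied by "sort-last-sparse" is omitted.

```python
import numpy as np

def composite(a, b):
    """Merge two (color, depth) partial images, keeping the nearer fragment."""
    color_a, depth_a = a
    color_b, depth_b = b
    nearer = depth_a <= depth_b
    return (np.where(nearer, color_a, color_b),
            np.where(nearer, depth_a, depth_b))

def binary_tree_composite(partials):
    """Reduce per-process partial images pairwise, as a binary tree would.

    Returns the final image held by process 0 and the number of rounds.
    """
    images = list(partials)
    n = len(images)
    step = 1
    rounds = 0
    while step < n:
        # In each round, process `receiver` merges the image sent by
        # process `receiver + step`; senders then drop out of the tree.
        for receiver in range(0, n, 2 * step):
            sender = receiver + step
            if sender < n:
                images[receiver] = composite(images[receiver], images[sender])
        step *= 2
        rounds += 1
    return images[0], rounds
```

With P processes the reduction finishes in ceil(log2 P) rounds, rather than the P - 1 sequential merges needed when every process sends its image directly to a single collector.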

NUMERICAL SOLUTION OF EQUILIBRIUM EQUATIONS

  • Jang, Ho-Jong
    • Communications of the Korean Mathematical Society / v.15 no.1 / pp.133-142 / 2000
  • We consider some numerical solution methods for the equilibrium equations $Af + E^{T}\lambda = r$, $Ef = s$. Algebraic problems of this form arise in many applications such as structural optimization, fluid flow, and circuits. An important approach to solving such problems, called the force method, involves a dimension-reducing nullspace computation for $E$. The purpose of this paper is to investigate the substructuring method for the solution step of the force method in the context of incompressible fluid flow. We also suggest some iterative methods based on the substructuring scheme.
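
As a brief worked sketch of the dimension reduction behind the force method (the standard nullspace formulation, not necessarily the exact variant used in the paper), let $f_p$ be any particular solution of $Ef_p = s$ and let the columns of $Z$ form a basis of the nullspace of $E$, so that $EZ = 0$. The symbols $f_p$, $Z$, and $u$ are introduced here for illustration only. Every $f$ satisfying $Ef = s$ can then be written as

$$
f = f_p + Zu .
$$

Substituting into $Af + E^{T}\lambda = r$ and multiplying from the left by $Z^{T}$ eliminates $\lambda$, because $Z^{T}E^{T} = (EZ)^{T} = 0$:

$$
Z^{T}AZ\,u = Z^{T}\left(r - Af_p\right) .
$$

The reduced system has as many unknowns as the nullspace of $E$ has dimensions, which is smaller than the original coupled system in $(f, \lambda)$.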
