• Title/Summary/Keyword: GPU parallel processing

Fast Multi-View Synthesis Using Duplex Forward Mapping and Parallel Processing (순차적 이중 전방 사상의 병렬 처리를 통한 다중 시점 고속 영상 합성)

  • Choi, Ji-Youn;Ryu, Sae-Woon;Shin, Hong-Chang;Park, Jong-Il
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.11B / pp.1303-1310 / 2009
  • Glasses-free 3D displays require multiple images of a scene taken from different viewpoints. The simplest way to obtain multi-view images is to use as many cameras as there are required views, but synchronizing the cameras and computing and transmitting the resulting volume of data then become critical problems. Generating a large number of viewpoint images efficiently is therefore emerging as a key technique in 3D video technology. Image-based view synthesis is an algorithm for generating various virtual viewpoint images from a limited number of views and depth maps. In this paper, because a virtual view image can be expressed as a transformation of a real view image under given depth conditions, we propose an algorithm that synthesizes multiple views from two reference view images and their depth maps by stepwise duplex forward mapping. In addition, because the geometric relationship between the real view and the virtual view is repetitive, we implement our algorithm in the OpenGL Shading Language, which runs on a programmable graphics processing unit (GPU) that allows parallel processing, to reduce computation time. We demonstrate the effectiveness of our algorithm for fast view synthesis through a variety of experiments with real data.
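The forward-mapping idea in this abstract can be illustrated with a minimal sketch. It assumes rectified cameras, so that depth reduces to a per-pixel horizontal disparity, and it works on a single scanline; the function names `forward_warp_row` and `merge_rows` and the `shift` parameter are hypothetical and are not taken from the paper, whose actual mapping is not given in the abstract.

```python
def forward_warp_row(colors, disparities, shift):
    """Forward-map one scanline into a virtual view.

    Each source pixel x lands at round(x + shift * disparity).
    Assumption: rectified views, larger disparity means closer to the camera.
    Unfilled target pixels (holes) are left as None.
    """
    width = len(colors)
    out = [None] * width
    best = [float("-inf")] * width  # disparity of the pixel currently stored
    for x in range(width):
        d = disparities[x]
        tx = int(round(x + shift * d))
        # Keep the closest surface when several pixels map to the same target.
        if 0 <= tx < width and d > best[tx]:
            best[tx] = d
            out[tx] = colors[x]
    return out


def merge_rows(primary, secondary):
    """Fill holes in the primary warped row with pixels from the secondary one."""
    return [p if p is not None else s for p, s in zip(primary, secondary)]


# Virtual view halfway between the two reference views (alpha = 0.5).
left = ["L0", "L1", "L2", "L3", "L4"]
right = ["R0", "R1", "R2", "R3", "R4"]
disparity = [2, 2, 2, 2, 2]
alpha = 0.5
from_left = forward_warp_row(left, disparity, -alpha)
from_right = forward_warp_row(right, disparity, 1.0 - alpha)
virtual = merge_rows(from_left, from_right)
```

Because every source pixel is mapped independently, the loop body is the kind of per-pixel work that a shader or GPU kernel can run in parallel; the depth test is the only step where pixels interact.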

Real-time Depth Image Refinement using Hierarchical Joint Bilateral Filter (계층적 결합형 양방향 필터를 이용한 실시간 깊이 영상 보정 방법)

  • Shin, Dong-Won;Hoa, Yo-Sung
    • Journal of Broadcast Engineering / v.19 no.2 / pp.140-147 / 2014
  • In this paper, we propose a method for real-time depth image refinement. To improve the quality of the depth map acquired from a Kinect camera, we employ constant memory and texture memory, which are well suited to 2D image processing on the graphics processing unit (GPU). In addition, we apply the joint bilateral filter (JBF) in parallel to accelerate overall execution. To enhance the quality of the depth image, we apply the JBF hierarchically using the compute unified device architecture (CUDA), and thereby obtain the refined depth image. Experimental results show that the proposed real-time depth image refinement algorithm improves the subjective quality of the depth image and runs at 260 frames per second.
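As a rough illustration of the joint bilateral filter named above, the following sketch filters a depth map using spatial weights and range weights taken from a separate guide image (for example a grayscale color frame). It is a plain single-level CPU version; the hierarchical scheme and the CUDA memory layout of the paper are not reproduced, and the parameter names and default values are assumptions.

```python
import math


def joint_bilateral_filter(depth, guide, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Smooth `depth` while preserving edges that appear in `guide`.

    depth, guide: 2D lists of equal size.
    sigma_s: spatial Gaussian width (pixels); sigma_r: range Gaussian width
    (guide intensity units). Both defaults are illustrative assumptions.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    spatial = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
                    diff = guide[ny][nx] - guide[y][x]
                    similarity = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2))
                    weight = spatial * similarity
                    acc += weight * depth[ny][nx]
                    norm += weight
            # norm is at least 1.0 because the center pixel has weight 1.
            out[y][x] = acc / norm
    return out


# A depth step that coincides with an intensity step in the guide image.
guide = [[0, 0, 255, 255] for _ in range(4)]
depth = [[1.0, 1.0, 5.0, 5.0] for _ in range(4)]
refined = joint_bilateral_filter(depth, guide)
```

Each output pixel depends only on a fixed neighborhood of the inputs, so the two outer loops map naturally onto one GPU thread per pixel.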

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering / v.4 no.11 / pp.529-536 / 2015
  • When three-dimensional objects and photo images are rendered together, the non-photorealistic rendering (NPR) results are visually discordant because the two kinds of content have independent color distributions. This paper proposes a non-photorealistic rendering technique that renders both three-dimensional objects and photo images in styles such as cartoons and sketches. The proposed technique computes the color distribution of the photo images and reduces the number of colors in both the photo images and the 3D objects. NPR is then performed based on the reduced colormaps and edge features. For a more natural scene presentation, image region segmentation is preferred when extracting and applying colormaps. However, image segmentation requires many computational operations, so non-photorealistic rendering of large frames takes a long time. To speed up the time-consuming segmentation procedure, we use GPGPU for parallel computing on the GPU. As a result, we significantly improve the execution speed of the algorithm.

Reevaluating the overhead of data preparation for asymmetric multicore system on graphics processing

  • Pei, Songwen;Zhang, Junge;Jiang, Linhua;Kim, Myoung-Seo;Gaudiot, Jean-Luc
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.7 / pp.3231-3244 / 2016
  • As processor design has been transitioning from homogeneous to heterogeneous multicore processors, the traditional Amdahl's law cannot meet the new challenges posed by asymmetric multicore systems. To further investigate the factors affecting the Overhead of Data Preparation (ODP) in asymmetric multicore systems, we evaluate an asymmetric multicore system built from a CPU and a GPU by measuring the overheads of memory transfer, kernel computation, cache misses, and synchronization. This paper demonstrates that decreasing the overhead of data preparation is a promising approach to improving the overall performance of heterogeneous systems.
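To make the role of data-preparation overhead concrete, the sketch below compares the classical Amdahl speedup with a simple variant that adds an overhead term. The extended formula is an illustrative assumption (overhead expressed as a fraction of the original serial runtime); the abstract does not state the paper's own ODP model.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Classical Amdahl's law: speedup of a workload on n_cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


def speedup_with_overhead(parallel_fraction, n_cores, overhead_fraction):
    """Amdahl's law plus a data-preparation term.

    overhead_fraction: time spent on transfers, synchronization, and similar
    preparation, as a fraction of the original serial runtime (an assumption
    made for this sketch, not the paper's model).
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores + overhead_fraction)


ideal = amdahl_speedup(0.9, 10)
with_odp = speedup_with_overhead(0.9, 10, 0.05)
```

Under these assumed numbers the overhead term sits in the denominator alongside the serial fraction, so shrinking it raises the achievable speedup in the same way as shrinking the serial part of the program.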

GPU-based Monte Carlo Photon Migration Algorithm with Path-partition Load Balancing

  • Jeon, Youngjin;Park, Jongha;Hahn, Joonku;Kim, Hwi
    • Current Optics and Photonics / v.5 no.6 / pp.617-626 / 2021
  • A parallel Monte Carlo photon migration algorithm for graphics processing units that implements an improved load-balancing strategy is presented. Conventional parallel Monte Carlo photon migration algorithms suffer from a computational bottleneck because they rely on a simple load-balancing strategy that does not take into account the differing mean free path lengths of the photons. In this paper, path-partition load balancing is proposed to eliminate this computational bottleneck, based on a mathematical formula that parallelizes the photon path tracing process, which has previously been considered non-parallelizable. The performance of the proposed algorithm is tested using three-dimensional photon migration simulations of a human skin model.
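The load imbalance described above comes from the randomness of photon paths. The following sketch shows the standard free-path sampling rule, s = -ln(ξ)/μ_t, and counts how many steps each photon needs to cross a slab. The slab geometry, the forward-only motion, and the function names are simplifying assumptions for illustration; the paper's path-partition formula is not given in the abstract and is not reproduced here.

```python
import math
import random


def sample_free_path(mu_t, rng):
    """Sample a free path length from an exponential distribution.

    mu_t: total interaction coefficient (1 / length). The mean is 1 / mu_t.
    """
    xi = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return -math.log(xi) / mu_t


def steps_to_cross(thickness, mu_t, rng):
    """Count steps a photon takes to cross a slab.

    Simplification (assumption): the photon always moves straight forward.
    """
    travelled = 0.0
    steps = 0
    while travelled < thickness:
        travelled += sample_free_path(mu_t, rng)
        steps += 1
    return steps


rng = random.Random(0)
mean_path = sum(sample_free_path(10.0, rng) for _ in range(100_000)) / 100_000
step_counts = [steps_to_cross(1.0, 10.0, rng) for _ in range(1000)]
```

If each GPU thread traces one photon, threads whose photons need few steps finish early and idle while the others continue, which is the kind of bottleneck that a load-balancing strategy aims to remove.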

Face Detection using Skin Color Information and Parallel Processing Method on Multi-Core (멀티코어에서 피부색상 정보와 병렬처리 방법을 이용한 얼굴 검출)

  • Kim, Hong-Hee;Lee, Jae-Heung
    • Annual Conference of KIPS / 2012.11a / pp.219-222 / 2012
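  • Recent research on face detection ranges widely, from hardware design on FPGAs to software design that runs efficiently on DSPs, GPUs, and ARM cores. In this study, we propose a face detection method that is effective on multi-core processors. Face candidates are extracted using skin color, and the remaining background is discarded to speed up computation. The face detection algorithm proposed by Viola and Jones was parallelized using POSIX threads, and its performance was measured on single-core and multi-core systems. There was no performance improvement on a single core, but on a multi-core system the speed improved by a factor of about 1.8, and the detection success rate was unchanged.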

OpenCL-based Efficient Parallel Processing in a Heterogeneous Computing Environment (이기종 컴퓨팅 환경에서 OpenCL을 이용한 효율적인 병렬처리)

  • Kim, Heegon;Lee, Sungju;Chung, Yongwha;Park, Daihee
    • Annual Conference of KIPS / 2013.11a / pp.111-114 / 2013
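  • As the use of accelerators such as GPUs has increased in high-performance computing and mobile computing, various accelerator-based parallel processing methods have been introduced. However, for users who are new to accelerators or have little experience with accelerator-based parallel processing, achieving effective parallel processing with them is not easy. In this paper, we propose a parallel processing method that uses an accelerator and a microprocessor simultaneously and is more efficient than parallel processing with the accelerator alone, and we compare the performance of the accelerator-only approach with that of the proposed method. Experimental results confirm that the proposed method achieves about a 40-fold performance improvement over sequential processing and a 25% improvement over the accelerator-only parallel processing method.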
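One simple way to use a CPU and an accelerator at the same time is to split the work in proportion to each device's throughput so that both finish together. The sketch below shows that static split. The abstract does not say how the paper partitions work, so this is an assumed strategy, and the throughput figures are hypothetical.

```python
def split_work(n_items, cpu_rate, gpu_rate):
    """Divide n_items between CPU and GPU in proportion to throughput.

    cpu_rate, gpu_rate: items processed per unit time (hypothetical values
    that would normally come from a short calibration run).
    Returns (n_cpu, n_gpu).
    """
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    n_gpu = round(n_items * gpu_share)
    return n_items - n_gpu, n_gpu


def estimated_time(n_cpu, n_gpu, cpu_rate, gpu_rate):
    """Both devices run concurrently, so the slower one sets the total time."""
    return max(n_cpu / cpu_rate, n_gpu / gpu_rate)


n_cpu, n_gpu = split_work(1000, cpu_rate=1.0, gpu_rate=3.0)
hybrid_time = estimated_time(n_cpu, n_gpu, 1.0, 3.0)
gpu_only_time = estimated_time(0, 1000, 1.0, 3.0)
```

In practice the split would also have to account for transfer costs and kernel launch overhead, which this sketch ignores.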

FLUID SIMULATION METHODS FOR COMPUTER GRAPHICS SPECIAL EFFECTS (컴퓨터 그래픽스 특수효과를 위한 유체시뮬레이션 기법들)

  • Jung, Moon-Ryul
    • 한국전산유체공학회:학술대회논문집 / 2009.11a / pp.1-1 / 2009
  • In this presentation, I discuss various fluid simulation methods that have been developed for computer graphics special effects since 1996. They are all based on CFD but sacrifice physical realism for visual plausibility and computation time. However, as computer speed increases rapidly and the capability of the GPU (graphics processing unit) improves, methods offering more physical realism have been tried. In this talk, I will focus on four aspects of fluid simulation methods for computer graphics: (1) particle level-set methods, (2) particle-based simulation, (3) methods for exact satisfaction of the incompressibility constraint, and (4) GPU-based simulation. (1) Particle level-set methods evolve the fluid surface by means of the zero-level set and a band of massless marker particles on both sides of it. The evolution of the zero-level set captures the surface approximately, the evolution of the marker particles captures its fine details, and the zero-level set is corrected based on the particle positions at each step. (2) Recently, the particle-based Lagrangian approach to fluid simulation has gained some popularity, because it automatically respects mass conservation and the difficulty of tracking the surface geometry has been somewhat addressed. (3) Until recently, fluid simulation algorithms were dominated by approximate fractional step methods. They split the Navier-Stokes equations into two steps: the first solves the equations without considering the incompressibility constraint, and the second finds the pressure that satisfies the constraint. In this approach, the first step inevitably introduces error, producing numerical diffusion in the solution. Recently, however, exact fractional step methods without this error have been developed (by fluid mechanics scholars), and another method has been introduced that satisfies the incompressibility constraint by formulating the fluid in terms of the vorticity field rather than the velocity field (by computer graphics scholars). (4) Finally, I mention GPU implementations of fluid simulation, which take advantage of the fact that discrete fluid equations can be solved in parallel.
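For readers unfamiliar with the approximate fractional step methods mentioned in item (3), the classical projection scheme can be written as follows. The notation (velocity u, pressure p, density ρ, kinematic viscosity ν, body force f, time step Δt) is assumed here and does not come from the talk.

```latex
% Step 1: advance the velocity while ignoring incompressibility
\mathbf{u}^{*} = \mathbf{u}^{n} + \Delta t \left[ -(\mathbf{u}^{n} \cdot \nabla)\mathbf{u}^{n}
                 + \nu \nabla^{2} \mathbf{u}^{n} + \mathbf{f} \right]

% Step 2a: find the pressure that enforces \nabla \cdot \mathbf{u}^{n+1} = 0
\nabla^{2} p^{n+1} = \frac{\rho}{\Delta t} \, \nabla \cdot \mathbf{u}^{*}

% Step 2b: project the intermediate velocity onto the divergence-free field
\mathbf{u}^{n+1} = \mathbf{u}^{*} - \frac{\Delta t}{\rho} \, \nabla p^{n+1}
```

Taking the divergence of the last equation and substituting the pressure equation gives a divergence-free velocity at step n+1, which is how the second step satisfies the incompressibility constraint.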

Acceleration of Anisotropic Elastic Reverse-time Migration with GPUs (GPU를 이용한 이방성 탄성 거꿀 참반사 보정의 계산가속)

  • Choi, Hyungwook;Seol, Soon Jee;Byun, Joongmoo
    • Geophysics and Geophysical Exploration / v.18 no.2 / pp.74-84 / 2015
  • To obtain physically meaningful images through elastic reverse-time migration, wavefield separation, which extracts P- and S-waves from vector wavefields reconstructed with the elastic wave equation, is a prerequisite. To extend elastic reverse-time migration to anisotropic media, both an anisotropic modelling algorithm and anisotropic wavefield separation are essential. Anisotropic wavefield separation, which uses pseudo-derivative filters determined by the vertical velocities and anisotropic parameters of the elastic medium, differs from the Helmholtz decomposition conventionally used for isotropic wavefield separation. Because applying these pseudo-derivative filters is computationally expensive, we have developed an efficient anisotropic wavefield separation algorithm capable of parallel computing on GPUs (Graphics Processing Units). In addition, we have developed a highly efficient anisotropic elastic reverse-time migration algorithm that uses MPI (Message-Passing Interface) and incorporates the GPU-based anisotropic wavefield separation algorithm. To verify the efficiency and validity of the developed anisotropic elastic reverse-time migration algorithm, a VTI elastic model based on Marmousi-II was built, and a synthetic multicomponent seismic data set was created using this model. The computational speed of migration was dramatically enhanced by using GPUs and MPI, and the accuracy of the image was also improved by adopting anisotropic wavefield separation.
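The contrast between isotropic and anisotropic separation can be summarized in a short worked form. The isotropic case below is the standard Helmholtz decomposition. The anisotropic wavenumber-domain expression is a commonly used formulation stated here as an assumption, since the abstract does not give the paper's filter definition, and the symbols (displacement u, wavenumber k, polarization vector a_P) are chosen for this illustration.

```latex
% Isotropic media: Helmholtz decomposition of the displacement field
\mathbf{u} = \nabla \phi + \nabla \times \boldsymbol{\psi}, \qquad
P = \nabla \cdot \mathbf{u}, \qquad
\mathbf{S} = \nabla \times \mathbf{u}

% Equivalent wavenumber-domain form of the P-wave operator
\tilde{P}(\mathbf{k}) = i \, \mathbf{k} \cdot \tilde{\mathbf{U}}(\mathbf{k})

% Anisotropic media (assumed formulation): replace k by the qP polarization
% vector a_P(k), which depends on the local velocities and anisotropy parameters
\widetilde{qP}(\mathbf{k}) = i \, \mathbf{a}_{P}(\mathbf{k}) \cdot \tilde{\mathbf{U}}(\mathbf{k})
```

In the space domain, the anisotropic operator becomes a spatially varying convolution filter rather than a plain derivative stencil, which helps explain why the separation is costly and why it benefits from GPU parallelism.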

3D Inspection by Registration of CT and Dual X-ray Images

  • Kim, Youngjun;Kim, Wontae;Lee, Deukhee
    • Journal of International Society for Simulation Surgery / v.3 no.1 / pp.16-21 / 2016
  • Computed tomography (CT) can completely digitize the interior and exterior of nearly any object without destroying it. Generally, the resolution of industrial CT is below a few microns. Industrial CT scanning, however, is limited by its long measurement and processing times, whereas 2D X-ray imaging is fast. In this paper, we propose a novel 3D non-destructive inspection technique that combines the advantages of micro-CT and dual X-ray images. After the master object’s CT data are registered with the sample objects’ dual X-ray images, 3D non-destructive inspection is performed by analyzing the matching results. The registration computation is accelerated by parallel computing on a graphics processing unit (GPU).