• Title/Summary/Keyword: Parallel GPU

Real-time Depth Image Refinement using Hierarchical Joint Bilateral Filter (계층적 결합형 양방향 필터를 이용한 실시간 깊이 영상 보정 방법)

  • Shin, Dong-Won;Ho, Yo-Sung
    • Journal of Broadcast Engineering
    • /
    • v.19 no.2
    • /
    • pp.140-147
    • /
    • 2014
  • In this paper, we propose a method for real-time depth image refinement. In order to improve the quality of the depth map acquired from a Kinect camera, we employ constant memory and texture memory on the graphics processing unit (GPU), which are well suited to 2D image processing. In addition, we apply the joint bilateral filter (JBF) in parallel to accelerate the overall execution, and we apply the JBF hierarchically using the compute unified device architecture (CUDA) to further enhance the quality of the depth image. Finally, we obtain the refined depth image. Experimental results showed that the proposed real-time depth image refinement algorithm improved the subjective quality of the depth image while running at 260 frames per second.
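
  The paper itself gives no source code, so the following is only a minimal CUDA sketch of one parallel joint bilateral filter pass as described in the abstract: the depth map is smoothed under the guidance of the registered color image, with the spatial Gaussian weights held in constant memory. The kernel name, window radius, and parameters are illustrative assumptions, and the hierarchical (coarse-to-fine) repetition and the texture-memory reads mentioned in the abstract are omitted for brevity.

    // Minimal CUDA sketch of one joint bilateral filter (JBF) pass: a noisy depth
    // value is replaced by a weighted average of its neighbors, where the weights
    // combine a spatial Gaussian (constant memory) and a range Gaussian computed
    // on the guidance color image. Illustrative only, not the paper's code.
    #include <cuda_runtime.h>
    #include <math.h>

    #define RADIUS 3                                  // filter window radius (assumed)

    __constant__ float c_spatialWeight[2 * RADIUS + 1][2 * RADIUS + 1];

    __global__ void jointBilateralKernel(const float* depth, const uchar3* color,
                                         float* refined, int width, int height,
                                         float sigmaRange)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        uchar3 center = color[y * width + x];
        float sumWeight = 0.0f, sumDepth = 0.0f;

        for (int dy = -RADIUS; dy <= RADIUS; ++dy) {
            for (int dx = -RADIUS; dx <= RADIUS; ++dx) {
                int nx = min(max(x + dx, 0), width - 1);   // clamp at image borders
                int ny = min(max(y + dy, 0), height - 1);
                uchar3 c = color[ny * width + nx];
                float dr = (float)c.x - center.x;
                float dg = (float)c.y - center.y;
                float db = (float)c.z - center.z;
                // Range weight from the guidance image, spatial weight from constant memory.
                float w = expf(-(dr * dr + dg * dg + db * db)
                               / (2.0f * sigmaRange * sigmaRange))
                          * c_spatialWeight[dy + RADIUS][dx + RADIUS];
                sumWeight += w;
                sumDepth  += w * depth[ny * width + nx];
            }
        }
        refined[y * width + x] = sumDepth / sumWeight;
    }

  The host side would fill c_spatialWeight once with cudaMemcpyToSymbol and launch the kernel over a 2D grid of, say, 16x16 thread blocks; the hierarchical variant would presumably apply the same pass on a depth pyramid from coarse to fine.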

Fast Generation of Digital Video Holograms Using Multiple PCs (다수의 PC를 이용한 디지털 비디오 홀로그램의 고속 생성)

  • Park, Hanhoon;Kim, Changseob;Park, Jong-Il
    • Journal of Broadcast Engineering
    • /
    • v.22 no.4
    • /
    • pp.509-518
    • /
    • 2017
  • High-resolution digital holograms can be generated quickly by using a PC cluster that is based on a server-client architecture and is composed of several GPU-equipped PCs. However, the data transmission time between PCs becomes a major obstacle to the fast generation of video holograms because it increases linearly with the number of frames. To resolve this growth in data transmission time, this paper proposes a multi-threading-based method. Hologram generation in each client PC consists of three processes: acquisition of light sources, CGH computation on the GPUs, and transmission of the result to the server PC. Unlike the previous method, which executes these processes sequentially, the proposed method executes them in parallel by multi-threading and thus significantly reduces the proportion of the data transmission time in the total hologram generation time. Experiments confirmed that the total generation time of a high-resolution video hologram with 150 frames can be reduced by about 30%.
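
  As a rough, hedged illustration of the overlap described above (not the authors' implementation), the host-side C++ sketch below pipelines the three client-side stages so that the network transmission of one frame hides behind the GPU computation of the next. loadLightSources, computeCGH, and sendToServer are placeholder stubs.

    // Sketch of overlapping CGH computation with result transmission: while the
    // GPU computes frame f, a worker thread is still sending frame f-1 to the
    // server, so the transmission time largely disappears from the total.
    #include <thread>
    #include <vector>

    // Placeholder stages; real versions would read 3D point light sources, run
    // the CGH kernels on the GPU, and push the hologram to the server PC.
    std::vector<float> loadLightSources(int frame) { return std::vector<float>(1024, float(frame)); }
    std::vector<float> computeCGH(const std::vector<float>& pts) { return pts; }
    void sendToServer(const std::vector<float>&) { /* network send */ }

    void runClient(int numFrames)
    {
        std::thread sender;                        // transmits the previous frame
        for (int f = 0; f < numFrames; ++f) {
            auto points   = loadLightSources(f);   // stage 1: light source acquisition
            auto hologram = computeCGH(points);    // stage 2: CGH operation on the GPU

            if (sender.joinable()) sender.join();  // wait for the previous transmission
            sender = std::thread(sendToServer, std::move(hologram));  // stage 3, asynchronous
        }
        if (sender.joinable()) sender.join();      // flush the last frame
    }

    int main() { runClient(150); }                 // 150 frames, as in the experiment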

An Algorithm for Finding Surface Atoms of a Protein Molecule Based on Voxel Map Representation (복셀 맵을 이용한 단백질 표면 원자의 발견 알고리즘)

  • Kim, Byung-Joo;Kim, Ku-Jin;Seong, Joon-Kyung
    • The KIPS Transactions:PartA
    • /
    • v.19A no.2
    • /
    • pp.73-76
    • /
    • 2012
  • In this paper, we propose an efficient method to extract surface atoms from a protein molecule. Surface atoms are defined as the set of atoms that can contact a given probe solvent $P$ without $P$ colliding with the molecule. The atoms contained in the molecule are represented as spheres with van der Waals radii, and the probe solvent is also represented as a sphere. We extract the surface atoms by computing the offset surface of the molecule with respect to the radius of $P$. For efficient computation of the offset surface of the molecule, a voxel map is constructed for the offset surfaces of the spheres. Based on GPU (graphics processing unit) acceleration, a data-parallel algorithm extracts the surface atoms in 42.87 milliseconds for molecules containing up to 6,412 atoms.
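
  The abstract only outlines the algorithm, so the two CUDA kernels below are a hedged sketch of the general idea rather than the paper's implementation: the first rasterizes the union of offset spheres (van der Waals radius plus probe radius) into a boolean voxel map, and the second marks an atom as a surface atom if a probe-center sample just outside its own offset sphere lands in a free voxel. Grid resolution, sample count, and all names are assumptions, and the brute-force occupancy test stands in for whatever spatial acceleration the paper uses.

    // Illustrative CUDA sketch of the voxel-map idea, not the paper's code.
    #include <cuda_runtime.h>
    #include <math.h>

    struct Atom { float x, y, z, r; };                  // center and van der Waals radius

    __global__ void rasterizeOffsetSpheres(const Atom* atoms, int numAtoms,
                                           unsigned char* occupied, int3 dim,
                                           float3 origin, float voxel, float probeR)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        if (idx >= dim.x * dim.y * dim.z) return;
        int vx = idx % dim.x;
        int vy = (idx / dim.x) % dim.y;
        int vz = idx / (dim.x * dim.y);
        float px = origin.x + (vx + 0.5f) * voxel;      // voxel center
        float py = origin.y + (vy + 0.5f) * voxel;
        float pz = origin.z + (vz + 0.5f) * voxel;
        unsigned char occ = 0;
        for (int a = 0; a < numAtoms && !occ; ++a) {    // brute force for clarity
            float dx = px - atoms[a].x, dy = py - atoms[a].y, dz = pz - atoms[a].z;
            float R = atoms[a].r + probeR;              // offset-sphere radius
            if (dx * dx + dy * dy + dz * dz < R * R) occ = 1;
        }
        occupied[idx] = occ;
    }

    __global__ void markSurfaceAtoms(const Atom* atoms, int numAtoms,
                                     const unsigned char* occupied, int3 dim,
                                     float3 origin, float voxel, float probeR,
                                     unsigned char* isSurface)
    {
        int a = blockIdx.x * blockDim.x + threadIdx.x;
        if (a >= numAtoms) return;
        const int N = 64;                               // probe-center samples per atom (assumed)
        unsigned char surface = 0;
        for (int s = 0; s < N && !surface; ++s) {
            // spiral sampling of directions around the atom
            float t = (s + 0.5f) / N, phi = 2.3999632f * s;     // golden angle
            float cz = 1.0f - 2.0f * t, sz = sqrtf(1.0f - cz * cz);
            float R = atoms[a].r + probeR + 0.5f * voxel;       // just outside the offset sphere
            float px = atoms[a].x + R * sz * cosf(phi);
            float py = atoms[a].y + R * sz * sinf(phi);
            float pz = atoms[a].z + R * cz;
            int vx = (int)((px - origin.x) / voxel);
            int vy = (int)((py - origin.y) / voxel);
            int vz = (int)((pz - origin.z) / voxel);
            if (vx < 0 || vy < 0 || vz < 0 || vx >= dim.x || vy >= dim.y || vz >= dim.z)
                surface = 1;                            // outside the grid: free space
            else if (!occupied[(vz * dim.y + vy) * dim.x + vx])
                surface = 1;                            // a probe fits here: surface atom
        }
        isSurface[a] = surface;
    }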

Real-time Virtual View Synthesis using Virtual Viewpoint Disparity Estimation and Convergence Check (가상 변이맵 탐색과 수렴 조건 판단을 이용한 실시간 가상시점 생성 방법)

  • Shin, In-Yong;Ho, Yo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.1A
    • /
    • pp.57-63
    • /
    • 2012
  • In this paper, we propose a real-time view interpolation method using virtual viewpoint disparity estimation and a convergence check. For real-time processing, we estimate a disparity map at the virtual viewpoint from stereo images using the belief propagation method. This approach needs only one disparity map, compared to conventional methods that need two. In the view synthesis part, we warp pixels from the reference images to the virtual viewpoint image using the disparity map at the virtual viewpoint. For real-time acceleration, we utilize high-speed GPU parallel programming with CUDA. As a result, we can interpolate virtual viewpoint images in real time.
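
  As a hedged illustration of the synthesis step (not the authors' code), the CUDA kernel below gathers, for every virtual-view pixel, the corresponding pixels of the left and right reference images using the disparity estimated at the virtual viewpoint, and blends them. The half-disparity offsets assume the virtual view lies midway between the references; hole handling and occlusion reasoning are omitted.

    // Minimal CUDA sketch: synthesize the virtual view from two references using
    // a disparity map defined at the virtual viewpoint itself.
    #include <cuda_runtime.h>

    __global__ void synthesizeVirtualView(const uchar3* leftImg, const uchar3* rightImg,
                                          const float* virtualDisparity,
                                          uchar3* virtualImg, int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        float d  = virtualDisparity[y * width + x];
        int   xl = min(max((int)(x + 0.5f * d + 0.5f), 0), width - 1);  // sample in left view
        int   xr = min(max((int)(x - 0.5f * d + 0.5f), 0), width - 1);  // sample in right view

        uchar3 cl = leftImg[y * width + xl];
        uchar3 cr = rightImg[y * width + xr];
        // Simple 50/50 blend of the two warped references.
        virtualImg[y * width + x] = make_uchar3((cl.x + cr.x) / 2,
                                                (cl.y + cr.y) / 2,
                                                (cl.z + cr.z) / 2);
    }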

Non-Photorealistic Rendering Using CUDA-Based Image Segmentation (CUDA 기반 영상 분할을 사용한 비사실적 렌더링)

  • Yoon, Hyun-Cheol;Park, Jong-Seung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.11
    • /
    • pp.529-536
    • /
    • 2015
  • When three-dimensional objects and photo images are rendered together, non-photorealistic rendering (NPR) results show visual discord because the two types of content have independent color distributions. This paper proposes an NPR technique that renders both three-dimensional objects and photo images in styles such as cartoon and sketch. The proposed technique computes the color distribution of the photo images and reduces the number of colors of both the photo images and the 3D objects. NPR is then performed based on the reduced colormaps and edge features. To enhance the natural scene presentation, an image region segmentation step is preferred when extracting and applying the colormaps. However, image segmentation requires a large amount of computation and makes non-photorealistic rendering of large frames slow. To speed up the time-consuming segmentation procedure, we use general-purpose GPU (GPGPU) parallel computing. As a result, we significantly improve the execution speed of the algorithm.
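
  The abstract does not specify the color-reduction procedure in detail, so the kernel below is only a hedged sketch of such a step: each pixel is replaced by the nearest entry of a small colormap kept in constant memory. The per-region colormaps produced by the CUDA-based segmentation, and the edge extraction, are outside this sketch; MAX_COLORS and the kernel name are illustrative.

    // Hedged CUDA sketch of color reduction against a small palette.
    #include <cuda_runtime.h>

    #define MAX_COLORS 16                              // reduced palette size (assumed)
    __constant__ uchar3 c_colormap[MAX_COLORS];

    __global__ void quantizeColors(const uchar3* src, uchar3* dst,
                                   int numPixels, int numColors)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numPixels) return;

        uchar3 p = src[i];
        int best = 0;
        int bestDist = 1 << 30;
        for (int k = 0; k < numColors; ++k) {          // nearest palette color
            int dr = (int)p.x - c_colormap[k].x;
            int dg = (int)p.y - c_colormap[k].y;
            int db = (int)p.z - c_colormap[k].z;
            int dist = dr * dr + dg * dg + db * db;
            if (dist < bestDist) { bestDist = dist; best = k; }
        }
        dst[i] = c_colormap[best];                     // write the reduced color
    }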

Fast Multi-View Synthesis Using Duplex Forward Mapping and Parallel Processing (순차적 이중 전방 사상의 병렬 처리를 통한 다중 시점 고속 영상 합성)

  • Choi, Ji-Youn;Ryu, Sae-Woon;Shin, Hong-Chang;Park, Jong-Il
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.34 no.11B
    • /
    • pp.1303-1310
    • /
    • 2009
  • Glassless 3D displays require multiple images of a scene taken from different viewpoints. The simplest way to obtain multi-view images is to use as many cameras as there are views, but synchronizing the cameras and computing and transmitting the large amount of data become critical problems. Thus, generating such a large number of viewpoint images efficiently is emerging as a key technique in 3D video technology. Image-based view synthesis is an algorithm for generating various virtual viewpoint images from a limited number of views and depth maps. In this paper, because a virtual view image can be expressed as a transformation of a real view under a depth constraint, we propose an algorithm that synthesizes multiple views from two reference view images and their depth maps by stepwise duplex forward mapping. In addition, because the geometrical relationship between the real views and the virtual views is repetitive, we implement the algorithm in the OpenGL Shading Language, which runs on the programmable graphics processing unit and allows parallel processing, to improve computation time. We demonstrate the effectiveness of our algorithm for fast view synthesis through a variety of experiments with real data.
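
  The paper implements the mapping in GLSL shaders; purely as a hedged illustration of one forward-mapping pass, the two CUDA kernels below scatter reference pixels into the virtual view using their disparity, resolve collisions with a depth test (atomicMin on a z-buffer cleared to INT_MAX beforehand), and then copy the color of the winning pixel. The alpha parameter is the assumed position of the virtual view between the references; the duplex aspect and hole filling from the second reference view are not shown.

    // Illustrative CUDA sketch of a single forward-mapping pass with a z-test.
    #include <cuda_runtime.h>

    __global__ void forwardMapDepth(const float* disparity, int* zbuffer,
                                    int width, int height, float alpha)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        float d  = disparity[y * width + x];
        int   tx = (int)(x - alpha * d + 0.5f);        // target column in the virtual view
        if (tx < 0 || tx >= width) return;

        // Larger disparity = closer surface; store the negated, scaled disparity
        // so that atomicMin keeps the closest contribution.
        int key = -(int)(d * 1024.0f);
        atomicMin(&zbuffer[y * width + tx], key);
    }

    __global__ void forwardMapColor(const uchar3* srcImg, const float* disparity,
                                    const int* zbuffer, uchar3* dstImg,
                                    int width, int height, float alpha)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        float d  = disparity[y * width + x];
        int   tx = (int)(x - alpha * d + 0.5f);
        if (tx < 0 || tx >= width) return;

        int key = -(int)(d * 1024.0f);
        if (zbuffer[y * width + tx] == key)            // this pixel won the depth test
            dstImg[y * width + tx] = srcImg[y * width + x];
    }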

Real-time Style Transfer for Video (실시간 비디오 스타일 전이 기법에 관한 연구)

  • Seo, Sang Hyun
    • Smart Media Journal
    • /
    • v.5 no.4
    • /
    • pp.63-68
    • /
    • 2016
  • Texture transfer is a method that transfers the texture of an input image onto a target image and can also be used to transfer the artistic style of the input image. This study presents a real-time texture transfer method for generating artistic-style video. To enhance performance, this paper proposes a GPU-parallel framework for the T-shaped kernel used in general texture transfer. To accelerate the motion computation required for maintaining temporal coherence, a multi-scale motion field is also computed in parallel. Through these approaches, artistic texture transfer for video is achieved with real-time performance.
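
  The abstract gives no formulas for the motion computation, so the kernel below is only a generic block-matching sketch at a single pyramid level, not the paper's method: each 8x8 block of the current frame searches a small window in the previous frame for the minimum sum of absolute differences. The multi-scale scheme would run this coarse-to-fine, seeding each level with the coarser motion; block size, search radius, and names are assumptions.

    // Generic per-block motion estimation on the GPU (illustrative stand-in).
    #include <cuda_runtime.h>

    #define BLOCK   8                                  // block size in pixels (assumed)
    #define SEARCH  4                                  // search radius (assumed)

    __global__ void blockMatch(const unsigned char* prev, const unsigned char* curr,
                               int width, int height, short2* motion, int blocksX)
    {
        int bx = blockIdx.x * blockDim.x + threadIdx.x;  // block column index
        int by = blockIdx.y * blockDim.y + threadIdx.y;  // block row index
        if (bx * BLOCK + BLOCK > width || by * BLOCK + BLOCK > height) return;

        int x0 = bx * BLOCK, y0 = by * BLOCK;
        int bestSad = 1 << 30;
        short2 best = make_short2(0, 0);

        for (int dy = -SEARCH; dy <= SEARCH; ++dy) {
            for (int dx = -SEARCH; dx <= SEARCH; ++dx) {
                int sad = 0;
                for (int j = 0; j < BLOCK; ++j) {
                    for (int i = 0; i < BLOCK; ++i) {
                        int cx = x0 + i,      cy = y0 + j;
                        int px = x0 + i + dx, py = y0 + j + dy;
                        px = min(max(px, 0), width - 1);   // clamp at borders
                        py = min(max(py, 0), height - 1);
                        sad += abs((int)curr[cy * width + cx] - (int)prev[py * width + px]);
                    }
                }
                if (sad < bestSad) { bestSad = sad; best = make_short2(dx, dy); }
            }
        }
        motion[by * blocksX + bx] = best;               // motion vector for this block
    }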

A Polarization-based Frequency Scanning Interferometer and the Measurement Processing Acceleration based on Parallel Programing (편광 기반 주파수 스캐닝 간섭 시스템 및 병렬 프로그래밍 기반 측정 고속화)

  • Lee, Seung Hyun;Kim, Min Young
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.8
    • /
    • pp.253-263
    • /
    • 2013
  • Frequency scanning interferometry (FSI), one of the most promising optical surface measurement techniques, generally achieves superior optical performance compared with other 3-dimensional measurement methods because its hardware structure is fixed during operation and only the light frequency is scanned over a specific spectral band, without vertical scanning of the target surface or the objective lens. An FSI system collects a set of interference fringe images while changing the frequency of the light source. It then transforms the intensity data of the acquired images into frequency information and calculates the height profile of the target object through frequency analysis based on the fast Fourier transform (FFT). However, FSI still suffers from optical noise on target surfaces and from relatively long processing times due to the number of images acquired during the frequency scanning phase. In this paper, 1) a polarization-based frequency scanning interferometry (PFSI) system is proposed for robustness to optical noise. It consists of a tunable laser as the light source, a ${\lambda}/4$ plate in front of the reference mirror, a ${\lambda}/4$ plate in front of the target object, a polarizing beam splitter (PBS), a polarizer in front of the image sensor, a polarizer in front of the fiber-coupled light source, and a ${\lambda}/2$ plate between the PBS and the polarizer of the light source. With the proposed system, the polarization technique solves the problem of low-contrast fringe images, and the light distribution of the object beam and the reference beam can be controlled. 2) A signal processing acceleration method is proposed for the PFSI based on a parallel processing architecture consisting of parallel processing hardware and software, namely the graphics processing unit (GPU) and the compute unified device architecture (CUDA). As a result, the processing time reaches the tact-time level of real-time processing. Finally, the proposed system is evaluated in terms of accuracy and processing speed through a series of experiments, and the obtained results show the effectiveness of the proposed system and method.
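
  As a hedged sketch of the GPU-accelerated analysis stage (not the paper's code), the fragment below uses cuFFT to run one 1D FFT per pixel over its intensity sequence from the frequency scan and then picks the dominant non-DC bin, whose index is taken to be proportional to the surface height. The memory layout (each pixel's samples stored contiguously), the simple peak picking, and the linear bin-to-height factor are all assumptions.

    // Hedged sketch of per-pixel FFT analysis for an FSI/PFSI height map.
    #include <cuda_runtime.h>
    #include <cufft.h>

    __global__ void peakToHeight(const cufftComplex* spectra, float* height,
                                 int numPixels, int numBins, float heightPerBin)
    {
        int p = blockIdx.x * blockDim.x + threadIdx.x;
        if (p >= numPixels) return;

        const cufftComplex* s = spectra + p * numBins;
        int   bestBin = 1;
        float bestMag = 0.0f;
        for (int k = 1; k < numBins; ++k) {            // skip the DC bin
            float mag = s[k].x * s[k].x + s[k].y * s[k].y;
            if (mag > bestMag) { bestMag = mag; bestBin = k; }
        }
        height[p] = bestBin * heightPerBin;            // crude peak-to-height mapping
    }

    void computeHeightMap(float* d_intensity,          // numPixels * numFrames samples on the device
                          float* d_height, int numPixels, int numFrames,
                          float heightPerBin)
    {
        int numBins = numFrames / 2 + 1;               // R2C output size per pixel
        cufftComplex* d_spectra;
        cudaMalloc((void**)&d_spectra, sizeof(cufftComplex) * numPixels * numBins);

        cufftHandle plan;
        cufftPlan1d(&plan, numFrames, CUFFT_R2C, numPixels);   // one FFT per pixel
        cufftExecR2C(plan, d_intensity, d_spectra);

        int threads = 256, blocks = (numPixels + threads - 1) / threads;
        peakToHeight<<<blocks, threads>>>(d_spectra, d_height,
                                          numPixels, numBins, heightPerBin);

        cufftDestroy(plan);
        cudaFree(d_spectra);
    }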

Procedural Fluid Animation using Mirror Image Method

  • Park, Jin-Ho
    • International Journal of Contents
    • /
    • v.7 no.4
    • /
    • pp.1-5
    • /
    • 2011
  • Physics-based fluid animation schemes incur large computational cost due to their tremendous number of degrees of freedom. Many researchers have tried to reduce the cost of solving the large linear system involved in grid-based schemes; GPU-based algorithms and advanced numerical methods are used to solve the system efficiently. Other groups have studied local-operation methods such as SPH (Smoothed Particle Hydrodynamics) and LBM (Lattice Boltzmann Method) to improve efficiency. Our method examines this efficiency problem thoroughly and suggests a new paradigm in the fluid animation field. Rather than physics-based simulation, we propose a robust boundary handling technique for procedural fluid animation. Our method can be applied to arbitrarily shaped objects and potential fields. Since only local operations are involved, parallel computing can be implemented easily.
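
  The abstract gives no equations, so the kernel below is only a toy illustration of the general mirror-image idea for the simplest boundary, a flat wall at y = 0, and is not the paper's formulation for arbitrarily shaped objects: a 2D point source above the wall is paired with an image source reflected below it, so the normal velocity on the wall cancels and the flow does not penetrate the boundary.

    // Toy illustration of the mirror-image idea for a planar boundary.
    #include <cuda_runtime.h>
    #include <math.h>

    __device__ float2 sourceVelocity(float x, float y, float sx, float sy, float strength)
    {
        float dx = x - sx, dy = y - sy;
        float r2 = fmaxf(dx * dx + dy * dy, 1e-6f);    // avoid the singularity
        float c  = strength / (2.0f * 3.14159265f * r2);
        return make_float2(c * dx, c * dy);
    }

    __global__ void velocityWithWall(float2* vel, int nx, int ny, float h,
                                     float sx, float sy, float strength)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        int j = blockIdx.y * blockDim.y + threadIdx.y;
        if (i >= nx || j >= ny) return;

        float x = (i + 0.5f) * h, y = (j + 0.5f) * h;  // cell-center sample point
        float2 vReal  = sourceVelocity(x, y, sx,  sy, strength);  // real source
        float2 vImage = sourceVelocity(x, y, sx, -sy, strength);  // mirrored image source
        vel[j * nx + i] = make_float2(vReal.x + vImage.x,         // normal components cancel
                                      vReal.y + vImage.y);        // on the wall y = 0
    }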

CUDA-based Fast DRR Generation for Analysis of Medical Images (의료영상 분석을 위한 CUDA 기반의 고속 DRR 생성 기법)

  • Yang, Sang-Wook;Choi, Young;Koo, Seung-Bum
    • Korean Journal of Computational Design and Engineering
    • /
    • v.16 no.4
    • /
    • pp.285-291
    • /
    • 2011
  • Pose estimation from medical images calculates the locations and orientations of objects obtained from computed tomography (CT) volume data by utilizing X-ray images from two directions. In this process, digitally reconstructed radiograph (DRR) images of spatially transformed objects are generated and compared to the X-ray images repeatedly until reasonable transformation matrices of the objects are found. DRR generation and image comparison take the majority of the total time of this pose estimation. In this paper, a fast DRR generation technique based on GPU parallel computing is introduced. A volume ray-casting algorithm is explained with brief vector operations, and a parallelization of the algorithm using the compute unified device architecture (CUDA) is discussed. The paper also presents implementation results and time measurements compared to those of a pure-CPU implementation and an open-source toolkit.
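
  As a hedged illustration of the ray-casting core (not the paper's implementation), the CUDA kernel below marches one axis-aligned orthographic ray per pixel through the CT volume, accumulates the sampled values, and maps the integrated attenuation to a DRR intensity. A faithful version would use the actual X-ray projection geometry and trilinearly interpolated 3D texture fetches, both of which are simplified away here.

    // Hedged CUDA sketch of DRR generation by volume ray casting.
    #include <cuda_runtime.h>
    #include <math.h>

    __global__ void generateDRR(const float* volume, int3 dim,
                                float* drr, int imgW, int imgH, float step)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= imgW || y >= imgH) return;

        // Map the pixel to a column of the volume (orthographic, axis-aligned).
        float vx = (float)x * dim.x / imgW;
        float vy = (float)y * dim.y / imgH;

        float accum = 0.0f;
        for (float vz = 0.0f; vz < dim.z; vz += step) {   // march along the ray
            int ix = min((int)vx, dim.x - 1);
            int iy = min((int)vy, dim.y - 1);
            int iz = min((int)vz, dim.z - 1);
            accum += volume[(iz * dim.y + iy) * dim.x + ix];  // nearest-neighbor sample
        }
        // Beer-Lambert style mapping of the integrated attenuation to intensity
        // (the 0.001f scale is an arbitrary illustrative constant).
        drr[y * imgW + x] = expf(-accum * step * 0.001f);
    }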