• Title/Summary/Keyword: Graphics processing unit

Real-Time GPU Technique for Extracting Mesh Isosurfaces from BCC Volume Datasets (BCC 볼륨 데이터로부터 실시간으로 메시 형태의 등가면을 추출하는 GPU 기법)

  • Kim, Hyunjun;Kim, Minho
    • Journal of the Korea Computer Graphics Society / v.26 no.4 / pp.17-26 / 2020
  • We present a real-time GPU (graphics processing unit) marching tetrahedra technique that extracts isosurfaces in indexed mesh format from BCC (body-centered cubic) volume datasets. Compared to classical marching tetrahedra, our method shows better performance with little memory overhead. Our technique is composed of five stages. In the first stage, which needs to be done only once, we build min/max blocks that are used for empty-space skipping to boost performance. Next, we extract the active blocks, whose value range contains the current isovalue. In the next two stages, we extract the edges and cells that contain the isosurface, and the final triangular mesh is generated in the last stage. When applied to volume datasets of resolution $512^3$ or higher, our technique is up to 5 times faster than the classical marching tetrahedra algorithm.
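
The first two stages described above (min/max blocks, then active-block extraction) can be illustrated with a minimal CPU sketch. This is not the authors' GPU implementation: it assumes a plain Cartesian grid stored as nested lists rather than a BCC lattice, and the names `build_minmax_blocks` and `active_blocks` are hypothetical.

```python
from itertools import product


def build_minmax_blocks(volume, block):
    """Precompute (min, max) for every block of `block`^3 cells (done once).

    Assumption: `volume[z][y][x]` is a Cartesian grid of scalar samples,
    used here only for brevity; the paper works on BCC lattices.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    blocks = {}
    # A cell spans samples i..i+1, so each block also reads the sample
    # on its far boundary (shared with the neighbouring block).
    for bz, by, bx in product(range(0, nz - 1, block),
                              range(0, ny - 1, block),
                              range(0, nx - 1, block)):
        values = [volume[z][y][x]
                  for z in range(bz, min(bz + block, nz - 1) + 1)
                  for y in range(by, min(by + block, ny - 1) + 1)
                  for x in range(bx, min(bx + block, nx - 1) + 1)]
        blocks[(bz // block, by // block, bx // block)] = (min(values), max(values))
    return blocks


def active_blocks(blocks, isovalue):
    """Return indices of blocks whose [min, max] range contains the isovalue.

    All other blocks cannot contain the isosurface and are skipped.
    """
    return sorted(key for key, (lo, hi) in blocks.items() if lo <= isovalue <= hi)
```

Because the min/max table depends only on the volume, it can be reused for every isovalue; only `active_blocks` needs to be re-run when the isovalue changes.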

Performance Evaluation and Verification of MMX-type Instructions on an Embedded Parallel Processor (임베디드 병렬 프로세서 상에서 MMX타입 명령어의 성능평가 및 검증)

  • Jung, Yong-Bum;Kim, Yong-Min;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.16 no.10 / pp.11-21 / 2011
  • This paper introduces a SIMD (single instruction, multiple data) based parallel processor that efficiently processes the massive data inherent in multimedia. It also implements MMX (MultiMedia eXtension)-type instructions on this data-parallel processor and evaluates their performance. The reference data-parallel processor consists of 16 processors, each with a 32-bit datapath. Experimental results for a JPEG compression application with a 1280x1024 pixel image indicate that MMX-type instructions achieve a 50% performance improvement over the baseline instructions on the same data-parallel architecture. In addition, MMX-type instructions achieve 100% and 51% improvements over the baseline instructions in energy efficiency and area efficiency, respectively. These results demonstrate that multimedia-specific instructions, including MMX-type instructions, have potential for widely used many-core GPUs (graphics processing units) and other types of parallel processors.

Fast Stereoscopic 3D Broadcasting System using x264 and GPU (x264와 GPU를 이용한 고속 양안식 3차원 방송 시스템)

  • Choi, Jung-Ah;Shin, In-Yong;Ho, Yo-Sung
    • Journal of Broadcast Engineering / v.15 no.4 / pp.540-546 / 2010
  • Because stereoscopic 3-dimensional (3D) video, which provides users with a realistic multimedia service, requires twice as much data as 2-dimensional (2D) video, it is difficult to build a fast system. In this paper, we propose a fast stereoscopic 3D broadcasting system based on depth information. Before transmission, we encode the input 2D+depth video using x264, a fast open-source H.264/AVC encoder, to reduce the size of the data. At the receiver, we decode the transmitted bitstream in real time using the Compute Unified Device Architecture (CUDA) video decoder API on an NVIDIA graphics processing unit (GPU). Then, we apply a fast view synthesis method that generates the virtual view on the GPU. The proposed system can display the output video on both 2DTV and 3DTV. Our experiments verified that the proposed system can deliver stereoscopic 3D content at up to 24 frames per second.

Image based Relighting Using HDRI Environment Map & Progressive refinement radiosity on GPU (HDRI 환경맵과 GPU 기반 점진적 세분 래디오시티를 이용한 영상기반 재조명)

  • Kim, Jun-Hwan;Hong, Hyun-Ki
    • Journal of Korea Game Society / v.7 no.4 / pp.53-62 / 2007
  • Although radiosity can represent diffuse reflections from object surfaces by modeling energy exchange in 3D space, its computational load restricts its use in real-time applications. GPU (graphics processing unit) based radiosity algorithms have therefore been actively proposed to improve rendering performance. We implement G. Coombe's progressive refinement radiosity on the GPU in a 3D scene constructed with an HDR (high dynamic range) radiance map. This radiosity method can generate a photorealistic rendering in 3D space in which synthetic objects are illuminated by environmental light sources. In the simulation results, rendering performance is analyzed with respect to the texel resolution of the environment map and mipmapping. In addition, we compare the rendering results of our method with those of incremental radiosity.

A design of GPU container co-execution framework measuring interference among applications (GPU 컨테이너 동시 실행에 따른 응용의 간섭 측정 프레임워크 설계)

  • Kim, Sejin;Kim, Yoonhee
    • KNOM Review / v.23 no.1 / pp.43-50 / 2020
  • As general-purpose computing on graphics processing units (GPGPU) has come to play an essential role in high-performance computing, several cloud service providers now offer GPU services. Most container-based cluster orchestration platforms in cloud environments allocate GPUs to jobs in whole-number units and do not allow a node to be shared with other jobs. In this case, the resource utilization of a GPU node may be low if a job does not intensively use either many cores or a large amount of GPU memory. GPU virtualization makes it possible to run kernels concurrently and share resources. However, performance may vary depending on the characteristics of the applications running concurrently and on the interference among them caused by resource contention on a node. This paper proposes a GPU container co-execution framework, based on the Kubernetes container orchestration platform, that creates and runs multiple servers in order to measure the interference that may occur when GPU resources are shared. Performance changes under different scheduling policies were investigated by executing several jobs on a GPU. The results show that optimal scheduling is not possible when only GPU memory and computing resource usage are considered. The interference caused by co-execution among applications is measured using the framework.

Implementation of $2{\times}2$ MIMO LTE Base Station using GPU for SDR System (GPU를 이용한 SDR 시스템 용 LTE MIMO 기지국 기능 구현)

  • Lee, Seung Hak;Kim, Kyung Hoon;Ahn, Chi Young;Choi, Seung Won
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.4 / pp.91-98 / 2012
  • This paper implements a $2{\times}2$ MIMO Long Term Evolution (LTE) base station using software-defined radio (SDR) technology. The implemented software-based base station processes baseband signals on a graphics processing unit (GPU) and uses a USRP2 as its RF transceiver. The GPU is a high-speed parallel processor that offers the important advantage of a powerful C-based programming environment, Compute Unified Device Architecture (CUDA). To guarantee real-time processing of LTE baseband signals, we implemented well-known signal processing algorithms, such as frame synchronization and ML detection, to run in parallel on the GPU.

Kinematic Wave Rainfall-Runoff Model Using CUDA FORTRAN (CUDA FORTRAN을 이용한 운동파 강우유출모형)

  • Kim, Boram;Kim, Dae-Hong
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.271-271 / 2018
  • Graphics processing units (GPUs), thanks to their large number of arithmetic logic units (ALUs) specialized for graphics processing and the associated instructions, can perform computations much faster than central processing units (CPUs). Recently, many numerical models implemented in FORTRAN have come to require more computation and longer computing times as realistic modeling methods have developed. In this study, a kinematic wave rainfall-runoff model based on general-purpose computing on graphics processing units (GPGPU) was implemented using CUDA (Compute Unified Device Architecture) FORTRAN. The results of the CUDA FORTRAN kinematic wave rainfall-runoff model were verified against those of a validated CPU-based kinematic wave rainfall-runoff model and showed good agreement. The CUDA FORTRAN model computed about 20 times faster than the CPU-based model. In addition, as the computational domain grew, the computational efficiency of the CUDA FORTRAN version improved relative to the CPU version.

Study about Low-Cost Autonomous Driving Simulator Framework Based on 3D LIDAR (3D LIDAR 를 기반으로 하는 저비용 자율 주행 시뮬레이터 프레임워크에 대한 연구)

  • O, Eun Taek;Cho, Min Woo;Gu, Bon Woo
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.702-704 / 2022
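  • Virtual environment simulation using game engines is being studied as an alternative to dedicated autonomous driving simulators. In a game engine, however, users must model the sensors required for autonomous driving themselves to match each device, which makes development costly. In particular, because ray-based 3D LIDAR is a GPU (graphics processing unit) intensive task, a low-cost simulator requires a low-cost 3D LIDAR simulation. In this paper, we propose a C++-based 3D LIDAR simulation framework with low computational requirements. The proposed 3D LIDAR was validated on an unpaved map consisting of many hills, and, as part of this validation, we also introduce a simple LPP (local path planning) generation method that uses the 3D LIDAR produced in this paper. We hope that the proposed 3D LIDAR framework will be actively used in autonomous driving applications that require low-cost real-time simulation.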

Parallel Range Query processing on R-tree with Graphics Processing Units (GPU를 이용한 R-tree에서의 범위 질의의 병렬 처리)

  • Yu, Bo-Seon;Kim, Hyun-Duk;Choi, Won-Ik;Kwon, Dong-Seop
    • Journal of Korea Multimedia Society / v.14 no.5 / pp.669-680 / 2011
  • R-trees are widely used in areas such as geographical information systems, CAD systems, and spatial databases to efficiently index multi-dimensional data. As the data sets used in these areas grow in size and complexity, however, range query operations on R-trees need to become faster still to meet area-specific constraints. To address this problem, various research efforts have developed strategies for accelerating query processing on R-trees, either by using a buffer mechanism or by parallelizing query processing across multiple disks and processors. As part of these strategies, approaches that parallelize query processing on R-trees using graphics processing units (GPUs) have been explored. Using GPUs may improve performance through faster calculations and fewer disk accesses, but it may also incur additional overhead from high memory access latencies and the low data exchange rate between GPUs and the CPU. In this paper, to address these overhead problems and use GPUs efficiently, we propose a novel approach that uses a GPU as a buffer to parallelize query processing on R-trees. The buffer algorithm improves performance by reducing the number of disk accesses and by maximizing coalesced memory accesses, which minimizes GPU memory access latency. In extensive performance studies, we observed that the proposed approach achieved up to 5 times higher query performance than the original CPU-based R-trees.
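
For readers unfamiliar with the operation being accelerated, the following is a minimal sketch of a plain range query on an R-tree, traversed level by level. It is not the buffer-based GPU method proposed in the paper; the `Node` layout and the function names are hypothetical, and rectangles are assumed to be `(xmin, ymin, xmax, ymax)` tuples.

```python
from dataclasses import dataclass, field


def overlaps(a, b):
    """True if axis-aligned rectangles a and b intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    mbr: tuple                                     # minimum bounding rectangle
    children: list = field(default_factory=list)   # child Nodes (internal node)
    entries: list = field(default_factory=list)    # (rect, object_id) pairs (leaf)


def range_query(root, query):
    """Return sorted ids of all objects whose rectangle intersects `query`."""
    results = []
    frontier = [root]
    while frontier:
        next_frontier = []
        # The overlap tests within one level are independent of each other,
        # which is the kind of work a parallel processor can do concurrently.
        for node in frontier:
            if not overlaps(node.mbr, query):
                continue  # prune the whole subtree
            results.extend(oid for rect, oid in node.entries if overlaps(rect, query))
            next_frontier.extend(node.children)
        frontier = next_frontier
    return sorted(results)
```

The level-by-level loop is chosen here only because it makes the independent overlap tests explicit; a recursive depth-first traversal returns the same result.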

Performance Study of Satellite Image Processing on Graphics Processors Unit Using CUDA

  • Jeong, In-Kyu;Hong, Min-Gee;Hahn, Kwang-Soo;Choi, Joonsoo;Kim, Choen
    • Korean Journal of Remote Sensing / v.28 no.6 / pp.683-691 / 2012
  • High-resolution satellite images are now widely used for a variety of mapping applications, including photogrammetry, GIS data acquisition, and visualization. As the spectral and spatial data size of satellite images increases, more processing power is needed to process them, and parallel systems are a solution to this problem. Parallel processing techniques for improving image processing performance have been developed alongside advances in computational power. However, conventional CPU-based parallel computing is often not fast enough to meet the computational demands of processing these images, and the GPU is a good candidate for meeting them. GPUs have recently been used for highly complex processing involving many loop operations, such as mathematical transforms and ray tracing. In this study, we propose a technique for parallel processing of high-resolution satellite images using a GPU. We implemented a spectral radiometric processing algorithm for Landsat-7 ETM+ imagery using CUDA, a parallel computing architecture developed by NVIDIA for GPUs. We also compare the performance of the algorithm on the GPU and the CPU.