• Title/Summary/Keyword: GPU Acceleration

Acceleration of Range Query in R-tree Using GPU Parallel Processing (GPU를 이용한 R-tree의 질의처리 병렬화)

  • Kim, Min-Cheol;Choi, Won-Ik
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.37-40 / 2011
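  • Hierarchical index structures are the most efficient index structures for processing range queries over large volumes of multidimensional data. To speed up range queries in hierarchical index structures, techniques have been proposed that reduce the overlapping regions between adjacent nodes that arise during index construction, as well as bulk-loading techniques that read a large amount of data at once and build the index bottom-up to increase its space utilization. However, in a hierarchical index structure that processes individual nodes sequentially on the CPU, increasing space utilization and reducing the overlap between nodes can improve query performance only to a limited extent. In this paper, we therefore modified the storage structure of the R-tree, a representative CPU-based hierarchical index structure, to suit GPU memory. We also improved performance by using the GPU to examine nodes in parallel, in place of the sequential node search of existing CPU-based hierarchical index structures. Although the degree of improvement varies with the size of the query region, this approach improved performance by up to more than 100 times.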
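
As an illustration of the parallel node check described in this abstract, the following is a minimal CPU-side Python sketch, not the authors' implementation. It assumes a hypothetical flat, array-based R-tree layout (`FlatRTree`, with the children of each node stored contiguously) and walks the tree level by level. The overlap tests within one level are independent of each other, which is the part a GPU could evaluate with one thread per node.

```python
from dataclasses import dataclass


@dataclass
class FlatRTree:
    # Hypothetical structure-of-arrays layout: entry i of every list describes node i.
    xmin: list
    ymin: list
    xmax: list
    ymax: list
    first_child: list  # index of the first child, or -1 for a leaf
    child_count: list  # number of consecutive children (0 for a leaf)


def overlaps(tree, i, query):
    """Return True if the bounding box of node i intersects the query rectangle."""
    qx0, qy0, qx1, qy1 = query
    return (tree.xmin[i] <= qx1 and tree.xmax[i] >= qx0 and
            tree.ymin[i] <= qy1 and tree.ymax[i] >= qy0)


def range_query(tree, query, root=0):
    """Level-by-level range query; returns the indices of the matching leaves."""
    frontier = [root]
    hits = []
    while frontier:
        # Each test below is independent of the others, so on a GPU this
        # list comprehension would become one thread per frontier node.
        flags = [overlaps(tree, i, query) for i in frontier]
        next_frontier = []
        for i, hit in zip(frontier, flags):
            if not hit:
                continue
            if tree.first_child[i] < 0:
                hits.append(i)
            else:
                start = tree.first_child[i]
                next_frontier.extend(range(start, start + tree.child_count[i]))
        frontier = next_frontier
    return hits


# Example tree: a root (0), two internal nodes (1, 2), and four leaves (3..6).
tree = FlatRTree(
    xmin=[0, 0, 6, 0, 0, 6, 6],
    ymin=[0, 0, 0, 0, 6, 0, 6],
    xmax=[10, 4, 10, 4, 4, 10, 10],
    ymax=[10, 10, 10, 4, 10, 4, 10],
    first_child=[1, 3, 5, -1, -1, -1, -1],
    child_count=[2, 2, 2, 0, 0, 0, 0],
)
print(range_query(tree, (3, 3, 7, 5)))
```

For the example tree, the query rectangle `(3, 3, 7, 5)` touches the two lower leaves, so the call returns `[3, 5]`.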

Interactive Hair Styling Interface (인터랙티브 헤어 스타일링 인터페이스)

  • Cho, Jung-Hyun;Ko, Hyeong-Seok
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.455-458 / 2009
  • The statistical wisp model for hairstyle generation was introduced in [1]. It provided a program to load human models, set parameters, generate wisps and strands, and create constraints. However, the program used hard-coded human models and prescribed constraints, so it was hard to switch to different models or manipulate the constraints. Hence, we provide a simple interface based on drawing maps and constraints. We also increase the computation speed by using GPU acceleration.

An Acceleration Technique of Terrain Rendering using GPU-based Chunk LOD (GPU 기반의 묶음 LOD 기법을 이용한 지형 렌더링의 가속화 기법)

  • Kim, Tae-Gwon;Lee, Eun-Seok;Shin, Byeong-Seok
    • Journal of Korea Multimedia Society / v.17 no.1 / pp.69-76 / 2014
  • It is hard to render massive terrain data in real time, even with recent graphics hardware. To process massive terrain data, a mesh simplification method such as continuous level-of-detail (LOD) is commonly used. However, existing GPU-based methods that use a quad-tree structure, such as geometry splitting, produce many vertices while traversing the quad-tree and retransmit those vertices to the GPU in each tree traversal. They also suffer from increased tree size, since they construct the tree structure using a texture. To solve these problems, we propose a GPU-based chunked LOD technique for real-time terrain rendering. We restrict the depth of the tree search and generate chunks with the tessellator on the GPU. With our method, we can render the terrain efficiently by generating the chunks on the GPU, and we reduce the computing time for tree traversal.

GPU-Based Acceleration of Quantum-Inspired Evolutionary Algorithm (GPU를 이용한 Quantum-Inspired Evolutionary Algorithm 가속)

  • Ryoo, Ji-Hyun;Park, Han-Min;Choi, Ki-Young
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.8 / pp.1-9 / 2012
  • Quantum-Inspired Evolutionary Algorithm (QEA) contains sufficient data-level parallelism to be naturally accelerated on GPUs. To reduce execution time efficiently, however, tasks must be mapped carefully to reflect the characteristics of the CPU and the GPU. Furthermore, when deciding which part of the application should run on the GPU, we need to consider the data transfer between the CPU and GPU memory spaces as well as the data-level parallelism. In addition, the use of zero-copy host memory, a proper choice of execution configuration, and a thread organization that takes memory coalescing into account are important for further reducing the execution time. With all these techniques, we could run QEA 3.69 times faster on average than a multi-threaded CPU implementation for the 0-1 knapsack problem with 30,000 items.

Min-Max Octree Generation Using CUDA (CUDA를 이용한 최대-최소 8진트리 생성 기법)

  • Lim, Jong-Hyeon;Shin, Byeong-Seok
    • Journal of Korea Game Society / v.9 no.6 / pp.191-196 / 2009
  • Volume rendering is a method that extracts meaningful information from volume data and visualizes that information. Because volume data keeps growing in size, it is very important to devise acceleration methods that achieve interactive rendering speed. A min-max octree is a data structure for high-speed volume rendering; however, its creation time becomes long as the data size increases. In this paper, we propose a method that accelerates min-max octree generation using CUDA. First, we convert the volume data into a one-dimensional array using a space-filling curve. Then we build the min-max octree structures from the sequential array and apply them to accelerate volume ray casting.
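
The two steps in this abstract can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the abstract does not say which space-filling curve is used, so the sketch assumes a Morton (Z-order) curve, and the function names `morton3`, `linearize`, and `build_min_max_octree` are hypothetical. With Morton ordering, the eight children of every octree node are contiguous in the one-dimensional array, so each level is a reduction over groups of eight, and each group could be handled by its own GPU thread.

```python
def morton3(x, y, z, bits):
    """Interleave the low `bits` bits of x, y, z into a Morton (Z-order) index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code


def linearize(volume, bits):
    """Copy a cubic volume[z][y][x] with side 2**bits into a Morton-ordered list."""
    n = 1 << bits
    flat = [0] * (n ** 3)
    for z in range(n):
        for y in range(n):
            for x in range(n):
                flat[morton3(x, y, z, bits)] = volume[z][y][x]
    return flat


def build_min_max_octree(flat):
    """Return a list of levels, finest first; each entry is a (min, max) pair."""
    levels = [[(v, v) for v in flat]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        level = []
        # Each group of 8 consecutive entries is independent of the others.
        for i in range(0, len(prev), 8):
            group = prev[i:i + 8]
            level.append((min(lo for lo, _ in group),
                          max(hi for _, hi in group)))
        levels.append(level)
    return levels


# Example: a 4x4x4 volume whose voxel value is x + 4*y + 16*z.
bits = 2
n = 1 << bits
volume = [[[x + 4 * y + 16 * z for x in range(n)] for y in range(n)]
          for z in range(n)]
levels = build_min_max_octree(linearize(volume, bits))
print(len(levels), levels[-1])
```

During ray casting, a node whose (min, max) range cannot contain the values of interest can be skipped as a whole, which is what makes the structure useful for acceleration.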

Acceleration of the Iterative Physical Optics Using Graphic Processing Unit (GPU를 이용한 반복적 물리 광학법의 가속화에 대한 연구)

  • Lee, Yong-Hee;Chin, Huicheol;Kim, Kyung-Tae
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.26 no.11 / pp.1012-1019 / 2015
  • This paper shows how to accelerate iterative physical optics (IPO) for radar cross section (RCS) analysis by combining two techniques. To analyze multiple reflections in a cavity, IPO uses the near-field method, unlike the shooting and bouncing rays method, which uses geometric optics (GO). However, IPO is still far slower than physical optics (PO), and it must be accelerated for practical use. To address this problem, a graphics processing unit (GPU) can be applied to reduce the calculation time, and the adaptive iterative physical optics-change rate (AIPO-CR) method can be applied to optimize the iterations and further accelerate the calculation.

Enhancement Techniques for GPU-Based Rendering of Participating Media (GPU 기반 반투과 매체 렌더링의 향상 기법)

  • Cha, Deuk-Hyun;Yi, Yong-Il;Ihm, In-Sung
    • Journal of KIISE:Computing Practices and Letters / v.16 no.12 / pp.1165-1176 / 2010
  • To realistically visualize participating media such as clouds, smoke, and gas, the light transport process must be physically simulated inside the media. While this process is known to be well described physically by the volume rendering equation, obtaining high-precision solutions usually takes a great deal of computation time. Recently, fast GPU-based rendering methods have been proposed for the realistic simulation of participating media; however, several problems remain to be resolved. In this article, we describe the rendering techniques we applied to enhance the performance and features of our GPU-assisted participating media renderer, and analyze how these efforts have actually improved the renderer. The presented techniques can be used effectively in volume renderers for creating various digital content in the special effects industry.

GPU-based Adaptive LOD control for Quadtree-Based Terrain Rendering (사진트리 기반 지형렌더링을 위한 GPU기반의 적응형 상세단계 조정 방법)

  • Choi, In-Ji;Shin, Byeong-Seok
    • Journal of Korea Game Society / v.8 no.3 / pp.61-68 / 2008
  • Quadtree-based terrain visualization methods have been used in many applications. However, because most of their procedures are performed on the CPU, rendering is slow compared with GPU-based methods. In this paper, we present a quadtree-based terrain visualization method that works on the GPU, using specially designed data structures (an error-texture and an LOD-texture) and a block-based acceleration method. In the preprocessing step, we calculate errors in world space and store them in the error-texture. In the rendering step, we examine the projected errors from the error-texture, choose the detail level, and then store the projected errors in the LOD-texture. View frustum culling is performed per block using the values of the error-texture and the LOD-texture. This method reduces the CPU load by performing time-consuming jobs such as LOD selection and view frustum culling on the GPU.
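
The LOD selection step can be illustrated with a short Python sketch. It is not the paper's method: the abstract gives no formulas, so the sketch assumes a standard perspective projection of a world-space error to pixels and a hypothetical pixel tolerance. The names `projected_error`, `select_lod`, and `tolerance_px` are introduced here for illustration only. Each terrain block is evaluated independently, which is why this step maps well to the GPU.

```python
import math


def projected_error(world_error, distance, fov_y, screen_height):
    """Approximate size in pixels of a world-space error seen at `distance`.

    Assumes a symmetric perspective projection with vertical field of
    view `fov_y` in radians.
    """
    pixels_per_unit = screen_height / (2.0 * math.tan(fov_y / 2.0))
    return world_error * pixels_per_unit / max(distance, 1e-6)


def select_lod(errors_by_level, distance, fov_y, screen_height, tolerance_px):
    """Pick the coarsest level whose projected error fits the tolerance.

    errors_by_level[0] belongs to the coarsest level and the last entry to
    the finest one, so world-space errors are expected to decrease.
    """
    for level, err in enumerate(errors_by_level):
        if projected_error(err, distance, fov_y, screen_height) <= tolerance_px:
            return level
    return len(errors_by_level) - 1


# Precomputed world-space errors for one block (stand-in for an error-texture row).
block_errors = [8.0, 4.0, 2.0, 1.0, 0.0]
for d in (100.0, 1000.0, 10000.0):
    print(d, select_lod(block_errors, d, math.pi / 2, 1000, 2.5))
```

Nearby blocks end up at a fine level and distant blocks at a coarse one; in the example, the block is assigned level 4 at distance 100, level 1 at distance 1000, and level 0 at distance 10000.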

An efficient acceleration algorithm of GPU ray tracing using CUDA (CUDA를 이용한 효과적인 GPU 광선추적 가속 알고리즘)

  • Ji, Joong-Hyun;Yun, Dong-Ho;Ko, Kwang-Hee
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.469-474 / 2009
  • This paper proposes a real-time ray tracing system that uses an optimized kd-tree traversal environment and ray/triangle intersection algorithm. Previous kd-tree traversal algorithms search for the upper nodes in a bottom-up manner. In that approach, after failing to find an intersected primitive in a leaf node, we need to revisit already visited parent nodes or use redundant memory. Ray tracing of relatively complex scenes thus becomes more difficult. The new algorithm uses stacks implemented in the GPU's local memory under the CUDA framework, which elegantly eliminates the problems of the previous algorithms. After traversing a node, we perform the latest CPU-based ray/triangle intersection algorithm, the Plücker coordinate test, which is further accelerated through massive parallelism thanks to CUDA. The Plücker test can drastically reduce the computational cost because it does not use barycentric coordinates but only a simple test based on the relations between a ray and the triangle edges. The entire system consists of a single ray kernel and is implemented without complicated synchronization or ray packets. Our experiments show that the new algorithm is roughly twice as fast as the previous one.
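
To make the edge-relation idea concrete, here is a minimal Python sketch of a Plücker coordinate test. It is a generic textbook form, not the paper's kernel, and the helper names (`plucker`, `side`, `line_hits_triangle`) are hypothetical. A line is stored as a direction plus a moment vector, and the sign of the permuted inner product tells on which side of an edge the ray's line passes. The sketch tests the infinite line through the ray and treats grazing an edge as a miss; a full ray tracer would additionally check the hit distance along the ray.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plucker(p, q):
    """Plücker coordinates (direction, moment) of the line directed from p to q."""
    d = sub(q, p)
    return d, cross(p, d)


def side(l1, l2):
    """Permuted inner product; its sign gives the relative orientation of two lines."""
    return dot(l1[0], l2[1]) + dot(l2[0], l1[1])


def line_hits_triangle(origin, direction, a, b, c):
    """True if the line through the ray passes strictly inside triangle abc."""
    ray = (direction, cross(origin, direction))
    s0 = side(ray, plucker(a, b))
    s1 = side(ray, plucker(b, c))
    s2 = side(ray, plucker(c, a))
    # The line crosses the triangle when it passes all three edges on the
    # same side; both signs are accepted so the winding order does not matter.
    return (s0 > 0 and s1 > 0 and s2 > 0) or (s0 < 0 and s1 < 0 and s2 < 0)


A, B, C = (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0)
print(line_hits_triangle((0.0, 0.0, -1.0), (0.0, 0.0, 1.0), A, B, C))
print(line_hits_triangle((5.0, 5.0, -1.0), (0.0, 0.0, 1.0), A, B, C))
```

The first ray passes through the middle of the triangle and the second passes well outside it, so the calls return `True` and `False` respectively.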

Digital Image based Real-time Sea Fog Removal Technique using GPU (GPU를 이용한 영상기반 고속 해무제거 기술)

  • Choi, Woon-sik;Lee, Yoon-hyuk;Seo, Young-ho;Choi, Hyun-jun
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.12 / pp.2355-2362 / 2016
  • Sea fog removal is an important issue in both computer vision and image processing. Sea fog or haze removal is widely used in many fields, such as automatic control systems, CCTV, and image recognition. Color image dehazing techniques have been studied extensively, and the dark channel prior (DCP) technique in particular has been widely used. This paper proposes a fast and efficient GPU-based method that uses an image prior, the dark channel prior, to remove sea fog from a single digital image. We implement a basic parallel program and then optimize it to obtain a speedup of more than 250 times. While parallelizing and optimizing the algorithm, we improve some parts of the original serial program or the basic parallel program according to the characteristics of each step. The proposed GPU programming algorithm and implementation results can be used to advantage as pre-processing in many systems, such as safe ship navigation, topographical surveying, and intelligent vehicles.
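
For readers unfamiliar with the dark channel prior, the following Python sketch shows the basic serial pipeline on a tiny image. It is a simplified illustration, not the paper's optimized implementation. The parameter values `omega=0.95` and `t0=0.1` are common defaults assumed here. The airlight estimate is reduced to a single pixel for brevity, and the function names are hypothetical. Every output pixel depends only on a small neighbourhood of the input, which is what makes each step a natural candidate for one GPU thread per pixel.

```python
def dark_channel(img, radius):
    """Minimum over colour channels and over a (2*radius+1)^2 window per pixel."""
    h, w = len(img), len(img[0])
    channel_min = [[min(px) for px in row] for row in img]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(
                channel_min[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return out


def estimate_airlight(img, dark):
    """Simplification: take the colour of the pixel with the largest dark channel."""
    h, w = len(img), len(img[0])
    y, x = max(((j, i) for j in range(h) for i in range(w)),
               key=lambda p: dark[p[0]][p[1]])
    return img[y][x]


def dehaze(img, radius=1, omega=0.95, t0=0.1):
    """Recover J = (I - A) / max(t, t0) + A with t = 1 - omega * dark(I / A)."""
    dark = dark_channel(img, radius)
    airlight = tuple(max(a, 1e-6) for a in estimate_airlight(img, dark))
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row]
                  for row in img]
    trans = [[1.0 - omega * d for d in row]
             for row in dark_channel(normalized, radius)]
    return [[tuple((c - a) / max(trans[y][x], t0) + a
                   for c, a in zip(px, airlight))
             for x, px in enumerate(row)]
            for y, row in enumerate(img)]


# 2x2 RGB image with values in [0, 1].
img = [[(0.9, 0.8, 0.7), (0.5, 0.6, 0.4)],
       [(0.3, 0.9, 0.9), (1.0, 1.0, 1.0)]]
print(dark_channel(img, 0))
print(dehaze(img, radius=0)[1][1])
```

The lower bound `t0` keeps the division stable where the estimated transmission is close to zero, such as in dense fog or sky regions.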