• Title/Summary/Keyword: graphics hardware


Hardware-based Visibility Preprocessing using a Point Sampling Method (점 샘플링 방법을 이용한 하드웨어 기반 가시성 전처리 알고리즘)

  • Kim, Jaeho;Wohn, Kwangyun
    • Journal of the Korea Computer Graphics Society / v.8 no.2 / pp.9-14 / 2002
  • In densely occluded urban scenes, visibility determination pays off because only a small part of the scene is visible from any given cell. In this paper, we introduce a new visibility preprocessing method that efficiently computes the potentially visible objects for volumetric cells. The proposed method handles general 3D polygonal models as well as invisible objects that are jointly blocked by multiple occluders. It decomposes volume visibility into a set of point visibilities and then computes each point visibility using hardware visibility queries, in particular HP_occlusion_test and NV_occlusion_query. We carry out experiments on various large-scale scenes and report the performance of our algorithm.

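The decomposition of volume visibility into point visibilities described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it works in 2D, uses axis-aligned boxes, and replaces the hardware occlusion query with a software segment test. The names `segment_hits_box`, `point_visible`, and `cell_pvs` and the sampling grid size `n` are assumptions made for illustration.

```python
from itertools import product

def segment_hits_box(p, q, box):
    """Slab test: does the segment p->q intersect the axis-aligned box?"""
    (xmin, ymin), (xmax, ymax) = box
    t0, t1 = 0.0, 1.0
    for a, d, lo, hi in ((p[0], q[0] - p[0], xmin, xmax),
                         (p[1], q[1] - p[1], ymin, ymax)):
        if d == 0.0:
            # Segment is parallel to this slab; reject if it lies outside.
            if a < lo or a > hi:
                return False
        else:
            ta, tb = (lo - a) / d, (hi - a) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True

def point_visible(viewpoint, target, occluders):
    """Software stand-in for a hardware occlusion query (assumption):
    the target counts as visible if any of its corners is unoccluded."""
    (xmin, ymin), (xmax, ymax) = target
    for corner in product((xmin, xmax), (ymin, ymax)):
        if not any(segment_hits_box(viewpoint, corner, o) for o in occluders):
            return True
    return False

def cell_pvs(cell, objects, occluders, n=3):
    """Potentially visible set of a cell: the union of point visibilities
    over an n x n grid of sample viewpoints inside the cell (n >= 2)."""
    (xmin, ymin), (xmax, ymax) = cell
    xs = [xmin + (xmax - xmin) * i / (n - 1) for i in range(n)]
    ys = [ymin + (ymax - ymin) * i / (n - 1) for i in range(n)]
    visible = set()
    for viewpoint in product(xs, ys):
        for name, box in objects.items():
            if name not in visible and point_visible(viewpoint, box, occluders):
                visible.add(name)
    return visible

cell = ((0.0, 0.0), (1.0, 1.0))
objects = {"seen": ((0.2, 3.0), (0.8, 4.0)), "hidden": ((5.0, 0.0), (6.0, 1.0))}
# Two walls that only jointly block the view toward "hidden".
walls = [((2.0, -5.0), (3.0, 0.5)), ((2.5, 0.5), (3.5, 5.0))]
pvs = cell_pvs(cell, objects, walls)
```

Because the result is a union over finitely many samples, a sketch like this can miss objects that are visible only from unsampled points; how densely to sample is a design choice the abstract does not specify.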

Voronoi Diagram Computation for a Molecule Using Graphics Hardware (그래픽 하드웨어를 이용한 분자용 보로노이 다이어그램 계산)

  • Lee, Jung-Eun;Baek, Nak-Hoon;Kim, Ku-Jin
    • The KIPS Transactions:PartA / v.19A no.4 / pp.169-174 / 2012
  • We present an algorithm that computes a three-dimensional Voronoi diagram for a protein molecule. The molecule is represented as a set of spheres with van der Waals radii. The Voronoi diagram is constructed in 3D space by finding the voxels that contain it. To make the computation feasible, we represent the molecule as a BVH (bounding volume hierarchy) and accelerate our system on modern graphics hardware with CUDA programming support. Experimental results show a 323-fold speedup in computation time over single-core CPU implementations when the space is partitioned into $2^{24}$ voxels.
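As a rough illustration of a voxel-based Voronoi diagram for spheres, the sketch below labels each voxel center with its nearest sphere, measuring distance to the sphere surface (center distance minus radius), and marks voxels whose face neighbor has a different owner. It is a brute-force CPU sketch under stated assumptions: it uses neither the BVH nor CUDA mentioned in the abstract, and the function names and grid parameters (`lo`, `hi`, `n`) are hypothetical.

```python
import math
from itertools import product

def nearest_sphere(p, spheres):
    """Index of the sphere whose surface is closest to point p.
    Each sphere is (center, radius); distance is |p - c| - r."""
    return min(range(len(spheres)),
               key=lambda i: math.dist(p, spheres[i][0]) - spheres[i][1])

def voronoi_voxels(spheres, lo, hi, n):
    """Label an n^3 grid over the cube [lo, hi]^3 and return
    (labels, boundary), where boundary holds the voxels that touch
    a voxel owned by a different sphere."""
    step = (hi - lo) / n
    def center(i):
        return lo + (i + 0.5) * step
    labels = {ijk: nearest_sphere(tuple(center(a) for a in ijk), spheres)
              for ijk in product(range(n), repeat=3)}
    boundary = set()
    for (i, j, k), owner in labels.items():
        for nb in ((i + 1, j, k), (i, j + 1, k), (i, j, k + 1)):
            if nb in labels and labels[nb] != owner:
                boundary.add((i, j, k))
                boundary.add(nb)
    return labels, boundary

# Two equal spheres: the Voronoi boundary is the plane x = 0.
spheres = [((-1.0, 0.0, 0.0), 0.5), ((1.0, 0.0, 0.0), 0.5)]
labels, boundary = voronoi_voxels(spheres, -2.0, 2.0, 4)
```

Each voxel is labeled independently of the others, which is the property that makes this kind of computation a natural fit for one-thread-per-voxel GPU execution.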

Accelerated Volume Rendering based on 3D Texture Mapping Hardware using Normal Blending (3D 텍스쳐 매핑 하드웨어 하에서 법선 벡터 블렌딩을 이용한 가속화된 볼륨 렌더링)

  • Yun, Seong-Ui;Sin, Yeong-Gil
    • Journal of KIISE:Computer Systems and Theory / v.28 no.4 / pp.181-187 / 2001
  • In this paper, we propose a direct volume rendering method that enables fast classification and shading using OpenGL on 3D texture mapping hardware. For classification, opacity values are obtained from the density values of the volume data through a lookup table. We also propose a normal blending method, which performs the shading computation only on the final image regardless of the volume size. When the entire volume rendering process presented in this paper is carried out on graphics hardware, the reduced complexity of the shading computation makes interactive volume rendering possible.


KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

  • Vo, Viet Tan;Kim, Cheol Hong
    • Journal of Information Processing Systems / v.17 no.6 / pp.1157-1169 / 2021
  • Modern graphics processing unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs continually exhibit hardware resource underutilization. In this paper, we show the need to switch between warp scheduling schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers are unaware of kernel progress and therefore cannot provide an effective scheduling policy. In addition, we identify the potential for improving resource utilization on GPUs with multiple warp schedulers by sharing stalling warps with selected warp schedulers. To address the efficiency issue of present GPUs, we coordinate a kernel-aware warp scheduler with a warp sharing mechanism (KAWS). The proposed warp scheduler tracks the execution progress of the running kernel and adapts to a more effective scheduling policy when the kernel reaches a point of resource underutilization. Meanwhile, the warp sharing mechanism distributes stalling warps to other warp schedulers whose execution pipeline units are ready. On average, our design outperforms the traditional warp scheduler by 7.97% while incurring marginal additional hardware overhead.

Design of an Effective Bump Mapping Hardware Architecture Using Angular Operation (각 연산을 이용한 효과적인 범프 매핑 하드웨어 구조 설계)

  • 이승기;박우찬;김상덕;한탁돈
    • Journal of KIISE:Computer Systems and Theory / v.30 no.11 / pp.663-674 / 2003
  • Bump mapping is a technique that represents fine surface detail, such as the protuberances on the skin of a peanut, using geometry mapping without complex modeling. However, the hardware cost of implementing bump mapping is considerable, because it requires a large amount of per-pixel computation, including normal vector shading. In this paper, we propose a new bump mapping algorithm that uses the polar coordinate system, together with its hardware architecture. Compared with existing architectures, our approach performs bump mapping effectively by using a new vector rotation method for the transformation into the reference space and by minimizing the illumination calculation. Consequently, our proposed architecture substantially reduces both computation and hardware requirements.

A Real-time Single-Pass Visibility Culling Method Based on a 3D Graphics Accelerator Architecture (실시간 단일 패스 가시성 선별 기법 기반의 3차원 그래픽스 가속기 구조)

  • Choo, Catherine;Choi, Moon-Hee;Kim, Shin-Dug
    • The KIPS Transactions:PartA / v.15A no.1 / pp.1-8 / 2008
  • Occlusion culling, one of the visibility culling methods, excludes invisible objects or triangles that are covered by other objects. Because it reduces the amount of computation, occlusion culling is an effective way to handle complex scenes in real time. However, a common existing approach, the hardware occlusion query, sends each object's data to the GPU twice, once for the occlusion test and once for rendering, which causes processing overhead. Another existing hardware occlusion culling method, VCBP, can test object visibility quickly, but it neither tests bounding volumes nor returns the test result to the application stage. In this paper, we propose a single-pass occlusion culling method that exploits temporal and spatial coherence, together with an effective occlusion culling hardware architecture. In our approach, the hardware performs the occlusion test rapidly, using a cache, at the rasterization stage, where triangles are converted into fragments. At the same time, the hardware sends each primitive's visibility information to the application stage. As a result, the application stage reduces the amount of data transmitted by excluding covered objects, using the visibility information from the previous frame and a hierarchical spatial tree. Our proposed method improves performance by at most 44% and at least 14% compared with the S&W method, which is based on hardware occlusion queries. Compared with the maximum and minimum performance of the CHC method, which is based on occlusion culling, performance increases by 25% and 17%, respectively.

Real-Time Shadow Generation Using Image-Based Rendering Technique (영상기반 렌더링 기법을 이용한 실시간 그림자 생성)

  • Lee, Jung-Yeon;Im, In-Seong
    • Journal of the Korea Computer Graphics Society / v.7 no.1 / pp.27-35 / 2001
  • Shadows are important elements in producing a realistic image. In rendering, generating the exact shape and position of a shadow is crucial in providing the user with visual cues about the scene. While the shadow map technique quickly generates shadows for a scene in which objects and light sources are fixed, it slows down once they start to move. In this paper, we apply an image-based rendering technique to generate shadows in real time using graphics hardware. Because a shadow map repository requires a large amount of storage, we use a wavelet-based scheme for effective compression. Our method can be used efficiently to generate realistic scenes in many real-time applications such as 3D games and virtual reality systems.


Real-Time Object Segmentation in Image Sequences (연속 영상 기반 실시간 객체 분할)

  • Kang, Eui-Seon;Yoo, Seung-Hun
    • The KIPS Transactions:PartB / v.18B no.4 / pp.173-180 / 2011
  • This paper presents an approach to real-time object segmentation on the GPU (Graphics Processing Unit) using CUDA (Compute Unified Device Architecture). Many recent applications, such as monitoring systems, motion analysis, and object tracking, require real-time processing. A CPU alone is not well suited to performing object segmentation in real time. NVIDIA provides the CUDA platform for general-purpose parallel computation to overcome the limits of graphics hardware. In this paper, we use adaptive Gaussian mixture background modeling in the object extraction step and CCL (Connected Component Labeling) for classification. The speeds of the GPU and CPU versions are compared and evaluated using an implementation on a 2.4 GHz Core2 Quad processor. The GPU version achieved a speedup of 3x-4x over the CPU version.
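The connected component labeling (CCL) step named in the abstract above can be sketched as follows. This is a serial, CPU-side illustration using a breadth-first flood fill with 4-connectivity, not the paper's CUDA implementation; the connectivity choice and the function name `label_components` are assumptions.

```python
from collections import deque

def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.
    Returns (labels, count); background pixels keep label 0 and
    components are numbered 1..count in raster-scan order."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

# Foreground mask as it might come out of background subtraction.
mask = [[1, 1, 0, 0],
        [0, 1, 0, 1],
        [0, 0, 0, 1],
        [1, 0, 0, 0]]
labels, count = label_components(mask)
```

A flood fill like this is inherently sequential; GPU versions of CCL generally rely on different formulations, such as iterative label propagation or union-find, which the abstract does not detail.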

A Reconfigurable Parallel Processor for Efficient Processing of Mobile Multimedia (모바일 멀티미디어의 효율적 처리를 위한 재구성형 병렬 프로세서의 구조)

  • Yoo, Se-Hoon;Kim, Ki-Chul;Yang, Yil-Suk;Roh, Tae-Moon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.10 / pp.23-32 / 2007
  • This paper proposes a reconfigurable parallel processor architecture which can efficiently implement various multimedia applications, such as 3D graphics, H.264/H.263/MPEG-4, JPEG/JPEG2000, and MP3. The proposed architecture directly connects memories and processors so that memory access time and power consumption are reduced. It supports floating-point operations needed in the geometry stage of 3D graphics. It adopts partitioned SIMD to reduce hardware costs. Conditional execution of instructions is used for easy development of parallel algorithms.

Scalable Graphics Algorithms (스케일러블 그래픽스 알고리즘)

  • Yoon, Sung-Eui
    • 한국HCI학회:학술대회논문집 / 2008.02c / pp.224-224 / 2008
  • Recent advances in model acquisition, computer-aided design, and simulation technologies have resulted in massive databases of complex geometric data occupying multiple gigabytes and even terabytes. In various graphics/geometric applications, the major performance bottleneck is typically in accessing these massive geometric data sets because of their high complexity. However, data access speed has consistently grown more slowly than computational processing speed. Moreover, recent multi-core architectures aggravate this phenomenon. Therefore, current architectural improvements are not expected to solve the problem of dealing with ever-growing massive geometric data, especially on commodity hardware. In this tutorial, I will focus on two orthogonal approaches--multi-resolution and cache-coherent layout techniques--to design scalable graphics/geometric algorithms. First, I will discuss multi-resolution techniques that reduce the amount of data necessary for performing geometric methods within an error bound. Second, I will explain cache-coherent layouts that improve the cache utilization of runtime geometric applications. I have applied these two techniques to rendering, collision detection, and iso-surface extraction and have thereby achieved significant performance improvements. During the talk, I will show live demonstrations of view-dependent rendering and collision detection between massive models consisting of tens of millions of triangles on a laptop.

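To give a concrete flavor of what a cache-coherent layout can mean, the sketch below reorders 2D grid points along a Z-order (Morton) curve, so that points that are close in space tend to be close in memory. Z-order is used here only as a well-known stand-in; the abstract does not say which layouts the tutorial covers, and the function names are assumptions.

```python
def part1by1(v):
    """Spread the low 16 bits of v so that a zero bit separates each
    original bit (e.g. 0b11 -> 0b0101)."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v

def morton2d(x, y):
    """Interleave the bits of x and y (each below 2**16) into one key."""
    return part1by1(x) | (part1by1(y) << 1)

def z_order_layout(points):
    """Return the points sorted by Morton key, so spatial neighbors
    tend to end up near each other in the output sequence."""
    return sorted(points, key=lambda p: morton2d(p[0], p[1]))

layout = z_order_layout([(1, 1), (0, 1), (1, 0), (0, 0)])
```

Storing vertices or triangles in such an order is one simple way to improve locality without knowing the cache parameters in advance.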