• Title/Summary/Keyword: Pixel cache


Proposal of 3D Graphic Processor Using Multi-Access Memory System (Multi-Access Memory System을 이용한 3D 그래픽 프로세서 제안)

  • Lee, S-Ra-El;Kim, Jae-Hee;Ko, Kyung-Sik;Park, Jong-Won
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.4 / pp.119-128 / 2019
  • Because 3D graphics processing requires many mathematical calculations, research on parallel processing with GPUs (Graphics Processing Units) is being carried out to achieve high-speed processing. In this paper, we propose a 3D graphics processor using MAMS, a parallel processor that does not use cache memory, to solve two GPU problems: the increase in bandwidth caused by cache misses, and the non-constant processing speed of 3D shaders. For the proposed processor, we designed the vertex shader, pixel shader, tiling, and rasterizing structures based on an analysis of DirectX commands, and we built an FPGA board (Xilinx Virtex6 at 100 MHz) for MAMS, designed in Verilog. We compared the processing time of the developed FPGA (100 MHz) with that of an nVidia GeForce GTX 660 (980 MHz): the processing time on the GTX 660 was not constant, whereas the processing time using MAMS was constant.

Image-Based Relighting Rendering System (영상 기반 실시간 재조명 렌더링 시스템)

  • Kim, Soon-Hyun;Lee, Joo-Haeng;Kyung, Min-Ho
    • Journal of the HCI Society of Korea / v.2 no.1 / pp.25-31 / 2007
  • We develop an interactive relighting renderer, based on a deep-frame buffer approach, that also allows camera view changes. The renderer first caches the rendering parameters for a given 3D scene in an auxiliary buffer of the same size as the output image. The cached parameters are those that are independent of light changes, selected from the shading models used to shade the pixels. Next, as the user interactively edits one light at a time, the relighting renderer instantly re-shades each pixel by updating the contribution of the changed light using the shading parameters cached in the deep-frame buffer. When the camera moves, the cached values become obsolete and must be recomputed. We present a novel method that synthesizes them quickly from the cache images of user-specified cameras by using an image-based technique. These computations are all performed on the GPU to achieve real-time performance.
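
The re-shading step described in this abstract can be illustrated with a minimal sketch. It is not the authors' renderer: the `CachedPixel` layout, the Lambertian `diffuse` model, and the choice to cache one contribution list per light are all assumptions made for illustration, and the sketch runs on the CPU rather than the GPU.

```python
from dataclasses import dataclass


@dataclass
class CachedPixel:
    """Light-independent shading parameters for one pixel (hypothetical layout)."""
    position: tuple  # surface point seen through this pixel
    normal: tuple    # unit surface normal at that point
    albedo: float    # diffuse reflectance, single channel for brevity


@dataclass
class PointLight:
    position: tuple
    intensity: float


def diffuse(pixel, light):
    """Lambertian contribution of one light to one cached pixel (assumed model)."""
    to_light = [l - p for l, p in zip(light.position, pixel.position)]
    dist = sum(c * c for c in to_light) ** 0.5
    if dist == 0.0:
        return 0.0
    cos_theta = sum(n * c / dist for n, c in zip(pixel.normal, to_light))
    return pixel.albedo * light.intensity * max(cos_theta, 0.0)


class RelightingBuffer:
    """Deep-frame buffer: cached per-pixel parameters plus per-light contributions."""

    def __init__(self, pixels, lights):
        self.pixels = pixels
        self.lights = list(lights)
        # One contribution list per light, so a single light can be swapped out.
        self.contrib = [[diffuse(p, l) for p in pixels] for l in self.lights]
        self.image = [sum(c[i] for c in self.contrib) for i in range(len(pixels))]

    def edit_light(self, index, new_light):
        """Re-shade every pixel for one edited light only, reusing the cache."""
        new = [diffuse(p, new_light) for p in self.pixels]
        old = self.contrib[index]
        self.image = [v - o + n for v, o, n in zip(self.image, old, new)]
        self.contrib[index] = new
        self.lights[index] = new_light
```

Editing one light costs one shading evaluation per pixel regardless of how many lights the scene has, which is the point of caching light-independent parameters. A camera move invalidates `pixels`, so the whole buffer would have to be rebuilt; the image-based synthesis the abstract proposes for that case is not sketched here.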


A Pixel Pipeline Architecture with Effective Visibility Test for 3D Graphics Accelerators (향상된 가시성 검사를 수행하는 3차원 그래픽 가속기의 픽셀 파이프라인 구조)

  • Kim, Il-San;Park, Woo-Chan;Park, Jin-Hong;Han, Tack-Don
    • Journal of Korea Game Society / v.7 no.3 / pp.31-38 / 2007
  • In this paper, we propose an effective visibility test architecture that improves on the mid-texturing architecture. The proposed architecture exploits the property that adjacent fragments have identical visibility, and performs only a single visibility test per fragment. Simulation results show that, compared with the mid-texturing architecture, the bandwidth requirements and the cell area of the depth cache in the proposed architecture are reduced by 25% and 34%, respectively, in exchange for a performance decline of less than 5%.


A Design of Fractional Motion Estimation Engine with 4×4 Block Unit of Interpolator & SAD Tree for 8K UHD H.264/AVC Encoder (8K UHD(7680×4320) H.264/AVC 부호화기를 위한 4×4블럭단위 보간 필터 및 SAD트리 기반 부화소 움직임 추정 엔진 설계)

  • Lee, Kyung-Ho;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.6 / pp.145-155 / 2013
  • In this paper, we propose a 4×4 block parallel interpolation architecture for high-performance H.264/AVC Fractional Motion Estimation in real-time processing of 8K UHD (7680×4320) video. To improve throughput, we design a 4×4 block parallel interpolator. To supply the 10×10 reference data needed for interpolation, we design a 2D cache buffer consisting of 10×10 memory arrays. We minimize redundant storage of reference pixels by applying the Search Area Stripe Reuse (SASR) scheme, and implement a high-speed plane interpolator with a 3-stage pipeline (horizontal and vertical 1/2 interpolation, diagonal 1/2 interpolation, and 1/4 interpolation). The proposed architecture was simulated with a 0.13 µm standard cell library. The gate count is 436.5K gates. Running at 187 MHz, the proposed H.264/AVC Fractional Motion Estimation engine can support 8K UHD at 30 frames per second.
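
The throughput figures in this abstract imply a per-block cycle budget, which the short calculation below makes explicit. It is a rough sketch only: it assumes each 4×4 block of the frame is visited exactly once, which ignores the multiple candidates and partition sizes a real fractional motion estimator evaluates. The function name `cycles_per_block` is a hypothetical helper, not something from the paper.

```python
def cycles_per_block(width, height, fps, clock_hz, block=4):
    """Average clock cycles available per block, assuming one pass per frame."""
    blocks_per_frame = (width // block) * (height // block)
    return clock_hz / (blocks_per_frame * fps)


# 8K UHD at 30 frames per second on a 187 MHz clock, as stated in the abstract.
budget_8k = cycles_per_block(7680, 4320, 30, 187e6)
```

Under that assumption the budget comes out at about 3.0 clock cycles per 4×4 block, which lines up with the 3-stage pipeline named in the abstract, although the abstract itself does not draw that connection.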

A Design of 4×4 Block Parallel Interpolation Motion Compensation Architecture for 4K UHD H.264/AVC Decoder (4K UHD급 H.264/AVC 복호화기를 위한 4×4 블록 병렬 보간 움직임보상기 아키텍처 설계)

  • Lee, Kyung-Ho;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.5 / pp.102-111 / 2013
  • In this paper, we propose a 4×4 block parallel interpolation architecture for high-performance H.264/AVC Motion Compensation in real-time processing of 4K UHD (3840×2160) video. To improve throughput, we design a 4×4 block parallel interpolator. To supply the 9×9 reference data needed for interpolation, we design a 2D cache buffer consisting of 9×9 memory arrays. We minimize redundant storage of reference pixels by applying the Search Area Stripe Reuse (SASR) scheme, and implement a high-speed plane interpolator with a 3-stage pipeline (horizontal and vertical 1/2 interpolation, diagonal 1/2 interpolation, and 1/4 interpolation). The proposed architecture was simulated with a 0.13 µm standard cell library. The maximum operating frequency is 150 MHz. The gate count is 161K gates. Running at 150 MHz, the proposed H.264/AVC Motion Compensation architecture can support 4K UHD at 72 frames per second.
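
The 9×9 reference window follows from the filter length: the H.264 luma half-sample filter has six taps, so a 4×4 block needs two extra samples on one side and three on the other in each dimension. The sketch below shows only the horizontal half-sample case, in software rather than hardware. The filter taps and rounding are the standard H.264 ones; the function names and the convention that the block's top-left sample sits at `window[2][2]` are assumptions for illustration, and the abstract does not describe the datapath at this level.

```python
# Standard H.264 luma half-sample filter taps (sum to 32).
TAPS = (1, -5, 20, 20, -5, 1)


def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel_horizontal(window):
    """Horizontal half-sample values for a 4x4 block inside a 9x9 window.

    window[2][2] is assumed to hold the block's top-left integer sample.
    Output [r][c] lies halfway between window[r + 2][c + 2] and its right neighbour.
    """
    if len(window) != 9 or any(len(row) != 9 for row in window):
        raise ValueError("expected a 9x9 reference window")
    block = []
    for y in range(2, 6):
        row = []
        for x in range(2, 6):
            acc = sum(t * window[y][x - 2 + k] for k, t in enumerate(TAPS))
            row.append(clip8((acc + 16) >> 5))  # round, then divide by 32
        block.append(row)
    return block
```

The 16 output samples are independent of one another, which is what makes the 4×4 block-parallel organisation described in the abstract possible. The vertical case applies the same taps down a column, and the diagonal and quarter-sample stages build on these results.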

Two Efficient Methods for Generating Depth-of-Field (효율적인 피사계 심도 생성을 위한 두 가지 기법)

  • Suh, Young-Seon;Ihm, In-Sung
    • Journal of the Korea Computer Graphics Society / v.14 no.3 / pp.31-46 / 2008
  • The depth of field is the range of distances within which objects are treated as being in focus. Objects outside this range are out of focus and appear blurred. In computer graphics, generating depth-of-field effects adds considerable realism to rendered images. Previous research on depth of field in computer graphics falls into two major categories. The first is distributed ray tracing, which samples the lens area for each pixel. It can produce precise, noise-free results if enough samples are taken, but a good result needs a great number of samples, which makes it extremely time-consuming. The second approximates the depth-of-field effect by post-processing an image and its depth values, both computed with a pinhole camera. Although the second technique is not as physically correct as distributed ray tracing, many methods based on this idea have been introduced because it is much faster. However, post-processing has some limitations because ray information is lacking. In this paper, we first present an improved technique that corrects the previous post-processing methods, and then propose another that accelerates distributed ray tracing by using a radiance caching method.
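
Post-processing methods of the kind described here typically start by turning each pixel's depth into a blur size. The sketch below uses the thin-lens circle of confusion for that step. The abstract does not say which blur model the authors use, so the thin-lens formula is an assumption, and the function names and the `pixels_per_unit` parameter are hypothetical.

```python
def circle_of_confusion(depth, focus_dist, focal_len, aperture):
    """Blur-circle diameter on the sensor for a point at `depth` (thin-lens model).

    All arguments share one length unit; `aperture` is the lens diameter.
    """
    if depth <= 0 or focus_dist <= focal_len:
        raise ValueError("depth must be positive and focus_dist must exceed focal_len")
    return (aperture * focal_len * abs(depth - focus_dist)
            / (depth * (focus_dist - focal_len)))


def blur_radius_map(depth_map, focus_dist, focal_len, aperture, pixels_per_unit):
    """Per-pixel blur radius in pixels from a pinhole-camera depth map."""
    return [
        [0.5 * circle_of_confusion(d, focus_dist, focal_len, aperture) * pixels_per_unit
         for d in row]
        for row in depth_map
    ]
```

A blur pass would then spread or gather each pixel's colour over a disc of that radius. Because the depth map holds only the nearest surface per pixel, content hidden behind foreground objects is unavailable, which is the "lack of ray information" limitation the abstract mentions.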
