• Title/Summary/Keyword: Pixel cache

A Pixel Cache Architecture with Selective Loading Scheme based on Z-test (깊이 검사 결과에 의한 선택적 적재 방법을 가지는 픽셀 캐쉬 구조)

  • 이길환;박우찬;김일산;한탁돈
    • Journal of KIISE:Computer Systems and Theory / v.30 no.10 / pp.579-585 / 2003
  • Most recent 3D graphics rendering processors include a pixel cache that stores depth and color data to reduce memory latency and bandwidth requirements. In this paper, we propose an effective pixel cache for improving the performance of a rendering processor. The proposed cache system stores depth data selectively based on the result of the Z-test, and stores color data in an auxiliary buffer. Simulation results show that the proposed 16 KB cache system outperforms a conventional 32 KB cache.
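
The abstract does not spell out the loading policy, so the following is only a minimal sketch of one plausible reading: a depth block is allocated in the cache only when the fragment passes the Z-test, and the color is written to a separate auxiliary buffer. The class name `SelectiveDepthCache`, the less-than depth test, the FIFO eviction, and the write-through update are all assumptions made for illustration, not the paper's design.

```python
def z_test(frag_depth, stored_depth):
    """Less-than depth test: the fragment is visible if it is nearer."""
    return frag_depth < stored_depth


class SelectiveDepthCache:
    """Toy depth cache that allocates a block only on a Z-test pass (assumed policy)."""

    def __init__(self, depth_memory, capacity_blocks, block_size=4):
        self.mem = depth_memory        # backing depth buffer (list of floats)
        self.capacity = capacity_blocks
        self.block_size = block_size
        self.blocks = {}               # cached block indices, in insertion order
        self.color_buffer = {}         # auxiliary buffer: address -> color

    def process(self, addr, depth, color):
        """Handle one fragment; return (cache_hit, z_test_passed)."""
        block = addr // self.block_size
        hit = block in self.blocks
        passed = z_test(depth, self.mem[addr])
        if passed:
            self.mem[addr] = depth           # write-through, for simplicity
            self.color_buffer[addr] = color  # color bypasses the depth cache
            if not hit:
                if len(self.blocks) >= self.capacity:
                    # Evict the oldest block (FIFO, an assumption).
                    del self.blocks[next(iter(self.blocks))]
                self.blocks[block] = True
        # A failed Z-test never allocates, so hidden fragments cannot
        # push useful blocks out of the cache.
        return hit, passed
```

In this sketch a fragment that fails the depth test leaves the cache contents untouched, which is the intuition behind loading selectively.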

The Early Write Back Scheme For Write-Back Cache (라이트 백 캐쉬를 위한 빠른 라이트 백 기법)

  • Chung, Young-Jin;Lee, Kil-Whan;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.11 / pp.101-109 / 2009
  • Depth caches and pixel caches in 3D graphics are generally designed with a write-back scheme to use memory bandwidth efficiently. In a 3D graphics cache, write-after-read operations to the same address, or write-only operations, occur frequently. When a cache miss is detected, one access to external memory for the write-back operation and another access to handle the miss are issued at the same time. When misses are frequent and memory bandwidth is limited, external memory access time therefore increases because of the memory bottleneck. As a result, the overall performance of the processor or IP decreases and peak power consumption increases. In this paper, we propose a novel early write-back cache architecture to solve these problems. The proposed architecture controls the point at which external memory is accessed to copy a valid data block, and it improves cache performance with the same hit ratio and the same cache capacity. It thereby relieves the memory bottleneck by preventing concentrated memory accesses. We evaluated the proposed architecture on a 3D graphics Z cache and pixel cache in an SoC environment in which an ARM11, a 3D graphics accelerator, and various IPs are embedded. Simulation results with various simulation vectors showed a performance increase of up to 75%.
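
To make the bottleneck concrete, here is a minimal sketch of a direct-mapped write-back cache that counts how many external memory accesses a single miss triggers. The abstract does not say when the early write-back happens; the sketch assumes it happens on idle bus cycles, and the names `WriteBackCache`, `idle_cycle`, and `worst_miss_burst` are hypothetical.

```python
class WriteBackCache:
    """Toy direct-mapped write-back cache with optional early write-back."""

    def __init__(self, num_lines, early_write_back=False):
        self.tags = [None] * num_lines
        self.dirty = [False] * num_lines
        self.early = early_write_back
        self.worst_burst = 0  # most memory accesses caused by one miss

    def access(self, block_addr, is_write):
        idx = block_addr % len(self.tags)
        if self.tags[idx] != block_addr:      # cache miss
            burst = 1                         # line fill
            if self.dirty[idx]:
                burst += 1                    # victim must be written back too
            self.worst_burst = max(self.worst_burst, burst)
            self.tags[idx] = block_addr
            self.dirty[idx] = False
        if is_write:
            self.dirty[idx] = True

    def idle_cycle(self):
        """Assumed trigger: on an idle bus cycle, copy one dirty line out early."""
        if not self.early:
            return False
        for i, is_dirty in enumerate(self.dirty):
            if is_dirty:
                self.dirty[i] = False         # line stays valid, now clean
                return True
        return False


def worst_miss_burst(trace, early):
    """Replay (block_addr, is_write) pairs with one idle cycle after each access."""
    cache = WriteBackCache(num_lines=2, early_write_back=early)
    for addr, is_write in trace:
        cache.access(addr, is_write)
        cache.idle_cycle()
    return cache.worst_burst


# Three writes that all map to line 0 and evict each other.
trace = [(0, True), (2, True), (4, True)]
print(worst_miss_burst(trace, early=False))  # 2: write-back plus fill on one miss
print(worst_miss_burst(trace, early=True))   # 1: the victim was already clean
```

The hit ratio is identical in both runs; only the timing of the write-back traffic changes, which matches the abstract's claim of improved performance at the same hit ratio and capacity.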

Image Cache for FPGA-based Real-time Image Warping (FPGA 기반 실시간 영상 워핑을 위한 영상 캐시)

  • Choi, Yong Joon;Ryoo, Jung Rae
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.6 / pp.91-100 / 2016
  • In FPGA-based real-time image warping systems, image caches are used for fast readout of image pixel data and to reduce the memory access rate. However, a cache algorithm designed for a general computer system is not suitable for real-time performance because of the delays caused by cache misses and the complexity of on-line computation. In this paper, a simple image cache algorithm is presented for an FPGA-based real-time image warping system. Because the pixel data access sequence is determined by the 2D coordinate transformation and repeats identically in every image frame, the cache load sequence is programmed off-line to guarantee that no cache miss occurs, and the reduced on-line computation results in a simple cache controller. An overall system structure using an FPGA is presented, and experimental results are provided to show the accuracy and validity of the proposed cache algorithm.
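
The idea of programming the load sequence off-line can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the cache is modeled as a set of whole source rows, rows are loaded in increasing order, eviction is ignored, and one extra row is fetched for the interpolation neighbor. `plan_row_loads` and `inverse_map` are hypothetical names.

```python
import math


def plan_row_loads(width, height, inverse_map):
    """Off-line pass: for each output row, list the source rows to load first.

    inverse_map(x, y) returns the source coordinate (sx, sy) for output
    pixel (x, y). Because the mapping is the same for every frame, the
    resulting schedule can be computed once and replayed each frame.
    """
    loaded_up_to = -1
    schedule = []
    for y in range(height):
        needed = -1
        for x in range(width):
            _, sy = inverse_map(x, y)
            # +1 covers the lower neighbor used by interpolation (assumption).
            row = min(max(math.floor(sy) + 1, 0), height - 1)
            needed = max(needed, row)
        schedule.append(list(range(loaded_up_to + 1, needed + 1)))
        loaded_up_to = max(loaded_up_to, needed)
    return schedule


# Example: a vertical shear, where each column is shifted down by 0.25 * x.
schedule = plan_row_loads(4, 4, lambda x, y: (x, y + 0.25 * x))
print(schedule)  # [[0, 1], [2], [3], []]
```

At run time the controller only has to replay the schedule, so every row is already present when an output pixel needs it.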

Texture Cache with Automatical Index Splitting Based on Texture Size (텍스처의 크기에 따라 인덱스를 자동 분할하는 텍스처 캐시)

  • Kim, Jin-Woo;Park, Young-Jin;Kim, Young-Sik;Han, Tack-Don
    • Journal of Korea Game Society / v.8 no.2 / pp.57-68 / 2008
  • Texture mapping is a technique for adding realism to an image in a 3D graphics chip. The bilinear filtering mode of this technique needs to access 4 texels to process one pixel. In this paper, we analyze the texture access pattern and propose a high-performance texture cache that can access 4 texels simultaneously. We evaluate it by simulation using 3D games (Quake 3, Unreal Tournament 2004). Simulation results show that the proposed texture cache performs well when its physical size is less than or equal to 8 KB.
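
The four-texel requirement of bilinear filtering can be shown with a short sketch. The parity-based bank mapping in `bank_of` is a common textbook way to let a 2x2 footprint be read in parallel; it is an assumption here and is not the index-splitting scheme of the paper, whose details the abstract does not give.

```python
import math


def bilinear_footprint(u, v, width, height):
    """Return the four texel coordinates touched when sampling at (u, v)."""
    x0 = min(max(math.floor(u), 0), width - 1)
    y0 = min(max(math.floor(v), 0), height - 1)
    x1 = min(x0 + 1, width - 1)   # clamp at the texture edge
    y1 = min(y0 + 1, height - 1)
    return [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]


def bilinear_sample(texture, u, v):
    """Blend the four neighboring texels of a 2D list-of-lists texture."""
    height, width = len(texture), len(texture[0])
    (x0, y0), (x1, _), (_, y1), _ = bilinear_footprint(u, v, width, height)
    fx = min(max(u - x0, 0.0), 1.0)  # horizontal blend weight
    fy = min(max(v - y0, 0.0), 1.0)  # vertical blend weight
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def bank_of(x, y):
    """Assumed bank mapping: low bits of x and y select one of four banks."""
    return (x & 1) | ((y & 1) << 1)


texture = [[0, 10], [20, 30]]
print(bilinear_sample(texture, 0.5, 0.5))  # 15.0

# Away from the edges, the four texels land in four different banks,
# so a four-bank cache could serve them in the same cycle.
banks = {bank_of(x, y) for x, y in bilinear_footprint(2.3, 5.7, 8, 8)}
print(sorted(banks))  # [0, 1, 2, 3]
```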

Image Cache Algorithm for Real-time Implementation of High-resolution Color Image Warping (고해상도 컬러 영상 워핑의 실시간 구현을 위한 영상 캐시 알고리즘)

  • Lee, You Jin;Ryoo, Jung Rae
    • Journal of Institute of Control, Robotics and Systems / v.22 no.8 / pp.643-649 / 2016
  • This paper presents a new image cache algorithm for real-time implementation of high-resolution color image warping. The cache memory is divided into four cache memory modules for simultaneous readout of four input image pixels in consideration of the color filter array (CFA) pattern of an image sensor and CFA image warping. In addition, a pipeline structure from the cache memory to an interpolator is shown to guarantee the generation of an output image pixel at each system clock cycle. The proposed image cache algorithm is applied to an FPGA-based real-time color image warping, and experimental results are presented to show the validity of the proposed method.

David II: A new architecture for parallel rendering processors with effective memory system (David II: 효과적인 메모리 시스템을 가지는 병렬 렌더링 프로세서)

  • Lee, Kil-Whan;Park, Woo-Chan;Kim, Il-San;Han, Tack-Don
    • Proceedings of the Korea Information Processing Society Conference / 2004.05a / pp.1655-1658 / 2004
  • Current rendering processors are organized mainly to process a triangle as fast as possible, and parallel 3D rendering processors, which process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high triangle-processing performance, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem may occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture, called DAVID II, that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency due to a pixel cache miss. The experimental results show that DAVID II achieves almost linear speedup in the best case, even with sixteen rasterizers.

Design of a Parallel Rendering Processor Architecture with Effective Memory System (효과적인 메모리 구조를 갖는 병렬 렌더링 프로세서 설계)

  • Park Woo-Chan;Yoon Duk-Ki;Kim Kyoung-Su
    • The KIPS Transactions:PartA / v.13A no.4 s.101 / pp.305-316 / 2006
  • Current rendering processors are organized mainly to process a triangle as fast as possible, and parallel 3D rendering processors, which process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high triangle-processing performance, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem may occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency due to a pixel cache miss. To achieve these two goals, effective memory organizations, including a new pixel cache architecture, are presented. The experimental results show that the proposed architecture achieves almost linear speedup in the best case, even with sixteen rasterizers.

Relighting 3D Scenes with a Continuously Moving Camera

  • Kim, Soon-Hyun;Kyung, Min-Ho;Lee, Joo-Haeng
    • ETRI Journal / v.31 no.4 / pp.429-437 / 2009
  • This paper proposes a novel technique for 3D scene relighting with interactive viewpoint changes. The proposed technique is based on a deep framebuffer framework for fast relighting computation which adopts image-based techniques to provide arbitrary view-changing. In the preprocessing stage, the shading parameters required for the surface shaders, such as surface color, normal, depth, ambient/diffuse/specular coefficients, and roughness, are cached into multiple deep framebuffers generated by several caching cameras which are created in an automatic manner. When the user designs the lighting setup, the relighting renderer builds a map to connect a screen pixel for the current rendering camera to the corresponding deep framebuffer pixel and then computes illumination at each pixel with the cache values taken from the deep framebuffers. All the relighting computations except the deep framebuffer pre-computation are carried out at interactive rates by the GPU.
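
As a rough illustration of why caching shading parameters makes relighting cheap, the sketch below recomputes a pixel's color from cached values when only the light changes. It assumes a simple ambient-plus-Lambertian model and a flat list of per-pixel dictionaries; the function `relight` and the field names are hypothetical, and the paper's shaders, specular terms, and GPU implementation are not reproduced.

```python
import math


def relight(deep_framebuffer, light_dir, light_color):
    """Recompute pixel colors from cached shading parameters for a new light.

    Each cached pixel holds 'color', 'normal' (unit length), 'ambient' and
    'diffuse' coefficients. Geometry is never touched again: only this
    per-pixel arithmetic runs when the lighting setup changes.
    """
    length = math.sqrt(sum(c * c for c in light_dir))
    l = [c / length for c in light_dir]  # normalized direction to the light
    image = []
    for px in deep_framebuffer:
        n_dot_l = max(0.0, sum(n * d for n, d in zip(px["normal"], l)))
        factor = px["ambient"] + px["diffuse"] * n_dot_l
        image.append(tuple(c * factor * lc
                           for c, lc in zip(px["color"], light_color)))
    return image


cached = [{"color": (1.0, 0.5, 0.0), "normal": (0.0, 0.0, 1.0),
           "ambient": 0.1, "diffuse": 0.9}]
front = relight(cached, (0.0, 0.0, 2.0), (1.0, 1.0, 1.0))   # light facing the surface
behind = relight(cached, (0.0, 0.0, -1.0), (1.0, 1.0, 1.0))  # only ambient remains
```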

Performance Analysis of Texture / Pixel Cache in 3D Graphics Rasterization (3차원 그래픽 래스터라이제이션에서의 텍스쳐/픽셀 캐쉬 성능분석)

  • 김일산;박기호;이길환;박우찬;한탁돈
    • Proceedings of the Korean Information Science Society Conference / 2002.04a / pp.25-27 / 2002
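  • This paper analyzes the performance of the texture and pixel caches used to address the memory traffic problem that arises in the rasterization stage of 3D graphics. To this end, we examine how the performance of these caches changes with screen resolution, color information, depth information, and cache configuration. The experimental results show that block size has a very important effect on the design of texture and pixel caches. In particular, the pixel cache shows almost no temporal locality but very high spatial locality, so a cache architecture that reflects this behavior well is needed.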
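
The effect of block size under pure spatial locality can be reproduced with a toy simulation. This is a minimal sketch, assuming a direct-mapped cache and a scanline that touches each pixel address exactly once (so there is no temporal reuse at all); the function name `hit_rate` and the chosen sizes are illustrative and do not come from the paper.

```python
def hit_rate(addresses, num_lines, block_size):
    """Hit rate of a direct-mapped cache over a sequence of pixel addresses."""
    tags = [None] * num_lines
    hits = 0
    for addr in addresses:
        block = addr // block_size
        idx = block % num_lines
        if tags[idx] == block:
            hits += 1
        else:
            tags[idx] = block  # miss: fetch the whole block
    return hits / len(addresses)


# Each pixel is visited once, in order, as in a simple scanline traversal.
scanline = list(range(1024))
for block_size in (1, 4, 16):
    print(block_size, hit_rate(scanline, num_lines=8, block_size=block_size))
# 1 0.0
# 4 0.75
# 16 0.9375
```

With no reuse of individual pixels, every hit comes from neighbors fetched in the same block, so the hit rate is governed almost entirely by block size rather than by total capacity.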

A architecture for parallel rendering processor with by effective memory organization (효과적인 메모리 구조를 갖는 병렬 렌더링 프로세서 구조)

  • Kim, Kyung-Su;Yoon, Duk-Ki;Kim, Il-San;Park, Woo-Chan
    • Journal of Korea Game Society / v.5 no.3 / pp.39-47 / 2005
  • Current rendering processors are organized mainly to process a triangle as fast as possible, and parallel 3D rendering processors, which process multiple triangles in parallel with multiple rasterizers, have recently begun to appear. For high triangle-processing performance, it is desirable for each rasterizer to have its own local pixel cache. However, a consistency problem may occur when more than one rasterizer accesses data at the same address simultaneously. In this paper, we propose a parallel rendering processor architecture that resolves this consistency problem effectively. Moreover, the proposed architecture significantly reduces the latency due to a pixel cache miss. The experimental results show that the proposed architecture achieves almost linear speedup in the best case, even with sixteen rasterizers.
