Search | Korea Science

Design of Special Function Unit for Vectorized SIMD Programmable Unified Shader (벡터화된 SIMD 프로그램어블 통합 셰이더를 위한 특수 함수 유닛 설계)

Jung, Jin-Ha;Kim, Kyeong-Seob;Yun, Jeong-Hee;Seo, Jang-Won;Choi, Sang-Bang
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.47 no.5
- /
- pp.56-70
- /
- 2010
Rendering technique generating 2 dimensional image to give reality and high performance graphical processor for efficient processing of massive data are necessary to support realistic 3 dimensional graphical image. Recently, graphical hardwares have evolved rapidly. This enables high quality rendering effect that we were unable to process in realtime. Improving shading technique enabled us to render realistic images but still much time is required for this process. Multiple operational units are being integrated in a graphical processor for effective floating point operation using massive data to process almost real looking images. In this paper, we have designed and implemented a special functional unit to support high quality 3 dimensional computer graphic image on programmable integrated shader processor. We have done evaluation through functional level simulation of designed special functional unit. Hardware resource usage rate and execution speed are measured implementing directly on FPGA Virtex-4(xc4vlx200).
PDF KSCI

Control Unit Design and Implementation for SIMD Programmable Unified Shader (SIMD 프로그래머블 통합 셰이더를 위한 제어 유닛 설계 및 구현)

Kim, Kyeong-Seob;Lee, Yun-Sub;Yu, Byung-Cheol;Jung, Jin-Ha;Choi, Sang-Bang
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.48 no.7
- /
- pp.37-47
- /
- 2011
Real picture like high quality computer graphic is widely used in various fields and shader processor, a key part of a graphic processor, has been advanced to programmable unified shader. However, The existing graphic processors have been optimized to commercial algorithms, so development of an algorithm which is not based on it requires an independent shader processor. In this paper, we have designed and implemented a control unit to support high quality 3 dimensional computer graphic image on programmable integrated shader processor. We have done evaluation through functional level simulation of designed control unit. Hardware resource usage rate are measured by implementing directly on FPGA Virtex-4 and execution speed are verified by applying ASIC library. the result of an evaluation shows that the control unit has the commands more about 1.5 times compared to the other shader processors that is a behavior similar to the control unit and with a number of processing units used in a shader processor, compared with the other processors, overall performance of the control unit is improved about 3.1 GFLOPS.
PDF KSCI

Design of a Variable-Length Instruction based on a OpenGL ES 2.0 API (OpenGL ES 2.0 API 기반 가변길이 명령어 설계)

Lee, Kwang-Yeob
- Journal of IKEEE
- /
- v.12 no.2
- /
- pp.118-123
- /
- 2008
The Khronos group releases OpenGL ES 2.0 API specification bringing streamlined shader programming to graphics processor of embedded system. For this reason, the mobile devices have need of graphics processor for supporting a OpenGL ES 2.0 API. We need to extend instruction`s length to support OpenGLES 2.0 API, so it needs more memory size. In this paper, we propose a new instruction format that offers availability for use the instructions. This proposed instruction adopt a variable length method and unit instruction architecture. This proposed instruction architecture that support to OpenGLES 2.0 API has consist of 32bit unit instructions up to 4 which can be combined for embellishing each other. Therefore, it can execute flexible instruction combination and reduce waste of instruction fields.
PDF

The Design of Geometry Processor for 3D Graphics (3차원 그래픽을 위한 Geometry 프로세서의 설계)

Jeong, Cheol-Ho;Park, Woo-Chan;Kim, Shin-Dug;Han, Tack-Don
- The Transactions of the Korea Information Processing Society
- /
- v.7 no.1
- /
- pp.252-265
- /
- 2000
In this thesis, the analysis of data processing method and the amount of computation in the whole geometry processing is conducted step by step. Floating-point ALU design is based on the characteristics of geometry processing operation. The performance of the devised ALU fitting with the geometry processing operation is analyzed by simulation after the description of the proposed ALU and geometry processor. The ALU designed in the paper can perform three types of floating-point operation simultaneously-addition/subtraction, multiplication, division. As a result, the 23.5% of improvement is achieved by that floating-point ALU for the whole geometry processing and in the floating-point division and square root operation, there is another 23% of performance gain with adding area-performance efficient SRT divisor.
PDF

Design of a Variable-Length Instruction for the Effective Usability Instruction in 3D Graphics Processor (3D 그래픽 프로세서에서 효율적인 명령어를 위한 가변길이 명령어 설계)

Kim, Woo-Young;Lee, Bo-Haeng;Lee, Kwang-Yeob;Kwak, Jae-Chang
- Proceedings of the Korean Institute of Information and Commucation Sciences Conference
- /
- 2008.05a
- /
- pp.281-284
- /
- 2008
Recently, Khronos institude OpenGL ES 2.0 API for support Shader 3.0 model that can possible variable graphic processing. For this reason, the mobile device have need of supporting processor for a shader 3.0 model. We should extend instruction's length to support OpenGL ES 2.0 API, so we need more memory size. In this paper, we propose a new instruction form that adopted variable length and unit instruction architecture. This proposed instruction architecture that support to Shader 3.0 model has consist of 32bit unit instructions up to 4 which can be combined for embellishing each other. Therefore, it can execute flexible instruction combination and reduce waste of instruction fields.
PDF

Fast GPU Implementation for the Solution of Tridiagonal Matrix Systems (삼중대각행렬 시스템 풀이의 빠른 GPU 구현)

Kim, Yong-Hee;Lee, Sung-Kee
- Journal of KIISE:Computer Systems and Theory
- /
- v.32 no.11_12
- /
- pp.692-704
- /
- 2005
With the improvement of computer hardware, GPUs(Graphics Processor Units) have tremendous memory bandwidth and computation power. This leads GPUs to use in general purpose computation. Especially, GPU implementation of compute-intensive physics based simulations is actively studied. In the solution of differential equations which are base of physics simulations, tridiagonal matrix systems occur repeatedly by finite-difference approximation. From the point of view of physics based simulations, fast solution of tridiagonal matrix system is important research field. We propose a fast GPU implementation for the solution of tridiagonal matrix systems. In this paper, we implement the cyclic reduction(also known as odd-even reduction) algorithm which is a popular choice for vector processors. We obtained a considerable performance improvement for solving tridiagonal matrix systems over Thomas method and conjugate gradient method. Thomas method is well known as a method for solving tridiagonal matrix systems on CPU and conjugate gradient method has shown good results on GPU. We experimented our proposed method by applying it to heat conduction, advection-diffusion, and shallow water simulations. The results of these simulations have shown a remarkable performance of over 35 frame-per-second on the 1024x1024 grid.
PDF KSCI

The Design of the Perspective Texture Mapping in Rasterizer Merged Frame Buffer Technology (래스터라이저-프레임버퍼 혼합 구조에서의 원근투영 텍스쳐 매핑의 설계)

Lee, Seung-Gi;Park, Woo-Chan;Han, Tack-Don
- Proceedings of the Korea Information Processing Society Conference
- /
- 2000.04a
- /
- pp.293-298
- /
- 2000
최근 3차원 그래픽스 분야는 기존의 단순 이미지의 처리가 아닌 보다 나은 화질과 보다 많은 기법의 도입이 요구되어 지고 있다. 이에 본 논문에서는 가장 기본적인 실감영상의 표현 기법인 텍스쳐 매핑 기법에 대하여 논하였고, 3차원의 객체 공간에서 2차원의 스크린 공간으로의 변환으로 인해 생길 수 있는 문제점과 렌더링 알고리즘에 대해 분석하였으며, 이에 부합하는 렌더링 시스템을 설계, 분석하였다. 또한 본 시스템은 고성능 3차원 그래픽 처리를 위하여 채택되어지고 있는 프로세서-메모리 집적 방식을 이용, 래스터라이징 유닛과 프레임버퍼를 단일 칩으로 구성하여 렌더링과 텍스쳐 매핑 과정에서 발생할 수 있는 지연현상을 제거하였다.
PDF

Parallel Implementation and Performance Evaluation of the SIFT Algorithm Using a Many-Core Processor (매니코어 프로세서를 이용한 SIFT 알고리즘 병렬구현 및 성능분석)

Kim, Jae-Young;Son, Dong-Koo;Kim, Jong-Myon;Jun, Heesung
- Journal of the Korea Society of Computer and Information
- /
- v.18 no.9
- /
- pp.1-10
- /
- 2013
In this paper, we implement the SIFT(Scale-Invariant Feature Transform) algorithm for feature point extraction using a many-core processor, and analyze the performance, area efficiency, and system area efficiency of the many-core processor. In addition, we demonstrate the potential of the proposed many-core processor by comparing the performance of the many-core processor with that of high-performance CPU and GPU(Graphics Processing Unit). Experimental results indicate that the accuracy result of the SIFT algorithm using the many-core processor was same as that of OpenCV. In addition, the many-core processor outperforms CPU and GPU in terms of execution time. Moreover, this paper proposed an optimal model of the SIFT algorithm on the many-core processor by analyzing energy efficiency and area efficiency for different octave sizes.
https://doi.org/10.9708/jksci.2013.18.9.001 인용 PDF KSCI

Design of Square Root and Inverse Square Root Arithmetic Units for Mobile 3D Graphic Processing (모바일 3차원 그래픽 연산을 위한 제곱근 및 역제곱근 연산기 구조 및 설계)

Lee, Chan-Ho
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.46 no.3
- /
- pp.20-25
- /
- 2009
We propose hardware architecture of floating-point square root and inverse square root arithmetic units using lookup tables. They are used for lighting engines and shader processor for 3D graphic processing. The architecture is based on Taylor series expansion and consists of lookup tables and correction units so that the size of look-up tables are reduced. It can be applied to 32 bit floating point formats of IEEE-754 and reduced 24 bit floating point formats. The square root and inverse square root arithmetic units for 32 bit and 24 bit floating format number are designed as the proposed architecture. They can operation in a single cycle, and satisfy the precision of $10^{-5}$ required by OpenGL 1.x ES. They are designed using Verilog-HDL and the RTL codes are verified using an FPGA.
PDF KSCI

IEEE-754 Floating-Point Divider for Embedded Processors (내장형 프로세서를 위한 IEEE-754 고성능 부동소수점 나눗셈기의 설계)

Jeong, Jae-Won;Hong, In-Pyo;Jeong, Woo-Kyong;Lee, Yong-Surk
- Journal of the Institute of Electronics Engineers of Korea SD
- /
- v.39 no.7
- /
- pp.66-73
- /
- 2002
As floating-point operations become widely used in various applications such as computer graphics and high-definition DSP, the needs for fast division become increased. However, conventional floating-point dividers occupy a large hardware area, and bring bottle-becks to the entire floating-point operations. In this paper, a high-performance and small-area floating-point divider, which is suitable for embedded processors, is designed using he series expansion algorithm. The algorithm is selected to utilize two MAC(Multiply-ACcumulate) units for quadratic convergence to the correct quotient. The two MAC units for SIMD-DSP features are shared and the additional area for the division only is very small. The proposed divider supports all rounding modes defined by IEEE 754 standard, and error estimations are performed for appropriate precision.
PDF KSCI

Search Result 11, Processing Time 0.022 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)