• Title/Summary/Keyword: Pipeline Processing Structure

A Design of Vector Processing Based 3D Graphics Geometry Processor (벡터 프로세싱 기반의 3차원 그래픽 지오메트리 프로세서 설계)

  • Lee, Jung-Woo;Kim, Ki-Chul
    • Proceedings of the IEEK Conference / 2006.06a / pp.989-990 / 2006
  • This paper presents the design of a 3D graphics geometry processor. A geometry processor must handle a large amount of computation and consists of a transformation processor and a lighting processor. To handle this computational load, a vector processing structure based on pipeline chaining is proposed. The proposed geometry processor processes 4.3M vertices/sec at 100 MHz using 11 floating-point units.
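
As a rough illustration of what the quoted throughput implies, the short sketch below converts the abstract's figures (4.3M vertices/sec, 100 MHz, 11 floating-point units) into an average cycle budget per vertex. The assumption that every unit could issue one operation per cycle is hypothetical and only gives an upper bound; the abstract does not describe the actual issue rate.

```python
# Figures quoted in the abstract.
clock_hz = 100e6            # 100 MHz clock
vertices_per_sec = 4.3e6    # 4.3M vertices/sec
num_fpus = 11               # floating-point units

# Average number of clock cycles available per vertex.
cycles_per_vertex = clock_hz / vertices_per_sec

# Upper bound on floating-point issue slots per vertex, assuming
# (hypothetically) that every unit issues one operation per cycle.
fpu_slots_per_vertex = cycles_per_vertex * num_fpus

print(f"cycles per vertex: {cycles_per_vertex:.1f}")
print(f"FPU issue slots per vertex (upper bound): {fpu_slots_per_vertex:.0f}")
```

Under these assumptions the budget works out to roughly 23 cycles per vertex.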

Geometry Processing using Multi-Core GP-GPU (멀티코어 GP-GPU를 이용한 지오메트리 처리)

  • Lee, Kwang-Yeob;Kim, Chi-Yong
    • Journal of IKEEE / v.14 no.2 / pp.69-75 / 2010
  • A 3D graphics pipeline is broadly divided into a geometry stage and a rendering stage. In this paper, we propose a method that accelerates geometry processing on a multi-core GP-GPU using a dual-phase structure. Performance is improved through parallel data processing using the SIMD capability of the GP-GPU, the dual-phase structure, and memory prefetch. The proposed architecture improves performance by approximately 19% when all of these features are used.

Design of Decimal Floating-Point Adder for High Speed Operation with Leading Zero Anticipator (선행 제로 예측기를 이용한 고속 연산 십진 부동소수점 가산기 설계)

  • Yun, Hyoung-Kie;Moon, Dai-Tchul
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.2 / pp.407-413 / 2015
  • In this paper, a DFPA (decimal floating-point adder) is designed with a pipeline structure that uses an LZA (leading zero anticipator) to shorten the critical path and its delay, thereby improving operation speed. The performance of the proposed DFPA was evaluated and verified by simulation with the Flowrian tool, and the design was synthesized with the Quartus II tool targeting a Cyclone III FPGA. The proposed method was compared against other methods using the same input data. As a result, the performance of the proposed method is improved by 11.2% and 5.9% compared with L.K. Wang's method and other methods. An improvement in operation speed and a reduction in the number of delay elements on the critical path were also confirmed.

Hardware Design of High Performance In-loop Filter in HEVC Encoder for Ultra HD Video Processing in Real Time (UHD 영상의 실시간 처리를 위한 고성능 HEVC In-loop Filter 부호화기 하드웨어 설계)

  • Im, Jun-seong;Dennis, Gookyi;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.401-404 / 2015
  • This paper proposes a high-performance in-loop filter for an HEVC (High Efficiency Video Coding) encoder for real-time Ultra HD video processing. HEVC uses an in-loop filter consisting of a deblocking filter and SAO (Sample Adaptive Offset) to address the quantization error that causes image degradation. In the proposed in-loop filter encoder hardware architecture, the deblocking filter and SAO have a 2-level hybrid pipeline structure based on the 32×32 CTU to reduce execution time. The deblocking filter uses a 6-stage pipeline structure, and the proposed efficient filtering order minimizes memory access and simplifies the reference memory structure. The SAO is implemented as a 2-stage pipeline for pixel classification and application of SAO parameters, and it uses two three-layered parallel buffers to simplify pixel processing and reduce operation cycles. The proposed in-loop filter encoder architecture was designed in Verilog HDL and implemented with 205K logic gates in a TSMC 0.13 µm process. At 110 MHz, the proposed in-loop filter encoder can support real-time 4K Ultra HD video encoding at 30 fps.
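
To give a feel for the real-time constraint stated in this abstract, the sketch below estimates the average cycle budget per 32×32 CTU and per pixel at 110 MHz and 30 fps. The frame size of 3840×2160 is an assumption (the abstract only says "4K Ultra HD"), and the function name `cycle_budget` is purely illustrative.

```python
import math

# From the abstract: 110 MHz clock, 30 fps, 32x32 CTUs.
CLOCK_HZ = 110_000_000
FPS = 30
CTU_SIZE = 32

# Assumption (not stated in the abstract): "4K Ultra HD" means 3840x2160.
WIDTH, HEIGHT = 3840, 2160


def cycle_budget(width, height, fps, clock_hz, ctu_size):
    """Return (CTUs per frame, cycles per CTU, cycles per pixel)."""
    # Partial CTUs at the frame border still have to be processed.
    ctus_per_frame = math.ceil(width / ctu_size) * math.ceil(height / ctu_size)
    cycles_per_frame = clock_hz / fps
    cycles_per_ctu = cycles_per_frame / ctus_per_frame
    cycles_per_pixel = cycles_per_frame / (width * height)
    return ctus_per_frame, cycles_per_ctu, cycles_per_pixel


ctus, per_ctu, per_pixel = cycle_budget(WIDTH, HEIGHT, FPS, CLOCK_HZ, CTU_SIZE)
print(f"CTUs per frame:   {ctus}")
print(f"cycles per CTU:   {per_ctu:.1f}")
print(f"cycles per pixel: {per_pixel:.3f}")
```

Under these assumptions the budget is below one cycle per pixel, so a design of this kind has to process several pixels per cycle on average, which is consistent with the pipelined and parallel-buffer structure the abstract describes.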

Design and Implementation of Binary Image Normalization Hardware for High Speed Processing (고속 처리를 위한 이진 영상 정규화 하드웨어의 설계 및 구현)

  • 김형구;강선미;김덕진
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.5 / pp.162-167 / 1994
  • Binary image normalization can be used in several fields of image processing; in particular, a high-speed processing method and its hardware implementation are especially useful. Normalizing each character in character recognition requires a lot of processing time. Therefore, this research was carried out as part of a high-speed OCR (optical character reader) implementation, with the hardware forming a pipeline structure with the host computer to provide temporal parallelism. A general-purpose CPU, the MC68000, was used to implement the normalization process. Experimental results show that the normalization speed of the hardware is sufficient to implement a high-speed OCR whose recognition speed exceeds 140 characters per second.

A study on the development of high performance graphics system for simulation (Simulation을 위한 고성능 그래픽 시스템의 개발에 관한 연구)

  • 노갑선;박재현;장래혁;박정우;구경훈;이재영;권욱현
    • 제어로봇시스템학회:학술대회논문집 / 1992.10a / pp.321-326 / 1992
  • In this paper, a high-performance graphics system is proposed, and its hardware architecture and software structure are described. The developed graphics system is a multiprocessing system that uses six i860 RISC CPUs and supports the PHIGS language at the hardware level. The software is organized around the graphics pipeline, and the software modules are distributed across the processors to optimize performance. The implemented graphics system can draw about 100,000 3D polygons per second.

A Study of Modified Parallel Feistel Structure of Data Speed-up DES (DES의 데이터 처리속도 향상을 위한 변형된 병렬 Feistel 구조에 관한 연구)

  • Lee, Seon-Keun;Kim, Hyeoung-Kyun;Kim, Hwan-Yong
    • Journal of the Institute of Electronics Engineers of Korea SD / v.37 no.12 / pp.91-97 / 2000
  • With the rapid development of information and communication technology and the spread of the Internet, network communication now carries up-to-date functions such as electronic commerce, electronic currency, and electronic signatures, and will offer more advanced services in the future. Networks used for services such as electronic commerce demand safer and more transparent protection as well as faster performance. In this paper, to meet these demands, we propose DES (Data Encryption Standard) with a parallel Feistel structure, in which the Feistel structure that forms the basis of DES is transformed into a parallel form. The existing Feistel structure cannot use pipelining because of a structural problem of DES itself, namely the propagation of errors. The modified parallel Feistel structure can therefore greatly improve the performance of DES, which otherwise faces a trade-off between data processing speed and data security. In addition, applying the modified parallel Feistel structure to the method proposed in SEED yields better security and/or faster processing. Synopsys Ver. 1999.10 was used as the CAD tool for both synthesis and simulation.
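
For readers unfamiliar with the Feistel structure this abstract refers to, the sketch below shows a generic, serial Feistel network on 64-bit blocks. It is not DES and not the parallel structure proposed in the paper: the round function `toy_round` and the subkeys are hypothetical placeholders with no security value. The point is only that each round consumes the previous round's output, which is the serial dependency that makes a single block hard to pipeline.

```python
MASK32 = 0xFFFFFFFF


def toy_round(half, subkey):
    """Hypothetical round function; NOT the DES f-function and not secure."""
    return ((half * 0x9E3779B1) ^ subkey) & MASK32


def feistel_encrypt(block, subkeys):
    """Run a generic Feistel network over a 64-bit integer block."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in subkeys:
        # Each round needs the halves produced by the previous round,
        # so the rounds of one block cannot run concurrently.
        left, right = right, left ^ toy_round(right, k)
    # Final swap of the halves, as in classical Feistel ciphers.
    return (right << 32) | left


def feistel_decrypt(block, subkeys):
    """Decryption is the same network with the subkeys in reverse order."""
    return feistel_encrypt(block, list(reversed(subkeys)))


subkeys = list(range(1, 17))          # 16 illustrative subkeys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel_encrypt(plaintext, subkeys)
recovered = feistel_decrypt(ciphertext, subkeys)
print(hex(ciphertext), recovered == plaintext)
```

The round trip works for any round function, invertible or not, which is the property that makes the Feistel construction attractive as a cipher skeleton.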

A Study on the Rake Finger System Design for the System Performance Improvement in the Mobile Communications (시스템 효율향상을 위한 이동통신망 Rake Finger 시스템 설계에 관한 연구)

  • Lee, Seon-Keun;Lim, Soon-Ja
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1A / pp.31-36 / 2004
  • In this paper, we propose a new Rake Finger structure that uses a Walsh switch, a shared accumulator, and the pipeline-FWHT algorithm to reduce the signal processing complexity caused by the increasing number of data correlators. Functional simulation of the proposed architecture is performed with a Synopsys tool and timing simulation with a Compass tool. When the number of Walsh codes is N=4, the proposed data correlators require 160 additions, whereas the conventional ones require 512 additions, a reduction by a factor of about 3.2. The results also show that the data processing time of the proposed Rake Finger architecture is 90,496 ns, compared with 110,696 ns for the conventional one, making it 18.3% faster.
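
The FWHT mentioned in this abstract is the fast Walsh-Hadamard transform. The sketch below is a plain software version of the standard in-place butterfly algorithm (unnormalized, natural Hadamard ordering); it is not the paper's pipelined hardware design, and it does not attempt to reproduce the addition counts quoted above.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, unnormalized, Hadamard ordering.

    The input length must be a power of two. A new list is returned.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Each pass combines pairs of elements that are h apart
        # with one addition and one subtraction (a butterfly).
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


signal = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(signal))  # → [4, 2, 0, -2, 0, 2, 0, 2]
```

Each of the log2(n) passes is independent of the data values and has a fixed access pattern, which is why the transform maps naturally onto pipeline stages in hardware.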

Motion Estimation and Compensation based on Advanced DCT (변환 영역에서 개선된 DCT를 기반으로 한 움직임 예측 및 보상)

  • Jang, Young;Cho, Hyo-Moon;Cho, Sang-Bock
    • Proceedings of the KIEE Conference / 2007.04a / pp.38-40 / 2007
  • In this paper, we propose a novel architecture based on the DCT (Discrete Cosine Transform) for ME (Motion Estimation) and MC (Motion Compensation). Traditional DCT-based ME and MC algorithms did not take advantage of the coarseness of the 2-dimensional DCT (2-D DCT) coefficients to reduce operation time. Therefore, we derive a recursion equation for transform-domain ME and MC and design the structure using highly regular, parallel, and pipelined processing elements. The main difference from other approaches is that the IDCT block is removed by working in the transform domain. As a result, our algorithm is more efficient in practical image processing applications such as DVR (Digital Video Recorder) systems. We present simulation results compared with spatial-domain methods in terms of calculation cost, compression ratio, and peak signal-to-noise ratio (PSNR); they show a reduction in calculation cost.

An Implementation of High Speed Rendering to Process Touch Screen Multiple Inputs based on FPGA (FPGA 기반의 터치스크린 다중입력처리를 위한 고속 렌더링 구현)

  • Yoon, Junhan;Kim, Jin Heon
    • Journal of Korea Multimedia Society / v.20 no.11 / pp.1803-1810 / 2017
  • Detecting the touch position on a touch screen and displaying it on the display panel purely in software requires a large amount of processing time. In this paper, we propose a hardware method for outputting the touched information on the screen in order to reduce the response delay. The FPGA module, designed to output the HDMI signal to the display module, receives the touch information as a serial data signal that includes touch coordinates, point size, and color information. The module then renders the image using the HDMI signal input to the module and the touch information. Because this method has a pipeline structure, it reduces the delay in outputting the touch information compared with the conventional software processing method.