• Title/Summary/Keyword: pipelined architecture

Search Result 176, Processing Time 0.024 seconds

Design of High Performance Dual Channel Pipelined Interpolators for H.264 Decoder (이중 채널 파이프라인 구조의 H.264용 고성능 보간 연산기 설계)

  • Lee, Chan-Ho
    • Journal of IKEEE
    • /
    • v.13 no.4
    • /
    • pp.110-115
    • /
    • 2009
  • The motion compensation is the most time-consuming and complex unit in the H.264 decoder. The performance of the motion compensation is determined by the calculation of pixel interpolation. The quarter-pixel interpolation is achieved using 6-tap horizontal or vertical FIR filters for luminance data and bilinear FIR filters for chroma data. We propose the architecture for interpolation of luminance and chroma data in H.264 decoders. It is composed of dual-channel pipelined processing elements and can interpolate integer-, half- and quarter-pixel data. The number of the processing cycles is different depending on the position. The processing elements are composed of adders and shifters to reduce the complexity while the accuracy of the pixel data are maintained. We design interpolators for luminance and chroma data using Verilog-HDL and verify the function and performance by implementing using an FPGA.

  • PDF

A High-Speed CMOS A/D Converter Using an Acquistition-Time Minimization Technique) (정착시간 최소화 기법을 적용한 고속 CMOS A/D 변환기 설계)

  • 전병열;전영득;이승훈
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.36C no.5
    • /
    • pp.57-66
    • /
    • 1999
  • This paper describes a 12b, 50 Msample/s CMOS AID converter using an acquisition-time minimization technique for the high-speed sampling rate of 50 MHz level. The proposed ADC is implemented in a $0.35\mu\textrm{m}$ double-poly five-metal n-well CMOS technology and adopts a typical multi-step pipelined architecture to optimize sampling rate, resolution, and chip area. The speed limitation of conventional pipelined ADCs comes from the finite bandwidth and resulting speed of residue amplifiers. The proposed acquisition-time minimization technique reduces the acquisition time of residue amplifiers and makes the waveform of amplifier outputs smooth by controlling the operating current of residue amplifiers. The simulated power consumption of the proposed ADC is 197 mW at 3 V with a 50 MHz sampling rate. The chip size including pads is $3.2mm\times3.6mm$.

  • PDF

REGO: REconfiGurable system emulatOr (레고 : 재구성 가능한 시스템 에뮬레이터)

  • Kim, Nam-Do;Yang, Se-Yang
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.39 no.2
    • /
    • pp.91-103
    • /
    • 2002
  • For massive FPGA based emulator, the interconnection architecture and the transmission method of signals between FPGA's are important elements which decide speed of emulation and extendability of emulator. Existing FPGA-based emulation system is faced the problems of which the emulation speed getting slow drastically as the complexity of circuit increases. In this paper, we proposed a new innovative emulation architecture that has high resource usage rate and makes the fast emulation Possible. The emulator with very unique hierarchical ring topology Presented here has merits to overcome FPGA pin limitation by connecting each FPGA into a set of pipelined rings, and it also makes emulation speed at the tens of MHz at least even at system level where the verification complexity can easily exceed the verification capability of designers.

High-throughput and low-area implementation of orthogonal matching pursuit algorithm for compressive sensing reconstruction

  • Nguyen, Vu Quan;Son, Woo Hyun;Parfieniuk, Marek;Trung, Luong Tran Nhat;Park, Sang Yoon
    • ETRI Journal
    • /
    • v.42 no.3
    • /
    • pp.376-387
    • /
    • 2020
  • Massive computation of the reconstruction algorithm for compressive sensing (CS) has been a major concern for its real-time application. In this paper, we propose a novel high-speed architecture for the orthogonal matching pursuit (OMP) algorithm, which is the most frequently used to reconstruct compressively sensed signals. The proposed design offers a very high throughput and includes an innovative pipeline architecture and scheduling algorithm. Least-squares problem solving, which requires a huge amount of computations in the OMP, is implemented by using systolic arrays with four new processing elements. In addition, a distributed-arithmetic-based circuit for matrix multiplication is proposed to counterbalance the area overhead caused by the multi-stage pipelining. The results of logic synthesis show that the proposed design reconstructs signals nearly 19 times faster while occupying an only 1.06 times larger area than the existing designs for N = 256, M = 64, and m = 16, where N is the number of the original samples, M is the length of the measurement vector, and m is the sparsity level of the signal.

Design of an Area-Efficient Survivor Path Unit for Viterbi Decoder Supporting Punctured Codes (천공 부호를 지원하는 Viterbi 복호기의 면적 효율적인 생존자 경로 계산기 설계)

  • Kim, Sik;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.3A
    • /
    • pp.337-346
    • /
    • 2004
  • Punctured convolutional codes increase transmission efficiency without increasing hardware complexity. However, Viterbi decoder supporting punctured codes requires long decoding length and large survivor memory to achieve sifficiently low bit error rate (BER), when compared to the Viterbi decoder for a rate 1/2 convolutional code. This Paper presents novel architecture adopting a pipelined trace-forward unit reducing survivor memory requirements in the Viterbi decoder. The proposed survivor path architecture reduces the memory requirements by removing the initial decoding delay needed to perform trace-back operation and by accelerating the trace-forward process to identify the survivor path in the Viterbi decoder. Experimental results show that the area of survivor path unit has been reduced by 16% compared to that of conventional hybrid survivor path unit.

An Architecture for IEEE 802.11n LDPC Decoder Supporting Multi Block Lengths (다중 블록길이를 지원하는 IEEE 802.11n LDPC 복호기 구조)

  • Na, Young-Heon;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.05a
    • /
    • pp.798-801
    • /
    • 2010
  • This paper describes an efficient architecture for LDPC(Low-Density Parity Check) decoder, which supports three block lengths (648, 1,296, 1,944) of IEEE 802.11n standard. To minimize hardware complexity, the min-sum algorithm and block-serial layered structure are adopted in DFU(Decoding Function Unit) which is a main functional block in LDPC decoder. The optimized H-ROM structure for multi block lengths reduces the ROM size by 42% as compared to the conventional method. Also, pipelined memory read/write scheme for inter-layer DFU operations is proposed for an optimized operation of LDPC decoder.

  • PDF

A 2.5 V 10b 120 MSample/s CMOS Pipelined ADC with High SFDR (높은 SFDR을 갖는 2.5 V 10b 120 MSample/s CMOS 파이프라인 A/D 변환기)

  • Park, Jong-Bum;Yoo, Sang-Min;Yang, Hee-Suk;Jee, Yong;Lee, Seung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.39 no.4
    • /
    • pp.16-24
    • /
    • 2002
  • This work describes a 10b 120 MSample/s CMOS pipelined A/D converter(ADC) based on a merged-capacitor switching(MCS) technique for high signal processing speed and high resolution. The proposed ADC adopts a typical multi-step pipelined architecture to optimize sampling rate, resolution, and chip area, and employs a MCS technique which improves sampling rate and resolution reducing the number of unit capacitor used in the multiplying digital-to-analog converter (MDAC). The proposed ADC is designed and implemented in a 0.25 um double-poly five-metal n-well CMOS technology. The measured differential and integral nonlinearities are within ${\pm}$0.40 LSB and ${\pm}$0.48 LSB, respectively. The prototype silicon exhibits the signal-to-noise-and-distortion ratio(SNDR) of 58 dB and 53 dB at 100 MSample/s and 120 MSample/s, respectively. The ADC maintains SNDR over 54 dB and the spurious-free dynamic range(SFDR) over 68 dB for input frequencies up to the Nyquist frequency at 100 MSample/s. The active chip area is 3.6 $mm^2$(= 1.8 mm ${\times}$ 2.0 mm) and the chip consumes 208 mW at 120 MSample/s.

A Single-Chip Design of Two-Dimensional Digital Riler with CSD Coefficients (CSD 계수에 의한 이차원 디지탈필터의 단일칩설계)

  • 문종억;송낙운;김창민
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.1
    • /
    • pp.241-250
    • /
    • 1996
  • In this work, an improved architecture of two-dimensional digital filter(2D DF) is suggested, and then the filter is simulated by C, VHDL language and related layouts are designed by Berkeley CAD tools. The 2D DF consists of one-dimensional digital filters and delay lines. For one-dimensional digital filter(1D DF) case, once filter coefficients are represented by canonical signed digit formats, multiplications are exected by hardwired-shifting methods. The related bit numbers are handled to prevent picture quality degradation and pipelined adder architectures are adopted in each tap and output stage to speed up the filter. For delay line case, line-sharing DRAM is adopted to improve power dissipation and speed. The filter layout is designed by semi/full custom methods considering regularity and speed improvement, and normal operation is confirmed by simulation.

  • PDF

Design and Implementation of a High Speed Pager Based on FLEX Protocol (FLEX 방식 고속 무선호출 단말기 설계 및 구현)

  • 오병문;이동원;김영철
    • Proceedings of the IEEK Conference
    • /
    • 2000.11a
    • /
    • pp.205-208
    • /
    • 2000
  • In this paper, we have designed a pager based on the FLEX protocol. The pager consists of a decoder, a MCU, a SPI, and a User interface. The decoder contains the following blocks: synchronizer, de-interleaver, error corrector, packet builder. The decoded data is converted to SPI packets for communication between the MCU and the FLEX decoder. The host MCU is a RISC pipelined architecture, so it processes data at high speed and also sends messages to user interface. We have designed the proposed pager as structural modeling using VHDL language. Then, We simulated and synthesized it using tool of SYNOPSYS corporation.

  • PDF

The CORDIC Circuit of Redundant Signed Binary Number (Redundant Signed Binary Number에 의한 CORDIC 회로)

  • 김승열;김용대;한선경;유영갑
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.40 no.6
    • /
    • pp.1-8
    • /
    • 2003
  • A novel CORDIC circuit is presented based on a redundant number system eliminating global carry Propagation. The number format employs a new recoding scheme similar to the Booth receding resolving carry problems in addition. A pipelined architecture is introduced having a constant scale factor in its computation of trigonometric functions. The operational time of the circuit is constant independent of the number of operand digits.