• Title/Summary/Keyword: Standard cell library

Design of Luma and Chroma Sub-pixel Interpolator for H.264 Motion Estimation (H.264 움직임 예측을 위한 Luma와 Chroma 부화소 보간기 설계)

  • Lee, Seon-Young;Cho, Kyeong-Soon
    • The KIPS Transactions: Part A / v.18A no.6 / pp.249-254 / 2011
  • This paper describes an efficient design of the interpolation circuit that generates the luma and chroma sub-pixels for H.264 motion estimation. The circuit based on the proposed architecture does not require any input data buffering and processes the horizontal, vertical and diagonal sub-pixel interpolations in parallel. The performance of the circuit is further improved by simultaneously processing the 1/2-pixel and 1/4-pixel interpolations for luma components and the 1/8-pixel interpolations for chroma components. To reduce the circuit size, we store the intermediate data required to process all the interpolations in parallel in internal SRAMs instead of registers. We described the proposed circuit at register transfer level and verified its operation on an FPGA board. We also synthesized the gate-level circuit using a 130nm CMOS standard cell library. It consists of 20,674 gates and has a maximum operating frequency of 244MHz. The total number of SPSRAM bits used in our circuit is 3,232. The size of our circuit (including logic gates and SRAMs) is smaller than that of other designs, while its performance remains comparable.
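
The abstract does not give the interpolation formulas, so the following is only a minimal sketch of the sub-pixel arithmetic as the H.264 standard defines it (6-tap filter for luma half-pixels, rounding average for quarter-pixels, bilinear weighting for chroma 1/8-pixels). The function names are hypothetical, and nothing here reflects the paper's parallel architecture or SRAM organization.

```python
def clip255(x):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, x))


def luma_half_pel(e, f, g, h, i, j):
    """Half-pixel sample between g and h using the 6-tap filter (1,-5,20,20,-5,1)/32."""
    return clip255((e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5)


def luma_quarter_pel(a, b):
    """Quarter-pixel sample: rounding average of two neighbouring samples."""
    return (a + b + 1) >> 1


def chroma_eighth_pel(a, b, c, d, dx, dy):
    """Chroma sample at 1/8-pixel offset (dx, dy), 0 <= dx, dy < 8.

    a, b are the top-left and top-right integer samples;
    c, d are the bottom-left and bottom-right integer samples.
    """
    return ((8 - dx) * (8 - dy) * a + dx * (8 - dy) * b
            + (8 - dx) * dy * c + dx * dy * d + 32) >> 6


if __name__ == "__main__":
    print(luma_half_pel(100, 100, 100, 100, 100, 100))
    print(luma_quarter_pel(100, 103))
    print(chroma_eighth_pel(10, 20, 30, 40, 4, 4))
```

Because every output depends only on a small window of integer samples, the horizontal, vertical and diagonal positions can in principle be computed independently, which is what makes a parallel hardware datapath plausible.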

Design of H.264 Deblocking Filter for Low-Power Mobile Multimedia SoCs (저전력 휴대 멀티미디어 SoC를 위한 H.264 디블록킹 필터 설계)

  • Koo Jae-Il;Lee Seongsoo
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.1 s.343 / pp.79-84 / 2006
  • This paper proposes a novel H.264 deblocking filter for low-power mobile multimedia SoCs. In the H.264 deblocking filter, filtering can be skipped on some pixels when the pixel value differences satisfy specific conditions. Furthermore, filtering can be skipped entirely when the quantization parameter is less than 16. Based on these features, power consumption can be significantly reduced by shutting down the deblocking filter partially or as a whole. The proposed deblocking filter can shut down some or all of its blocks with simple control circuits. Common hardware performs both horizontal and vertical filtering. It was implemented on a silicon chip using a $0.35{\mu}m$ standard cell library. The gate count is about 20,000 gates. The maximum operating frequency is 108MHz. The maximum throughput is 30 frames/s with the CCIR601 image format.
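
The two skip conditions mentioned in the abstract can be sketched as follows. This is an illustration under assumptions, not the paper's control circuit: the per-edge test follows the sample-level condition in the H.264 standard, the `alpha` and `beta` thresholds are passed in as parameters rather than looked up from the standard's tables, and the function names and the `qp_threshold` parameter are hypothetical.

```python
def filter_enabled_for_qp(qp, qp_threshold=16):
    """Whole-filter gate: per the abstract, filtering can be skipped when QP < 16."""
    return qp >= qp_threshold


def filter_samples(bs, p1, p0, q0, q1, alpha, beta):
    """Per-edge gate: filter only if the boundary strength is non-zero and
    the sample differences across the edge are below the thresholds."""
    return (bs != 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


if __name__ == "__main__":
    # Small step across the edge: treated as a blocking artifact and filtered.
    print(filter_samples(bs=2, p1=100, p0=101, q0=104, q1=105, alpha=10, beta=4))
    # Large step across the edge: treated as a real image edge and skipped.
    print(filter_samples(bs=2, p1=100, p0=101, q0=160, q1=161, alpha=10, beta=4))
    # Low QP: the whole filter can be shut down.
    print(filter_enabled_for_qp(12))
```

In hardware, either predicate evaluating to false is an opportunity to gate the clock or power of the corresponding filter block.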

Efficient VLSI Architecture for Factorization in Soft-Decision Reed-Solomon List Decoding (연판정 Reed-Solomon 리스트 디코딩의 Factorization을 위한 효율적인 VLSI 구조)

  • Lee, Sung-Man;Park, Tae-Guen
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.11 / pp.54-64 / 2010
  • Reed-Solomon (RS) codes are the most widely used error-correcting codes in digital communications and data storage. Recently, Sudan proposed a list-decoding algorithm for RS codes. A list decoder has a larger decoding radius than conventional hard-decision decoding algorithms and returns more than one candidate polynomial. However, the algorithm includes interpolation and factorization steps that demand massive computation. In this paper, an efficient architecture and processing schedule are proposed. The architecture consists of an R-MAC, memories, and a control unit. The R-MAC computes both the RC and PU steps, which are the main parts of the factorization algorithm. The proposed architecture achieves higher hardware utilization efficiency (HUE) and throughput by using an efficient processing schedule and memory architecture. The architecture is also scalable and can be designed flexibly for various applications. We designed and synthesized our architecture using the Dongbu-Anam $0.18{\mu}m$ standard cell library; the maximum clock frequency is 330MHz.

Efficient Motion Estimation Algorithm and Circuit Architecture for H.264 Video CODEC (H.264 비디오 코덱을 위한 효율적인 움직임 추정 알고리즘과 회로 구조)

  • Lee, Seon-Young;Cho, Kyeong-Soon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.12 / pp.48-54 / 2010
  • This paper presents a high-performance architecture for an integer-pel motion estimation circuit for the H.264 video CODEC. The full search algorithm guarantees the best results by examining all candidate blocks. However, it requires a huge amount of computation and data. Many fast search algorithms have been proposed to reduce the computational effort. The disadvantage of these algorithms is that memory access is very irregular and data reuse is difficult. In this paper, we propose an efficient integer-pixel motion estimation algorithm and a circuit architecture that improve the processing speed and reduce the external memory bandwidth. The proposed circuit supports seven variable block sizes and generates 41 motion vectors. We described the proposed high-performance motion estimation circuit at RTL and verified its operation on an FPGA board. The circuit, synthesized using a 130nm CMOS standard cell library, processes 139.8 1080HD ($1,920{\times}1,088$) image frames per second and supports up to H.264 level 5.1.
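
The figure of 41 motion vectors follows from counting the partitions of one 16x16 macroblock over the seven H.264 block sizes. The short sketch below reproduces that count and, as a derived figure that the abstract does not state, the macroblock rate implied by 139.8 frames per second at 1,920x1,088. The variable and function names are illustrative only.

```python
# The seven H.264 variable block sizes (width, height).
BLOCK_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def partitions_per_macroblock(width, height, mb_size=16):
    """Number of blocks of the given size that tile one macroblock."""
    return (mb_size // width) * (mb_size // height)


# One motion vector per partition, summed over all seven block sizes.
counts = {size: partitions_per_macroblock(*size) for size in BLOCK_SIZES}
total_motion_vectors = sum(counts.values())

# Macroblocks per 1,920x1,088 frame and the implied macroblock rate.
macroblocks_per_frame = (1920 // 16) * (1088 // 16)
macroblocks_per_second = macroblocks_per_frame * 139.8

if __name__ == "__main__":
    print(counts)
    print(total_motion_vectors)    # 1 + 2 + 2 + 4 + 8 + 8 + 16
    print(macroblocks_per_frame)
    print(macroblocks_per_second)
```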

Efficient systolic VLSI architecture for division in $GF(2^m)$ ($GF(2^m)$ 상에서의 나눗셈연산을 위한 효율적인 시스톨릭 VLSI 구조)

  • Kim, Ju-Young;Park, Tae-Geun
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.3 s.357 / pp.35-42 / 2007
  • Finite-field division can be applied to elliptic curve cryptosystems. However, an efficient algorithm and hardware design are required, since finite-field division takes a long time to compute. In this paper, we propose a radix-4 systolic divider over $GF(2^m)$ with competitive area and performance. The algorithm of the proposed divider is developed mathematically, and a new counter structure is proposed so that it maps onto low-cost systolic cells, making the proposed systolic architecture suitable for VLSI design. Compared to bit-parallel, bit-serial and digit-serial dividers, the proposed divider offers relatively high performance at low cost. We designed and synthesized a $GF(2^{193})$ finite-field divider using the Dongbu-Anam $0.18{\mu}m$ standard cell library; the maximum clock frequency is 400MHz.
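
To make the operation concrete, here is a small software sketch of division in $GF(2^m)$. It is not the paper's radix-4 systolic algorithm: it divides by multiplying with the inverse, computed as $b^{2^m-2}$, and it assumes the small field $GF(2^8)$ with the reduction polynomial $x^8+x^4+x^3+x+1$ (0x11B) purely for illustration. All function names are hypothetical.

```python
def gf_mul(a, b, poly=0x11B, m=8):
    """Multiply two field elements (shift-and-add with modular reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1
        if (a >> m) & 1:         # degree reached m: reduce by the field polynomial
            a ^= poly
    return result


def gf_inv(a, poly=0x11B, m=8):
    """Multiplicative inverse via a^(2^m - 2), using square-and-multiply."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse in GF(2^m)")
    result, base, exp = 1, a, (1 << m) - 2
    while exp:
        if exp & 1:
            result = gf_mul(result, base, poly, m)
        base = gf_mul(base, base, poly, m)
        exp >>= 1
    return result


def gf_div(a, b, poly=0x11B, m=8):
    """Divide a by b in GF(2^m)."""
    return gf_mul(a, gf_inv(b, poly, m), poly, m)


if __name__ == "__main__":
    product = gf_mul(0x57, 0x83)
    print(hex(product))
    print(hex(gf_div(product, 0x83)))
```

The inverse alone costs on the order of $m$ squarings and multiplications, which illustrates why dedicated divider hardware is attractive for large fields such as $GF(2^{193})$.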

A Design of High Performance Motion Estimation Hardware for H.264/AVC (H.264/AVC를 위한 고성능 움직임 예측 하드웨어 설계)

  • Park, Seungyong;Ryoo, Kwangki
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.1 / pp.124-130 / 2013
  • In this paper, a new motion estimation algorithm with low computational complexity is proposed to improve the performance of H.264/AVC. The proposed architecture uses the direction of the median motion vector, which is computed from the motion vectors of the three neighboring macroblocks in Integer Motion Estimation. By using the direction of this vector, the proposed architecture has a single computational level instead of multiple computational levels in Integer Motion Estimation. The proposed motion estimation hardware is synthesized using the TSMC 0.18um standard cell library. The synthesis result shows that the gate count is about 217.92K at 166MHz, an improvement of about 69% compared with the previous design.
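
A minimal sketch of the median motion vector mentioned above, assuming the usual component-wise median over three neighbouring motion vectors. How the paper turns the vector into a "direction" is not stated in the abstract; the `direction` helper below, which simply takes the sign of each component, is a hypothetical stand-in.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three (x, y) motion vectors."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


def direction(mv):
    """Hypothetical direction code: sign (-1, 0, +1) of each component."""
    def sign(v):
        return (v > 0) - (v < 0)
    return (sign(mv[0]), sign(mv[1]))


if __name__ == "__main__":
    predicted = median_mv((1, 5), (3, -2), (2, 0))
    print(predicted)
    print(direction(predicted))
```

A search that starts from this predicted direction can restrict the candidate positions it examines, which is one plausible way to replace a multi-level search with a single level.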

Implementation of 24bit Sigma-delta D/A Converter for an Audio (오디오용 24bit 시그마-델타 D/A 컨버터 구현)

  • Heo, Jeong-Hwa;Park, Sang-Bong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.4 / pp.53-58 / 2008
  • This paper presents the design of a sigma-delta D/A converter with high resolution and low power consumption. It reorganizes the input data, taken from the output of an A/D converter, according to the LJ, RJ or I2S mode and the bit mode. The D/A converter reconstructs the original analog signal through HBF, hold, and 5th-order CIFB (Cascaded Integrators with distributed Feedback as well as distributed input coupling) sigma-delta modulation blocks. To reduce chip area and improve performance, it uses repeated addition instead of multiplication. In addition, the half-band filters, which have similar architectures, are combined into one block, and a sample-and-hold block is used instead of a sinc filter. The result is a simple D/A converter with reduced area. The filters were analyzed using MATLAB. The top block was designed top-down in Verilog. The designed block is fabricated using the Samsung 0.35um CMOS standard cell library. The chip area is 1500um x 1500um.

Design of H.264 deblocking filter for the Low-Power Portable Multimedia (저전력 휴대용 멀티미디어를 위한 H.264 디블록킹 필터 설계)

  • Park, Sang Woo;Heo, Jeong Hwa;Park, Sang Bong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.4 / pp.59-65 / 2008
  • This paper proposes an H.264 deblocking filter for low-power portable multimedia devices. In the H.264 deblocking filter, each of the 8 input pixels requires its own filtering operation, and these operations share a common structure. By sharing common filter coefficients and registers, we designed and implemented a module with a smaller gate count. Moreover, filtering is skipped for some or all pixels when specific conditions are met, which avoids running filtering modules that require many operations. In the core filtering modules, we achieve gate count reductions of 33.31% and 10.85% compared with the filtering modules of conventional deblocking filters. The proposed low-power deblocking filter is implemented using the Samsung 0.35um standard cell library; the maximum operating frequency is 108MHz, and the maximum throughput is 33.03 frames/s with the CCIR601 image format.

A design of LDPC decoder supporting multiple block lengths and code rates of IEEE 802.11n (다중 블록길이와 부호율을 지원하는 IEEE 802.11n용 LDPC 복호기 설계)

  • Kim, Eun-Suk;Park, Hae-Won;Na, Young-Heon;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.132-135 / 2011
  • This paper describes a multi-mode LDPC decoder that supports three block lengths (648, 1296, 1944) and four code rates (1/2, 2/3, 3/4, 5/6) of the IEEE 802.11n WLAN standard. To minimize hardware complexity, it adopts a block-serial (partially parallel) architecture based on the layered decoding scheme. A novel memory reduction technique, devised using the min-sum decoding algorithm, reduces the size of the check-node memory by 47% compared to the conventional method. The designed LDPC decoder is verified by FPGA implementation and synthesized with a $0.18-{\mu}m$ CMOS cell library. It has 219,100 gates and 45,036 bits of RAM, and the estimated throughput is about 164~212 Mbps at 50 MHz@2.5V.
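
The abstract does not describe its memory reduction technique, so the sketch below shows only the generic min-sum check-node update and the property that makes compact storage possible: every outgoing message is determined by the two smallest input magnitudes, the position of the smallest, and the input signs. This is an assumption-laden illustration, not the authors' method, and the function name is hypothetical.

```python
def min_sum_check_node(llrs):
    """Min-sum check-node update for one check node.

    llrs: incoming variable-to-check messages (at least two values).
    Returns the outgoing check-to-variable message for each edge, where
    each output excludes that edge's own input.
    """
    magnitudes = [abs(x) for x in llrs]

    # Compact state: smallest magnitude, its position, second smallest.
    min1_index = min(range(len(llrs)), key=magnitudes.__getitem__)
    min1 = magnitudes[min1_index]
    min2 = min(m for i, m in enumerate(magnitudes) if i != min1_index)

    # Product of all input signs (+1 or -1).
    sign_product = 1
    for x in llrs:
        if x < 0:
            sign_product = -sign_product

    outputs = []
    for i, x in enumerate(llrs):
        magnitude = min2 if i == min1_index else min1
        own_sign = -1 if x < 0 else 1
        # Multiplying by own_sign removes this edge's sign from the product.
        outputs.append(sign_product * own_sign * magnitude)
    return outputs


if __name__ == "__main__":
    print(min_sum_check_node([2.0, -3.0, 1.0, -4.0]))
```

Storing `min1`, `min2`, `min1_index` and one sign bit per edge, rather than a full message per edge, is a common way to shrink check-node memory in min-sum decoders.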

An implementation of block cipher algorithm HIGHT for mobile applications (모바일용 블록암호 알고리듬 HIGHT의 하드웨어 구현)

  • Park, Hae-Won;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.05a / pp.125-128 / 2011
  • This paper describes an efficient hardware implementation of the HIGHT block cipher algorithm, which was approved as a cryptographic algorithm standard by KATS (Korean Agency for Technology and Standards) and ISO/IEC. The HIGHT algorithm, which is suitable for ubiquitous computing devices such as a sensor in a USN or an RFID tag, encrypts a 64-bit data block with a 128-bit cipher key to produce a 64-bit ciphertext, and vice versa. For an area-efficient and low-power implementation, we optimize the round transform block and key scheduler to share hardware resources between encryption and decryption. The HIGHT64 core, synthesized using a $0.35-{\mu}m$ CMOS cell library, consists of 3,226 gates, and the estimated throughput is 150 Mbps at 80 MHz@2.5 V.
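
As a rough consistency check on the figures above (a derived estimate, not a number given in the abstract), the throughput, clock frequency and block size together imply the number of clock cycles spent per 64-bit block:

$$N_{\text{cycles}} \approx \frac{64\ \text{bits} \times 80\ \text{MHz}}{150\ \text{Mbps}} \approx 34.1$$

HIGHT uses 32 rounds, so about 34 cycles per block would be consistent with processing roughly one round per clock cycle plus a small overhead, although the abstract does not state the actual cycle count.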
