• Title/Summary/Keyword: Gate count

Search Result 173, Processing Time 0.213 seconds

A Design of High Performance Motion Estimation Hardware for H.264/AVC (H.264/AVC를 위한 고성능 움직임 예측 하드웨어 설계)

  • Park, Seungyong;Ryoo, Kwangki
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.1
    • /
    • pp.124-130
    • /
    • 2013
  • In this paper, a new motion estimation algorithm with low-computational complexity is proposed to improve the performance of H.264/AVC. The proposed architecture uses the directions of the median motion vector which is computed by the motion vectors of the three neighbor macroblocks in Integer Motion Estimation. By using the directions of the vector, the proposed architecture has a single computational level instead of multi-computational levels in Integer Motion Estimation. The proposed motion estimation is synthesized using the TSMC 0.18um standard cell library. The synthesis result shows that the gate count is about 217.92K at 166MHz and it was improved about 69% compared with previous one.

Design of H.264 deblocking filter for the Low-Power Portable Multimedia (저전력 휴대용 멀티미디어를 위한 H.264 디블록킹 필터 설계)

  • Park, Sang Woo;Heo, Jeong Hwa;Park, Sang Bong
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.8 no.4
    • /
    • pp.59-65
    • /
    • 2008
  • This paper proposed a H.264 deblocking filter for the portable low-power multimedia. In H.264 deblocking filter, total 8 input pixels in filtering operations needs own filtering operation process respectively, and each filtering process has common structures for each filtering operation. By sharing common filter coefficients and registers, we have designed and implemented an smaller gated module, and moreover filtering operations are skipped on some or whole pixels what if we use some specific condition to operate filtering modules that need lots of operations. In the core of filtering modules, we achieve 33.31% and 10.85% gate count reduction compared with those of filtering modules of the conventional deblocking filter papers. The proposed low-power deblocking filter is implemented by using samsung 0.35um standard cell library technology, the maximum operationh frequency is 108MHz, and the maximum throughput is 33.03 frames/s with CCIR601 image format.

  • PDF

Low-power Structure for H.264 Deblocking Filter (H.264용 디블로킹 필터의 저전력 구조)

  • Jang Young-Beom;Oh Se-Man;Park Jin-Su;Han Kyu-Hoon;Kim Soo-Hong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.3 s.309
    • /
    • pp.92-99
    • /
    • 2006
  • In this paper, a low-power deblocking filter structure for H.264 video coding algorithm is proposed. By sharing addition hardware for common filter coefficients, we have designed an efficient deblocking filter structure. Proposed deblocking filter utilizes MUX and DEMUX circuits for input data sharing and shows 44.2% reduction for add operation. In the HDL coding simulation and FPGA implementation, we achieved 19.5% and 19.4% gate count reduction, respectively, comparison with the conventional deblocking filter structure. Due to its efficient processing scheme, the proposed structure can be widely used in H.264 encoding and decoding SoC.

A Hardware Design of Effective Intra Prediction Angular Mode Decision for HEVC Encoder (HEVC 부호기를 위한 효율적인 화면내 예측 Angular 모드 결정 하드웨어 설계)

  • Park, Seungyong;Choi, Juyong;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.4
    • /
    • pp.767-773
    • /
    • 2017
  • In this paper, we propose a design of Intra prediction angular mode decision for HEVC encoder. Intra prediction coding of HEVC is a method for predicting a current block by referring to samples reconstructed around a current block. Intra prediction supports a total of 35 modes with 1 DC mode, 1 Planar mode, and 33 Angular modes. Intra prediction coding of HEVC works by performing all 35 modes for efficient encoding. However, in order to process all of the 35 modes, the computational complexity and operational time required are high. Therefore, this paper proposes comparing the difference in the value of the original pixel, using an algorithm that determines angular mode efficiently. This new algorithm reduces the Hardware size. The hardware which is proposed was designed using Verilog HDL and was implemented in 65nm technology. Its gate count is 14.9K and operating speed is 2GHz.

Design of High Speed Binary Arithmetic Encoder for CABAC Encoder (CABAC 부호화기를 위한 고속 이진 산술 부호화기의 설계)

  • Park, Seungyong;Jo, Hyungu;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.4
    • /
    • pp.774-780
    • /
    • 2017
  • This paper proposes an efficient binary arithmetic encoder hardware architecture for CABAC encoding, which is an entropy coding method of HEVC. CABAC is an entropy coding method that is used in HEVC standard. Entropy coding removes statistical redundancy and supports a high compression ratio of images. However, the binary arithmetic encoder causes a delay in real time processing and parallel processing is difficult because of the high dependency between data. The operation of the proposed CABAC BAE hardware structure is to separate the renormalization and process the conventional iterative algorithm in parallel. The new scheme was designed as a four-stage pipeline structure that can reduce critical path optimally. The proposed CABAC BAE hardware architecture was designed with Verilog HDL and implemented in 65nm technology. Its gate count is 8.07K and maximum operating speed of 769MHz. It processes the four bin per clock cycle. Maximum processing speed increased by 26% from existing hardware architectures.

Hardware Design of High Performance CAVLC Encoder (H.264/AVC를 위한 고성능 CAVLC 부호화기 하드웨어 설계)

  • Lee, Yang-Bok;Ryoo, Kwang-Ki
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.49 no.3
    • /
    • pp.21-29
    • /
    • 2012
  • This paper presents optimized searching technique to improve the performance of H.264/AVC. By using the proposed forward and backward searching algorithm, redundant cycles of latency for data reordering can be removed. Furthermore, in order to reduce the total number of execution cycles of CAVLC encoder, early termination mode and two stage pipelined architecture are proposed. The experimental result shows that the proposed architecture needs only 36.0 cycles on average for each $16{\times}16$ macroblock encoding. The proposed architecture improves the performance by 57.8% than that of previous designs. The proposed CAVLC encoder was implemented using Verilog HDL and synthesized with Magnachip $0.18{\mu}m$ standard cell library. The synthesis result shows that the gate count is about 17K with 125Mhz clock frequency.

Design of Unified HEVC/VP9 4×4 Transform Block (HEVC/VP9 4×4 Transform 통합 블록 설계)

  • Jung, Seulkee;Lee, Seongsoo
    • Journal of IKEEE
    • /
    • v.19 no.3
    • /
    • pp.392-399
    • /
    • 2015
  • This paper proposes a unified $4{\times}4$ transform architecture for HEVC and VP9 codec to reduce hardware size. It performs HEVC $4{\times}4$ IDCT, HEVC $4{\times}4$ IDST, VP9 $4{\times}4$ IDCT, and VP9 $4{\times}4$ IADST in a unified hardware. HEVC $4{\times}4$ IDCT and VP9 $4{\times}4$ IDCT have same IDCT computation except for the scales of coefficients. Similarly, HEVC $4{\times}4$ IDST and VP9 $4{\times}4$ IADST have same IDST computation except for the scales of coefficients. Furthermore, IDCT and IDST have quite a lot of similarity, so they can share some hardwares in common. So the proposed hardware performs all 4 operations in a unified hardware, where each operation has its own multiplication coefficients with shared butterfly adders. The synthesized block in 0.18 um technology is 6,679 gates, and the gate count is reduced by 25.3% in comparison with conventional designs.

Design of Modified JTAG for Debuggers of RISC Processors (RISC 프로세서의 디버거를 위한 변형된 JTAG 설계)

  • Xu, Jingzhe;Park, Hyung-Bae;Jung, Seung-Pyo;Park, Ju-Sung
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.7
    • /
    • pp.65-75
    • /
    • 2011
  • As the technology of SoC design has been developed, the debugging is more and more important and users want a fast and reliable debugger. This paper deals with an implementation of the fast debugger which can reduce a debugging processing cycle by designing a modified JTAG suitable for a new RISC processor debugger. Designed JTAG is embedded to the OCD of Core-A and works with SW debugger. We confirmed the functions and reliability of the debugger. By comparing to the original JTAG system, the debugging processing cycle of the proposed JTAG is reduced at 8.5~72.2% by each debugging function. Further more, the gate count is reduced at 31.8%.

Efficient CAVLC Decoder VLSI Design for HD Images (HD급 영상을 효율적으로 복호하기 위한 CAVLC 복호화기 VLSI 설계)

  • Oh, Myung-Seok;Lee, Won-Jae;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.4 s.316
    • /
    • pp.51-59
    • /
    • 2007
  • In this paper, we propose an efficient hardware architecture for H.264/AVC CAVLC (Context-based Adaptive Variable Length Coding) decoding which used for baseline profile and extended profile. Previous CAVLC architectures are consisted of five step block and each block gets effective bits from Controller block and Accumulator. If large number of non-zero coefficients exist, process for getting effective bits has to iterates many times. In order to reduce this unnecessary process, we propose two techniques, which combine five steps into four steps and reduce process to get efficiency bit by skipping addition step. By adopting these two techniques, the required processing time was reduced about 26% compared with previous architectures. It was designed in a hardware description language and total logic gate count was 16.83k using 0.18um standard cell library.

Novel Reconfigurable Coprocessor for Communication Systems (통신 시스템을 위한 고성능 재구성 가능 코프로세서의 설계)

  • Jung Chul Yoon;Sunwoo Myung Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.6 s.336
    • /
    • pp.39-48
    • /
    • 2005
  • This paper proposes a reconfigurable coprocessor for communication systems, which can perform high speed computations and various functions. The proposed reconfigurable coprocessor can easily implement communication operations, such as scrambling, interleaving, convolutional encoding, Viterbi decoding, FFT, etc. The proposed architecture has been modeled by VHDL and synthesized using the SEC 0.18$\mu$m standard cell library. The gate count is about 35,000 gates and the critical path is 3.84ns. The proposed coprocessor can reduced about $33\%$ for FFT operations and complex MAC, $37\%$ for Viterbi operations, and $48\%\~84\%$ for scrambling and convolutional encoding for the IEEE 802.11a WLAN standard compared with existing DSPs. The proposed coprocessor shows Performance improvements compared with existing DSP chips for communication algorithms.