• Title/Summary/Keyword: inverse quantization

Search Result 49, Processing Time 0.027 seconds

Implementation of an AMBA-Based IP for H.264 Transform and Quantization (H.264 변환 및 양자화 기능을 갖는 AMBA 기반 IP 구현)

  • Lee, Seon-Young;Cho, Kyeong-Soon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.10 s.352
    • /
    • pp.126-133
    • /
    • 2006
  • This paper describes an AMBA-based IP to perform forward and inverse transform and quantization required in the H.264 video compression standard. The transform and quantization circuit was optimized for area and performance. The AHB wrapper was added to the circuit for the AMBA-based operation. The user of the IP can specify how long the bus may be occupied by the IP and also where the video data are stored in the external memory. The function of the proposed IP based on AMBA Specification was verified on the platform board with Xilinx FPGA and ARM9 processor. We fabricated an MPW chip using $0.25{\mu}m$ standard cells and observed its correct operations on silicon.

Hardware Implementation of Integer Transform and Quantization for H.264 (하드웨어 기반의 H.264 정수 변환 및 양자화 구현)

  • 임영훈;정용진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.12C
    • /
    • pp.1182-1191
    • /
    • 2003
  • In this paper, we propose a new hardware architecture for integer transform, quantizer, inverse quantizer, and inverse integer transform of a new video coding standard H.264/JVT. We describe the algorithm and derive hardware architecture emphasizing the importance of area for low cost and low power consumption. The proposed architecture has been verified by PCI-interfaced emulation board using APEX-II Alters FPGA and also by ASIC synthesis using Samsung 0.18 um CMOS cell library. The ASIC synthesis result shows that the proposed hardware can operate at 100 MHz, processing more than 1,300 QCIF video frames per second. The hardware is going to be used as a core module when implementing a complete H.264 video encoder/decoder ASIC for real-time multimedia application.

Application Specific Processor Design for H.264 Decoder with a Configurable Embedded Processor

  • Han, Jin-Ho;Lee, Mi-Young;Bae, Young-Hwan;Cho, Han-Jin
    • ETRI Journal
    • /
    • v.27 no.5
    • /
    • pp.491-496
    • /
    • 2005
  • An application specific processor for an H.264 decoder with a configurable embedded processor is designed in this research. The motion compensation, inverse integer transform, inverse quantization, and entropy decoding algorithm of H.264 decoder software are optimized. We improved the performance of the processor with instruction-level hardware optimization, which is tailored to configurable embedded processor architecture. The optimized instructions for video processing can be used in other video compression standards such as MPEG 1, 2, and 4. A significant performance improvement is achieved with high flexibility. Experimental results show that we could achieve 300% performance for the H.264 baseline profile level 2 decoder.

  • PDF

Design of Parallel Inverse Quantization and Inverse Transform Architecture for High Performance H.264/AVC Decoder (고성능 H.264/AVC 복호기를 위한 병렬 역양자화 및 역변환 구조 설계)

  • Jung, Hong-Kyun;Ryoo, Kwang-Ki
    • Proceedings of the KAIS Fall Conference
    • /
    • 2011.12b
    • /
    • pp.434-437
    • /
    • 2011
  • 본 논문에서는 H.264/AVC 복호기의 성능을 향상시키기 위해 병렬 역양자화 구조와 역변환 구조를 제안한다. 제안하는 역양자화 구조는 공통 연산기를 사용하여 계산 복잡도를 감소시키고, 4개의 공통연산기를 사용하여 역양자화 수행 사이클 수를 1 사이클로 감소시킨다. 제안하는 역변환 구조는 4개의 변환 연산기를 사용하여 역변환 연산을 수행하는데 2 사이클이 소요된다. 또한 제안하는 구조는 역양자화 연산과 수평 역변환 연산을 동시에 수행하는 병렬 구조를 채택하여 역양자화 및 역변환 수행 사이클 수를 2 사이클로 감소시킨다. 제안하는 구조를 Magnachip 0.18um CMOS 공정 라이브러리를 이용하여 합성한 결과 1.5MHz의 동작 주파수에서 게이트 수는 14,173이고, 표준 참조 소프트웨어 JM 9.4에서 추출한 데이터를 이용하여 성능을 측정한 결과 제안하는 구조의 수행 사이클 수가 기존 구조 대비 38.74% 향상되었다.

  • PDF

A Novel Reconfigurable Processor Using Dynamically Partitioned SIMD for Multimedia Applications

  • Lyuh, Chun-Gi;Suk, Jung-Hee;Chun, Ik-Jae;Roh, Tae-Moon
    • ETRI Journal
    • /
    • v.31 no.6
    • /
    • pp.709-716
    • /
    • 2009
  • In this paper, we propose a novel reconfigurable processor using dynamically partitioned single-instruction multiple-data (DP-SIMD) which is able to process multimedia data. The SIMD processor and parallel SIMD (P-SIMD) processor, which is composed of a number of SIMD processors, are usually used these days. But these processors are inefficient because all processing units (PUs) should process the same operations all the time. Moreover, the PUs can process different operations only when every SIMD group operation is predefined. We propose a processor control method which can partition parallel processors into multiple SIMD-based processors dynamically to enhance efficiency. For performance evaluation of the proposed method, we carried out the inverse transform, inverse quantization, and motion compensation operations of H.264 using processors based on SIMD, P-SIMD, and DP-SIMD. Experimental results show that the DP-SIMD control method is more efficient than SIMD and P-SIMD control methods by about 15% and 14%, respectively.

Implementation of IQ/IDCT in H.264/AVC Decoder Using GPGPU (GPGPU를 이용한 H.264/AVC 디코더)

  • Kim, Dong-Han;Lee, Kwang-Yeob
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2010.05a
    • /
    • pp.162-164
    • /
    • 2010
  • H.264/AVC(Advanced Video Coding) is a standard for video compression. H.264/AVC provides good video quality at substantially lower bit rates than previous standards. In this papers, we propose the efficient architecture of H.264/AVC decoder using GPGPU. GPGPU can process many of operation in parallel. IQ/IDCT is possible that parallel processing in H.264/AVC decoding algorithm.

  • PDF

Bitrate Scalable Video Coder (비트율 계위 비디오 부호기)

  • 임범렬;임성호;민병의;황승구;황재정
    • Journal of Broadcast Engineering
    • /
    • v.2 no.2
    • /
    • pp.206-215
    • /
    • 1997
  • We pror.a;e a H.263-based video ceder with two-layer ocalability. The bare layer is ceded by using the default H.263 axling algorithms to achieve high compresred video data and the enhanced layer is axied by enhanced axling such as HVS-based quantization updating. The enhanced layer contains only arled refinement data for the OCT coefficients of the bare layer. Bitstream syntax and semantics for enhancement layer are designed and quantizer design using the HVS is pror.ooed. Data from the two layers are combined after inverse quantization and inverse OCT prcx:esses in the deaxier. We show with e~rirrental results that the pror.a;ed layered arlee achieves comparable picture quality with non-layered arlee at bitrates of 30 kbr;s or less. Overhead information for the bitstream layer can 00 limited to less than 0.5 kbits/frame.

  • PDF

Parallel Inverse Transform and Small-sized Inverse Quantization Architectures Design of H.264/AVC Decoder (H.264/AVC 복호기의 병렬 역변환 구조 및 저면적 역양자화 구조 설계)

  • Jung, Hong-Kyun;Cha, Ki-Jong;Park, Seung-Yong;Kim, Jin-Young;Ryoo, Kwang-Ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2011.10a
    • /
    • pp.444-447
    • /
    • 2011
  • In this paper, parallel IT(inverse transform) architecture and IQ(inverse quantization) architecture with common operation unit for the H.264/AVC decoder are proposed. By using common operation unit, the area cost and computational complexity of IQ are reduced. In order to take four execution cycles to perform IT, the proposed IT architecture has parallel architecture with one horizontal DCT unit and four vertical DCT units. Furthermore, the execution cycles of the proposed architecture is reduced to five cycles by applying two state pipeline architecture. The proposed architecture is implemented to a single chip by using Magnachip 0.18um CMOS technology. The gate count of the proposed architecture is 14.3k at clock frequency of 13MHz and the area of proposed IQ is reduced 39.6% compared with the previous one. The experimental result shows that execution cycle the proposed architecture is about 49.09% higher than that of the previous one.

  • PDF

NDFT-based Image Steganographic Scheme with Discrimination of Tampers

  • Wang, Hongxia;Fan, Mingquan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.12
    • /
    • pp.2340-2354
    • /
    • 2011
  • A new and secure image steganographic scheme based on nonuniform discrete Fourier transform (NDFT) is proposed in this paper. First, the chaotic system is introduced to select embedding points randomly in NDFT domain suitable range, and NDFT is implemented on every non-overlapping block of eight consecutive pixels. Second, the secret messages are scrambled by chaotic systems, and embedded into frequency coefficients by quantization method. The stego-image is obtained by inverse NDFT (INDFT). Besides, in order to discriminate tampers, the low frequency wavelet coefficients of 7 most significant bits (MSBs) of the stego-image are converted into the binary sequence after nonuniform scalar quantization. Then the obtained binary sequence is scrambled by the chaotic systems, and embedded into the least significant bit (LSB) of the stego-image. Finally, the watermarked stego-image can be obtained by a new improved LSB steganographic method. The embedded secret messages can be extracted from the watermarked stego-image without the original cover image. Experimental results show the validity of the proposed scheme, and dual statistics attacks are also conducted to indicate the security.

An Efficient Architecture of Transform & Quantization Module in MPEG-4 Video Code (MPEG-4 영상코덱에서 DCTQ module의 효율적인 구조)

  • 서기범;윤동원
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.40 no.11
    • /
    • pp.29-36
    • /
    • 2003
  • In this paper, an efficient VLSI architecture for DCTQ module, which consists of 2D-DCT, quantization, AC/DC prediction block, scan conversion, inverse quantization and 2D-IDCT, is presented. The architecture of the module is designed to handle a macroblock data within 1064 cycles and suitable for MPEG-4 video codec handling 30 frame CIF image for both encoder and decoder simultaneously. Only single 1-D DCT/IDCT cores are used for the design instead of 2-D DCT/IDCT, respectively. 1-bit serial distributed arithmetic architecture is adopted for 1-D DCT/IDCT to reduce the hardware area in this architecture. To reduce the power consumption of DCTQ modu1e, we propose the method not to operate the DCTQ modu1e exploiting the SAE(sum of absolute error) value from motion estimation and cbp(coded block pattern). To reduce the AC/DC prediction memory size, the memory architecture and memory access method for AC/DC prediction block is proposed. As the result, the maximum utilization of hardware can be achieved, and power consumption can be minimized. The proposed design is operated on 27MHz clock. The experimental results show that the accuracy of DCT and IDCT meet the IEEE specification.