The Motion Estimator Implementation with Efficient Structure for Full Search Algorithm of Variable Block Size

다양한 블록 크기의 전역 탐색 알고리즘을 위한 효율적인 구조를 갖는 움직임 추정기 설계

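  • 황종희 (Yonsei University, Department of Electrical and Electronic Engineering)
  • 최윤식 (Yonsei University, Department of Electrical and Electronic Engineering)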
  • Published : 2009.11.25

Abstract

Motion estimation accounts for the largest share of the workload in a video encoding system, so real-time operation requires a motion estimator with an efficient structure. For a complete H.264 system, it is therefore desirable to implement the motion estimator as a dedicated hardware module that performs the encoding process at high speed. This paper proposes a motion estimation detection (MED) block, a block that calculates 41 SADs (sums of absolute differences), and a block for minimum SAD calculation and motion vector generation, all based on parallel processing, which effectively reduces the amount of computation. The minimum SAD calculation block and the MED block use a pre-computation technique to reduce the switching activity of the input signals, which enables high-speed operation. The MED block and the 41-SAD calculation block consist largely of adder trees, which create a critical-path problem. The adder tree therefore uses a carry skip adder (CSA) in place of the commonly used ripple carry adder (RCA), allowing it to operate at high speed. In addition, key variables such as the search-range control signal can be set easily from outside the module, which increases the efficiency of the hardware structure. Simulation and FPGA verification results show that the delay of the MED block, which forms the critical path of the motion estimator, is reduced by about 19.89% compared with the conventional structure.
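
As a rough software illustration of the full-search flow described above, the sketch below computes sixteen 4x4 SADs per candidate position, merges them by addition into the 41 SADs of the H.264 partition sizes (16 + 8 + 8 + 4 + 2 + 2 + 1), and keeps the minimum SAD and motion vector per partition. It is a behavioral model only, not the paper's hardware design: the function names, the list-based data layout, and the helper `make_test_pair` are assumptions made for this example.

```python
import random

MB = 16  # macroblock width and height in pixels


def sad_4x4_grid(cur, ref, dx, dy):
    """SADs of the sixteen 4x4 sub-blocks for the candidate at window offset (dx, dy)."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(MB):
        for x in range(MB):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y + dy][x + dx])
    return grid


def merge_41(g):
    """Derive all 41 partition SADs from the 4x4 grid using additions only."""
    sads = {}
    sads["4x4"] = [g[r][c] for r in range(4) for c in range(4)]            # 16
    sads["8x4"] = [g[r][c] + g[r][c + 1]
                   for r in range(4) for c in (0, 2)]                       # 8
    sads["4x8"] = [g[r][c] + g[r + 1][c]
                   for r in (0, 2) for c in range(4)]                       # 8
    sads["8x8"] = [g[r][c] + g[r][c + 1] + g[r + 1][c] + g[r + 1][c + 1]
                   for r in (0, 2) for c in (0, 2)]                         # 4
    tl, tr, bl, br = sads["8x8"]
    sads["16x8"] = [tl + tr, bl + br]                                       # 2
    sads["8x16"] = [tl + bl, tr + br]                                       # 2
    sads["16x16"] = [tl + tr + bl + br]                                     # 1
    return sads


def full_search(cur, ref, search_range):
    """Exhaustive search over [-R, R] in x and y.

    ref is a (16 + 2R) x (16 + 2R) search window. Returns a dict mapping
    (partition_name, index) to (min_sad, (mvx, mvy)).
    """
    best = {}
    for dy in range(2 * search_range + 1):
        for dx in range(2 * search_range + 1):
            mv = (dx - search_range, dy - search_range)
            merged = merge_41(sad_4x4_grid(cur, ref, dx, dy))
            for name, values in merged.items():
                for i, sad in enumerate(values):
                    key = (name, i)
                    if key not in best or sad < best[key][0]:
                        best[key] = (sad, mv)
    return best


def make_test_pair(search_range, true_mv, seed=0):
    """Hypothetical helper: random search window plus a macroblock cut from it."""
    rng = random.Random(seed)
    size = MB + 2 * search_range
    ref = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
    ox, oy = true_mv[0] + search_range, true_mv[1] + search_range
    cur = [[ref[y + oy][x + ox] for x in range(MB)] for y in range(MB)]
    return cur, ref
```

Calling `full_search(*make_test_pair(4, (3, -2)), 4)` should recover the displacement `(3, -2)` for the 16x16 partition, since the current block is copied from that position in the window. In hardware the additions in `merge_41` correspond to the adder-tree stage, which is where the choice of adder affects the critical path.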

움직임 추정은 영상 부호화 시스템에서 큰 비중을 차지하는 부분으로, 실시간 동작을 위해서는 효율적인 구조를 필요로 한다. 따라서 H.264 전체 시스템을 위한 움직임 추정기 블록의 구현은 부호화 과정을 고속으로 수행할 수 있도록 별도의 전용 하드웨어 모듈로 설계하는 것이 바람직하다. 본 논문에서는 많은 연산량을 효율적으로 줄일 수 있도록 병렬 처리를 바탕으로 움직임 추정 감지 블록, 41개의 SAD(Sum of Absolute Difference)값 계산 블록, 최소의 SAD값 계산과 움직임 벡터 생성 블록을 제안하고자 한다. 움직임 추정 감지 블록과 최소의 SAD값 계산기에서는 선계산(pre-computation) 방법을 적용함으로써, 입력 Switching Activity를 줄여 고속 구현이 가능하도록 하였으며, 움직임 추정 감지 블록과 41개의 SAD값 계산 블록에서 가장 많은 부분을 차지하는 가산기 구조를 일반적으로 사용되는 Ripple Carry Adder 대신에 Carry Skip Adder를 적용함으로써, Adder Tree 구조를 고속으로 처리할 수 있도록 하였다. 또한 외부에서 탐색 영역 제어와 같은 주요 변수를 쉽게 제어할 수 있도록 하여, 하드웨어 구조의 효율성을 높였다. 시뮬레이션 및 FPGA 검증 결과, 움직임 추정기의 임계 경로를 발생시키는 MED블록에서 일반적인 구조를 적용했을 때보다 19.89%의 Delay 감소 효과를 얻을 수 있었다.
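
The carry skip adder mentioned in the abstract can be sketched as a bit-level behavioral model. The block below is an assumption-laden illustration rather than the paper's circuit: the word width, the block size of 4 bits, and the function names are chosen for this example only. Each block ripples its carry internally, and when every bit position in the block propagates (a XOR b = 1), the incoming carry bypasses the block instead of rippling through it.

```python
def ripple_block(a_bits, b_bits, carry_in):
    """Ripple carry addition over one block of bits (least significant first)."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_skip_add(a, b, width=16, block=4):
    """Add two unsigned integers with a block-wise carry skip scheme.

    Returns (sum modulo 2**width, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carry = 0
    out_bits = []
    for lo in range(0, width, block):
        blk_a = a_bits[lo:lo + block]
        blk_b = b_bits[lo:lo + block]
        sum_bits, ripple_carry = ripple_block(blk_a, blk_b, carry)
        # Block propagate: every bit pair differs, so the carry passes through.
        propagate = all(x ^ y for x, y in zip(blk_a, blk_b))
        # The skip path forwards the incoming carry without waiting for the ripple.
        carry = carry if propagate else ripple_carry
        out_bits.extend(sum_bits)
    total = sum(bit << i for i, bit in enumerate(out_bits))
    return total, carry
```

The model produces the same sums as ordinary addition; the benefit in hardware is timing, because the skip path shortens the worst-case carry chain. The software model does not reproduce that delay behavior or the 19.89% figure reported above.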
