Simulation of YUV-Aware Instructions for High-Performance, Low-Power Embedded Video Processors  

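Kim, Cheol-Hong (School of Electronics and Computer Engineering, Chonnam National University)
Kim, Jong-Myon (School of Computer Engineering and Information Technology, University of Ulsan)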
Abstract
With the rapid development of multimedia applications and wireless communication networks, consumer demand for video-over-wireless capability on mobile computing systems is growing quickly. This paper introduces YUV-aware instructions that improve the performance and efficiency of color image and video processing. Traditional multimedia extensions (e.g., MMX, SSE, VIS, and AltiVec) rely solely on generic subword parallelism, whereas the proposed YUV-aware instructions support parallel operations on two packed 16-bit YUV values (6-bit Y, 5-bit U, 5-bit V) in a 32-bit datapath architecture, providing greater concurrency and efficiency for color image and video processing. Moreover, the smaller data format reduces system cost. Experimental results on a representative dynamically scheduled embedded superscalar processor show that YUV-aware instructions achieve an average speedup of 3.9x over the baseline superscalar performance, whereas MMX (a representative Intel multimedia extension) achieves a speedup of only 2.1x over the same baseline. YUV-aware instructions also outperform MMX in energy reduction: 75.8% with YUV-aware instructions versus 54.8% with MMX, both relative to the baseline.
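To make the packed data format concrete, the following is a minimal software sketch of two 16-bit YUV pixels (6-bit Y, 5-bit U, 5-bit V) sharing one 32-bit word, together with an emulated field-wise saturating add. The abstract does not specify the bit ordering, the instruction set, or the arithmetic semantics, so the field layout (Y in the high bits), the unsigned treatment of U and V, and every function name here are illustrative assumptions rather than the authors' design.

```python
# Assumed layout (not given in the abstract): Y in bits 15..10,
# U in bits 9..5, V in bits 4..0 of each 16-bit pixel.

def pack_yuv655(y, u, v):
    """Pack one pixel into 16 bits using the assumed Y|U|V field order."""
    assert 0 <= y < 64 and 0 <= u < 32 and 0 <= v < 32
    return (y << 10) | (u << 5) | v

def unpack_yuv655(p):
    """Split a 16-bit pixel back into its (Y, U, V) fields."""
    return (p >> 10) & 0x3F, (p >> 5) & 0x1F, p & 0x1F

def pack_pair(p0, p1):
    """Place two 16-bit pixels in one 32-bit word (p1 in the upper half)."""
    return ((p1 & 0xFFFF) << 16) | (p0 & 0xFFFF)

def unpack_pair(w):
    """Return the (lower, upper) 16-bit pixels of a 32-bit word."""
    return w & 0xFFFF, (w >> 16) & 0xFFFF

def add_sat_yuv_pair(a, b):
    """Emulate a hypothetical saturating add over all six fields of two words.

    Each field is clamped to its own maximum (63 for Y, 31 for U and V),
    so a carry never spills into a neighbouring field.
    """
    out = []
    for pa, pb in zip(unpack_pair(a), unpack_pair(b)):
        ya, ua, va = unpack_yuv655(pa)
        yb, ub, vb = unpack_yuv655(pb)
        out.append(pack_yuv655(min(ya + yb, 63),
                               min(ua + ub, 31),
                               min(va + vb, 31)))
    return pack_pair(out[0], out[1])
```

A generic 16-bit subword add would let an overflow in U carry into Y; treating the word as six independent fields is what distinguishes a format-aware operation in this sketch.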
Keywords
YUV video data processing; Multimedia instructions; Embedded superscalar processors; High-performance video processors