Low-power/high-speed DCT structure using common sub-expression sharing

  • Published: 2004.01.01

Abstract

This paper proposes a low-power, multiplier-free 8-point DCT structure that uses only add and shift operations. To enable high-speed processing while minimizing hardware size, the structure completes an 8-point DCT in 4 cycles. Because all columns of the DCT coefficient matrix share the same coefficients except for sign, the coefficient hardware used in the first cycle is reused in the second through fourth cycles. Conventional DCT structures built only from add and shift operations use coefficients in CSD (canonic signed digit) form to reduce the number of adders. To reduce the number of adders further, this paper adopts common sub-expression sharing. For the 8-point DCT, the proposed structure achieves a 19.5% reduction in the number of additions compared with a structure that uses CSD coefficients alone.
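The two ideas in the abstract, CSD coefficients and sharing of common sub-expressions, can be illustrated with a small Python sketch. It is not the paper's design: the 12-bit fractional word length (`BITS`), all function names, and the greedy search that shares only the single most frequent two-term pattern are assumptions made for illustration, so the adder counts it prints should not be expected to reproduce the 19.5% figure. The last helper checks numerically that every column of the 8-point DCT cosine matrix contains the same coefficient magnitudes, which is the property that allows the coefficient hardware to be reused across cycles.

```python
import math

def to_csd(n):
    """CSD (non-adjacent form) digits of integer n, least significant first."""
    digits = []
    while n != 0:
        d = 0
        if n & 1:
            d = 2 - (n & 3)      # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        digits.append(d)
        n >>= 1
    return digits

def csd_adders(csd_list):
    """Adders needed when each coefficient is built separately from shifts of x."""
    return sum(max(sum(1 for d in c if d) - 1, 0) for c in csd_list)

def count_pattern(csd, gap, sign):
    """Greedy count of non-overlapping nonzero digit pairs `gap` apart
    whose product equals `sign` (+1: same sign, -1: opposite sign)."""
    used, hits = set(), 0
    for i in range(len(csd) - gap):
        j = i + gap
        if i in used or j in used:
            continue
        if csd[i] * csd[j] == sign:
            used.update((i, j))
            hits += 1
    return hits

def shared_adders(csd_list):
    """Adder count after sharing the single most frequent two-term pattern.
    Simplified illustration, not the search used in the paper."""
    base = csd_adders(csd_list)
    max_len = max(len(c) for c in csd_list)
    best = 0
    for gap in range(2, max_len):        # CSD has no adjacent nonzero digits
        for sign in (1, -1):
            best = max(best, sum(count_pattern(c, gap, sign) for c in csd_list))
    # One adder builds the pattern; each occurrence then saves one adder.
    return base - max(best - 1, 0)

def column_magnitudes(n, N=8):
    """Sorted |cos((2n+1) k pi / (2N))| for k = 1..N-1 (column n of the DCT matrix)."""
    return sorted(round(abs(math.cos((2 * n + 1) * k * math.pi / (2 * N))), 9)
                  for k in range(1, N))

BITS = 12  # assumed fractional word length; the abstract does not state one
coeffs = [round(math.cos(k * math.pi / 16) * (1 << BITS)) for k in range(1, 8)]
csd_list = [to_csd(c) for c in coeffs]

print("adders, CSD only:", csd_adders(csd_list))
print("adders, one shared pattern:", shared_adders(csd_list))
print("columns share magnitudes:",
      all(column_magnitudes(n) == column_magnitudes(0) for n in range(8)))
```

Sharing a pattern costs one adder to build it once, and each place it recurs then needs one adder fewer, so the saving grows with the number of occurrences across the coefficient set. A fuller search would share several patterns and nested sub-expressions rather than just one.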
