2-D DCT/IDCT Processor Design Reducing Adders in DA Architecture

  • Jeong Dong-Yun (Magnachip Semiconductor, DDI Team 3)
  • Seo Hae-Jun (School of Electrical, Electronics and Computer Engineering, Chungbuk National University)
  • Bae Hyeon-Deok (School of Electrical, Electronics and Computer Engineering, Chungbuk National University)
  • Cho Tae-Won (School of Electrical, Electronics and Computer Engineering, Chungbuk National University)
  • Published : 2006.03.01

Abstract

This paper presents an 8x8 two-dimensional DCT (Discrete Cosine Transform)/IDCT (Inverse DCT) processor built on an adder-based distributed arithmetic (DA) architecture that uses no ROM units or other conventional memories. To reduce hardware cost, the odd part of the DCT and IDCT coefficient matrices is shared. The proposed architecture uses only 29 adders for the coefficient operations of the 2-D DCT/IDCT processor, while the 1-D DCT processor uses 18 adders for its coefficient operations. This is 48.6% fewer adders than in the 8x8 1-D DCT NEDA (NEw DA) architecture. The paper also proposes a new transpose network that differs from the conventional transpose memory block: to reduce hardware, the memory block is replaced by a register-based block. The proposed transpose network block uses 64 registers and requires 18% fewer transistors than a conventional memory-based transpose architecture. Finally, to improve throughput, the processor is designed as a bit-parallel architecture that accepts eight pixels in every clock cycle and produces eight pixels at its outputs.
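
The abstract does not spell out the coefficient matrices, so the following is only an illustrative sketch under stated assumptions: it uses the orthonormal 8-point DCT-II in floating point rather than the paper's adder-based DA hardware, and all names (`dct_1d`, `idct_1d`, `EVEN`, `ODD`, `dct_2d`, `idct_2d`) are hypothetical. It shows one plausible reason the odd part can be shared: after the usual even/odd split, the 4x4 odd-part matrix is symmetric, so the forward and inverse transforms can use the same matrix. It also shows the row-column scheme in which a transpose step sits between two 1-D passes, each handling eight pixels at a time.

```python
import math

N = 8

def dct_matrix():
    """Orthonormal 8-point DCT-II matrix: C[k][n] = a(k) * cos((2n+1) k pi / 16)."""
    rows = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        rows.append([scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                     for n in range(N)])
    return rows

C = dct_matrix()
# Even rows (k = 0, 2, 4, 6) act on x[n] + x[7-n]; odd rows (k = 1, 3, 5, 7)
# act on x[n] - x[7-n]. Only the first four columns are needed in each case.
EVEN = [[C[k][n] for n in range(4)] for k in (0, 2, 4, 6)]
ODD = [[C[k][n] for n in range(4)] for k in (1, 3, 5, 7)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct_1d(x):
    s = [x[n] + x[7 - n] for n in range(4)]  # input butterfly, sums
    d = [x[n] - x[7 - n] for n in range(4)]  # input butterfly, differences
    out = [0.0] * N
    out[0::2] = matvec(EVEN, s)
    out[1::2] = matvec(ODD, d)
    return out

def idct_1d(X):
    e = matvec(transpose(EVEN), X[0::2])
    o = matvec(ODD, X[1::2])  # ODD is symmetric, so the forward matrix is reused
    # Output butterfly: x[n] = e[n] + o[n], x[7-n] = e[n] - o[n].
    return [e[n] + o[n] for n in range(4)] + [e[3 - n] - o[3 - n] for n in range(4)]

def dct_2d(block):
    rows = [dct_1d(r) for r in block]            # first 1-D pass, one row of 8 pixels per step
    cols = [dct_1d(r) for r in transpose(rows)]  # transpose, then second 1-D pass
    return transpose(cols)

def idct_2d(coef):
    rows = [idct_1d(r) for r in coef]
    cols = [idct_1d(r) for r in transpose(rows)]
    return transpose(cols)
```

In this sketch the `transpose` call plays the role that a transpose memory or register network plays in hardware; it says nothing about how the paper's 64-register network is actually organized.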
