An Efficient Architecture of MBA-based Parallel MAC for High-Speed Digital Signal Processing

  • 서영호 (Kwangwoon University, Digital Design & Test Lab.)
  • 김동욱 (Kwangwoon University, Digital Design & Test Lab.)
  • Published: 2004.07.01

Abstract

This paper proposes a new MAC (multiplier-accumulator) architecture for high-speed multiply-accumulate operations. Partial products are generated with the radix-4 modified Booth algorithm (MBA) using a 1's complement representation, and they are summed with a carry-save adder (CSA) tree. Within the CSA tree, the 1's complement values produced by Booth encoding are corrected to 2's complement, and the sum and carry from the previous operation are accumulated in the same tree, which allows fast accumulation. In addition, the lower bits of the partial-product sum are computed with 2-bit carry look-ahead adders (CLAs), which reduces the input bit width of the final adder and shortens the overall critical path. The proposed MAC was applied to DWT (discrete wavelet transform) filtering for JPEG2000, showing that it is suitable for high-speed digital signal processing in practice. A comparison with a previous architecture showed improved performance in terms of hardware resources and delay.
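The data flow summarized in the abstract can be illustrated with a small bit-level behavioral model. The sketch below is not the authors' circuit. It assumes 8-bit signed operands and a 20-bit accumulator. It chains 3:2 carry-save stages sequentially instead of arranging them as a tree, and it leaves out the 2-bit CLA stage for the lower bits. All function names are hypothetical. It shows only three points: negative Booth digits yield a 1's complement row, a separate correction word restores the missing +1, and the accumulator stays in (sum, carry) form until one final addition.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement multiplier (n even) into n/2 digits in {-2..2}."""
    y &= (1 << n) - 1
    digits, prev = [], 0              # prev models the implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def partial_product(x, d, i, width):
    """Return (row, correction) for Booth digit d at radix-4 position i.

    A negative digit is realized as the 1's complement of |d|*x; the +1 that
    would complete the 2's complement is returned separately as a correction bit.
    """
    mask = (1 << width) - 1
    mag = (abs(d) * x) & mask
    if d < 0:
        return ((~mag & mask) << (2 * i)) & mask, 1 << (2 * i)
    return (mag << (2 * i)) & mask, 0


def csa(a, b, c, width):
    """3:2 carry-save compressor: a + b + c == s + carry (mod 2**width)."""
    mask = (1 << width) - 1
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return s & mask, carry & mask


def mac_step(x, y, acc, n, width):
    """Add x*y to an accumulator kept in carry-save form (sum, carry)."""
    s, c = acc
    correction = 0
    for i, d in enumerate(booth_radix4_digits(y, n)):
        row, neg = partial_product(x, d, i, width)
        correction |= neg             # correction bits sit at distinct positions
        s, c = csa(s, c, row, width)
    # One extra CSA stage folds in all the +1 corrections at once.
    return csa(s, c, correction, width)


def resolve(acc, width):
    """Final carry-propagate addition; returns a signed Python integer."""
    total = (acc[0] + acc[1]) & ((1 << width) - 1)
    return total - (1 << width) if total >> (width - 1) else total


# Usage: accumulate a few signed 8-bit products and compare with plain arithmetic.
pairs = [(3, -5), (-7, 6), (127, -128), (-128, -128)]
acc = (0, 0)
for x, y in pairs:
    acc = mac_step(x, y, acc, n=8, width=20)
assert resolve(acc, 20) == sum(x * y for x, y in pairs)
```

Because the accumulator stays in redundant (sum, carry) form between operations, the model performs a carry-propagate addition only once, in `resolve`. This matches the abstract's point that accumulation is folded into the partial-product addition rather than done by a separate adder per operation.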
