An Implementation of Low Power MAC using Improvement of Multiply/Subtract Operation Method and PTL Circuit Design Methodology

승/감산 연산방법의 개선 및 PTL회로설계 기법을 이용한 저전력 MAC의 구현

  • Sim, Gi-Hak (Dept. of Electronics Engineering, Chungbuk National University) ;
  • O, Ik-Gyun (Dept. of Electronics Engineering, Chungbuk National University) ;
  • Hong, Sang-Min (Dept. of Electronics Engineering, Chungbuk National University) ;
  • Yu, Beom-Seon (Dept. of Electronics Engineering, Chungbuk National University) ;
  • Lee, Gi-Yeong (Dept. of Electronics Engineering, Chungbuk National University) ;
  • Jo, Tae-Won (Dept. of Electronics Engineering, Chungbuk National University)
  • 심기학 (충북대학교 전자공학과) ;
  • 오익균 (충북대학교 전자공학과) ;
  • 홍상민 (충북대학교 전자공학과) ;
  • 유범선 (충북대학교 전자공학과) ;
  • 이기영 (충북대학교 전자공학과) ;
  • 조태원 (충북대학교 전자공학과)
  • Published : 2000.04.01

Abstract

An 8$\times$8+20-bit MAC is designed by applying low-power design methodologies at each level of system design. At the algorithm level, a new method for the multiply/subtract operation is proposed; in hardware it requires fewer transistors than conventional methods. At the circuit level, a new Booth selector circuit using NMOS pass-transistor logic is proposed; it is superior in power-delay product to circuits designed in CMOS. At the architecture level, an ELM adder, known to be the most efficient in power consumption, operating frequency, area, and design regularity, is adopted as the final adder. Dynamic CMOS single-edge-triggered flip-flops are used for the registers because they need fewer transistors per bit. To increase the operating frequency, a 2-stage pipeline architecture is adopted and fast 4:2 compressors are applied in the Wallace tree block. In simulation, the MAC designed in a 0.6 $\mu$m 1-poly 3-metal CMOS process operates at 200 MHz and 3.3 V while consuming 35 mW in multiply operation, and at 100 MHz while consuming 29 mW in MAC operation.
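The arithmetic that an 8$\times$8+20-bit MAC with Booth recoding performs can be sketched behaviorally as follows. This is a minimal illustration, not the circuit proposed in the paper: the function names (`booth_radix4_digits`, `wrap_signed`, `mac`) are hypothetical, radix-4 recoding is assumed, and the multiply/subtract path here simply flips the sign of every Booth digit, which is one common approach and not necessarily the new method the abstract refers to.

```python
def booth_radix4_digits(y, bits=8):
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits in {-2..2}."""
    y &= (1 << bits) - 1            # work on the two's-complement bit pattern
    digits = []
    prev = 0                        # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits                   # least-significant digit first


def wrap_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of width `bits`."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, x, y, subtract=False, acc_bits=20):
    """Return acc + x*y, or acc - x*y when `subtract` is set, wrapped to acc_bits."""
    sign = -1 if subtract else 1    # assumed scheme: negate each Booth digit
    partial_products = [
        sign * d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y))
    ]
    # A Wallace tree and final adder would sum these in hardware; sum() stands in.
    return wrap_signed(acc + sum(partial_products), acc_bits)


print(mac(100, 7, -3))                  # multiply-accumulate
print(mac(100, 7, -3, subtract=True))   # multiply-subtract
```

Each 8-bit multiplier yields four Booth digits, so only four partial products enter the summation stage, which is the motivation for Booth recoding ahead of a compressor tree.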

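An 8×8+20-bit MAC was designed by applying low-power design techniques at each stage of system design. At the algorithm level, a new technique is proposed for the hardware design of the multiply/subtract operation, one of the important MAC instructions, that reduces the number of transistors compared with conventional approaches. At the circuit level, a new Booth selector circuit built from NMOS pass-transistor logic is proposed, which outperforms a CMOS implementation of the same logic in terms of PDP (power-delay product). At the architecture level, the final-stage adder is an ELM adder, which is the best in terms of power consumption, operating speed, area, and design regularity, and the registers use dynamic CMOS single-edge-triggered flip-flops, which have a small number of transistors per bit. To raise the operating speed, a 2-stage pipeline structure was applied and fast 4:2 compressors were used in the Wallace tree block. In simulation, the MAC designed in a 0.6 μm single-poly, triple-metal CMOS process consumed 35 mW at up to 200 MHz and 3.3 V in multiply operation, and 29 mW at up to 100 MHz in MAC operation.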
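As a rough worked figure, the simulated results above can be converted into energy per operation. This assumes, which the abstract does not state, that the pipelined MAC completes one operation per clock cycle, so that $E = P / f$:

```latex
$$
E_{\text{mult}} = \frac{35\ \text{mW}}{200\ \text{MHz}} = 175\ \text{pJ},
\qquad
E_{\text{MAC}} = \frac{29\ \text{mW}}{100\ \text{MHz}} = 290\ \text{pJ}
$$
```

Under that assumption, a MAC operation costs more energy than a multiply even though its average power is lower, because it runs at half the clock frequency.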
