# A Low Power 16-Bit RISC Microprocessor Using ECRL Circuits

Youngjoon Shin, Chanho Lee, and Yong Moon

This paper presents a low power 16-bit adiabatic reduced instruction set computer (RISC) microprocessor with efficient charge recovery logic (ECRL) registers. The processor consists of registers, a control block, a register file, a program counter, and an arithmetic and logic unit (ALU). Adiabatic circuits based on ECRL are designed using a 0.35 µm CMOS technology. An adiabatic latch based on ECRL is proposed for signal interfaces for the first time, and an efficient four-phase supply clock generator is designed to provide power for the adiabatic processor. A static CMOS processor with the same architecture is designed to compare the energy consumption of adiabatic and non-adiabatic microprocessors. Simulation results show that the power consumption of the adiabatic microprocessor is about one-third of that of the static CMOS microprocessor.

Keywords: Adiabatic circuit, low energy, low power, ECRL, microprocessor, latch, ALU.

## I. Introduction

The development of CMOS technology provides high density and high performance to integrated circuits. As the density of an integrated circuit increases, its power consumption increases and its temperature becomes difficult to control. Moreover, mobile devices require high performance, light weight, and long operation time, which are conflicting requirements. Adiabatic computing is an attractive approach from this viewpoint. In recent years, studies on adiabatic computing for low power systems have grown, and several adiabatic logic families have been proposed [1]-[7]. Efficient charge recovery logic (ECRL) is one of the adiabatic logic families and is useful for low energy systems [8]. An energy-recovery method based on the adiabatic technique uses an AC power supply, and an efficient supply clock generator is essential for the design of a low power system using adiabatic circuits.

Most previous works on adiabatic circuits have focused on building blocks such as adders and multipliers [6], [9] and have compared their energy consumption. The study of adiabatic circuits should not be limited to designing building blocks. It is essential to design a complex system such as a microprocessor to demonstrate the feasibility of adiabatic circuits. In order to implement a microprocessor, large macro blocks [10], random logic, and storage elements are necessary as building blocks. In particular, registers using adiabatic circuits have not yet been reported.

We design a 16-bit ECRL reduced instruction set computer (RISC) microprocessor including registers based on the adiabatic concept. The ECRL microprocessor consists of an arithmetic and logic unit (ALU), control block, register file, and a four-phase supply clock generator. It is designed using a 0.35 µm CMOS technology, and SPICE simulations are carried out on it using a layout extracted net-list.

Manuscript received Feb. 13, 2004; revised June 21, 2004.

This work was supported by the Soongsil University Research Fund.

Youngjoon Shin (email: syj2079@korea.com), Chanho Lee (phone: 02-820-0710, email: chlee@ssu.ac.kr), and Yong Moon (email: moony@ssu.ac.kr) are with the Department of Electronic Engineering, Soongsil University, Seoul, Korea.

The rest of this paper is organized as follows. Section II describes the structure of the basic ECRL circuit and ECRL latches. Section III shows the designed adiabatic microprocessor and structure of each macro block. Section IV describes the supply clock generator for the adiabatic microprocessor. Section V shows the simulation results of the operation and an energy comparison of the adiabatic microprocessor. Finally, section VI contains concluding remarks.

## II. ECRL Latch

The ECRL-type adiabatic logic family [8] is used in this paper and AC-type supply clocks are needed to supply energy. The adiabatic circuits require four-phase sinusoidal clocks for cascading logic stages. ECRL adiabatic circuits use a differential signal scheme. An ECRL inverter (buffer) is shown in Fig. 1.
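The quadrature relationship between the four supply clocks can be sketched numerically. The snippet below is only an illustration, assuming ideal offset sinusoids that swing between 0 V and a peak voltage; the function name, the 3.3 V peak, and the choice that each phase lags the previous one by a quarter period are assumptions, not values taken from the design.

```python
import math

def four_phase_clocks(t, freq_hz, v_peak):
    """Return the four supply-clock voltages [phi0, phi1, phi2, phi3] at time t.

    Assumption: ideal offset sinusoids between 0 and v_peak, each phase
    lagging the previous one by a quarter period (90 degrees).
    """
    voltages = []
    for k in range(4):
        angle = 2 * math.pi * freq_hz * t - k * math.pi / 2
        # (1 - cos) / 2 maps the sinusoid onto the range 0..1
        voltages.append(0.5 * v_peak * (1 - math.cos(angle)))
    return voltages

# Hypothetical operating point: 100 MHz clocks with a 3.3 V peak
phi = four_phase_clocks(0.0, 100e6, 3.3)
```

At `t = 0` the first phase is at its minimum while the third phase is at its peak, which is the half-period offset that lets a stage two positions downstream hold its value while the first stage is reset.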



Fig. 1. The basic structure of the ECRL circuit.

The ECRL circuits are operated in a pipelined style with the four-phase supply clocks. When the output is directly connected to the input of the next stage (which is combinational logic), a single phase is enough for a logic value to propagate. However, when the output of a gate is fed back to the input, the supply clocks should be in phase. A latch is one of the simplest circuits that has a feedback path. The input signals propagate to the next stage in a single phase, and the input values are safely stored over four phases (one clock cycle). Figure 2 shows the basic structure of the proposed latch. The thick lines indicate two complementary signals. The forward NAND gate A1 can be replaced by a simple inverter when the reset function is not needed. When a DC input is applied, the differential sinusoidal waveforms can be obtained by propagating it through an additional buffer. The signals from the switching transistors, M1 and M2, drive the next stage through the NAND gate powered by $\varphi_1$. The buffer I1 for storing the signal is powered by $\varphi_3$, and the phase is in accordance with A1.

Fig. 2. The schematic of an ECRL latch.

The buffer (I1) in the feedback path performs the 'Precharge and Evaluation' in the 'Recovery' phase of the NAND gate (A1) powered by $\varphi_1$. The waveform of the node 'nodein' changes abruptly near the beginning and end of the waveform and oscillates according to the phase of $\varphi_3$. Figure 3 shows the waveform of the latch operation in which the input value is stored according to the 'enable' signal.



Fig. 3. The SPICE simulation result of the ECRL latch.

The energy consumption of an $8 \times 16$-bit shift register file with a power supply clock generator is compared with that of CMOS true single-phase clocked (TSPC) logic shift registers with the same configuration. Figure 4 shows the energy consumption of the shift registers when the flipping periods of the input values are 10, 20, 40, and 80 ns, and when the 'enable' signal is always '1'. The operation clock frequency is 100 MHz. Both CMOS and ECRL circuits consume less energy as the input flipping period is increased. The energy consumption of the ECRL registers is about 52.1 to 63.7% of that of the CMOS registers, and that of the power supply clock does not change significantly. Practical circuits include a number of combinational logic gates and memory elements, but only one power supply clock generator is necessary. Therefore, the energy consumption of the power supply clock generator is not significant in practical circuits, and the efficiency of energy consumption is increased. If a system is large enough, the energy consumption of ECRL circuits is expected to fall below one-third of that of CMOS circuits.

Fig. 4. The energy consumption of an $8 \times 16$-bit shift register file when the 'enable' signal is always on.

Fig. 5. The structure of an adiabatic microprocessor.

## III. Structure and Design of the Adiabatic Microprocessor

The proposed microprocessor consists of registers, a control unit, a program counter (PC), a register file, a shifter, and an ALU [12]. The block diagram of the proposed microprocessor is shown in Fig. 5.

Instructions and data are transferred to the processor through ECRL registers. The control block receives instruction codes from an ECRL register and generates suitable control signals with the matching operation phases. The PC value is determined by an auto incrementer or by a value in the register file.

The ALU consists of a 16-bit Brent-Kung adder, a 16-bit logical unit, a barrel shifter, and a multiplexer. Figure 6 shows the block diagram of the ALU [6], [10]. The Brent-Kung adder block performs addition and subtraction, and the logical unit executes logic operations on the 16-bit data A and B. The 16-bit barrel shifter is designed to be suitable for a pipelining structure. The shifter consists of four binary-weighted sub-blocks. The final result (Z) is determined by a 3-to-1 multiplexer with select signals from the control unit.

Fig. 6. The structure of the ALU and shifter.
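A shifter built from four binary-weighted sub-blocks can be modeled behaviorally as four cascaded stages that shift by 1, 2, 4, and 8 positions, each enabled by one bit of the shift amount. The sketch below assumes a logical left shift and a 4-bit shift amount; the shift direction, the stage order, and the function name are assumptions, since the text only states that the sub-blocks are binary weighted.

```python
def barrel_shift_left(value, shamt, width=16):
    """Logical left shift modeled as four binary-weighted stages (1, 2, 4, 8).

    Assumption: bit k of `shamt` enables the stage that shifts by 2**k.
    """
    mask = (1 << width) - 1
    value &= mask
    for stage in range(4):
        step = 1 << stage                 # this stage shifts by 1, 2, 4, or 8
        if (shamt >> stage) & 1:          # stage is active only if its select bit is set
            value = (value << step) & mask
    return value
```

Because every stage either passes its input unchanged or shifts it by a fixed amount, each stage maps naturally onto one pipeline phase, which is one plausible reading of why the structure suits the four-phase operation described above.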

The PC block is composed of a 16-bit register based on ECRL latches, an auto incrementer, and an adder. Figure 7 shows the block diagram of the PC block. The adder unit adds the past value of the PC value and the second output "RD2" of the register file. Several instructions such as "BREAK" and "BRANCH" change the three control signals, brsrc, jr\_sel and bgez\_sel, which select the PC value. The PC value is determined among the output of the incrementer, the adder, "RD1" of the register file, and "C000," the address for an exception handler.

Fig. 7. The structure of the program counter.

For the adiabatic microprocessor, a 3-read, 1-write multiport register file is designed. Adiabatic techniques are applied to the register file to reduce energy, except in the storage cells. The storage cell array is designed using the 6-T SRAM cell [10]. The 6-T cell structure has advantages in its high operation speed and low power dissipation. The basic structure of a multi-port register file is shown in Fig. 8. Cross-coupled PMOS circuits are used as sensing circuits to recover the charges in the bit-lines. The write operation needs two phases, and the read operation takes three. Since the read operation requires a one-phase delay from the completion of the write operation, reading updated data in the same cycle is possible in the proposed register file.

## IV. Design of Supply Clock Generator

An AC-type supply is necessary to return the delivered energy back to the supply efficiently [6], [11]. We use an LC resonant circuit to generate four-phase supply clocks. The supply clock generator consists of one inductor, two capacitors, and MOS switches. The structure of the supply clock generator is shown in Fig. 9. The synchronization of two LC oscillator-type supply clock circuits is controlled by four external clocks (CK0~CK3).

The drivability of the supply clock circuit is controlled by the size of the switch transistors. We use binary-weighted switching transistors as shown in Fig. 10. 'MOS width control' signals select any combination of the four driving trees, and the driving capability is determined according to the widths of the corresponding switching transistors. The widths of switching transistors, M1, M2, M3, and M4 are 5, 10, 20, and 40  $\mu$ m, respectively. These binary weighted widths of the transistors supply the proper current according to the operation frequency. The switching transistors are used to maintain the swing of supply clocks by compensating the energy loss.
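The effect of the 'MOS width control' signals can be illustrated with a small calculation over the four widths given above. The sketch assumes a 4-bit control word in which bit 0 enables M1 and bit 3 enables M4; that bit assignment and the helper names are assumptions made for illustration only.

```python
SWITCH_WIDTHS_UM = (5, 10, 20, 40)  # widths of M1..M4 from the text, in micrometers

def total_width_um(control_bits):
    """Total enabled switch width for a 4-bit control word (bit 0 = M1, assumed)."""
    return sum(w for i, w in enumerate(SWITCH_WIDTHS_UM) if (control_bits >> i) & 1)

def control_for_width(target_um):
    """Smallest control word whose total width is at least target_um."""
    for code in range(16):
        if total_width_um(code) >= target_um:
            return code
    raise ValueError("target exceeds the maximum selectable width")
```

With these widths the selectable total runs from 0 to 75 µm in 5 µm steps, so a single control word can trade drive strength against switching loss at each operating frequency.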

Fig. 8. The structure of the adiabatic register file.

Fig. 9. The structure of the supply clock generator.

Fig. 10. A schematic of the supply clock circuit.

| Node            | S0   | S1   | S2   | S3   |
|-----------------|------|------|------|------|
| $C_{eq}$ (pF)   | 1.67 | 2.80 | 1.16 | 2.26 |

Table 1. The equivalent capacitances of supply clock nodes.

The operating frequency of a microprocessor is determined by the resonant frequency calculated from an external inductance and the equivalent capacitance of the supply clock node. The equivalent capacitance can be obtained using the equation $T=RC_{eq}$, where $T$ is the RC time constant and $R$ is a known resistance. After connecting a small resistor to a power node, a step input signal is applied to the resistor to measure the RC time constant. The equivalent node capacitance is calculated by dividing the time constant by the resistance. Table 1 shows the calculated equivalent capacitances of the supply clock nodes for the proposed adiabatic microprocessor.
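The two calculations described above can be sketched as follows. The capacitance extraction follows the text directly ($C_{eq} = T/R$). The inductance calculation assumes the ideal LC relation $f = 1/(2\pi\sqrt{LC})$ for a single inductor and a single lumped capacitance; the actual generator may load the inductor differently, and the 100 Ω resistor and 167 ps time constant in the usage lines are hypothetical values chosen only to reproduce the S0 entry of Table 1.

```python
import math

def equivalent_capacitance(time_constant_s, resistance_ohm):
    """C_eq = T / R, from the measured RC time constant and the known resistor."""
    return time_constant_s / resistance_ohm

def inductance_for_frequency(freq_hz, capacitance_f):
    """Inductance that resonates with capacitance_f at freq_hz (ideal LC, assumed)."""
    return 1.0 / ((2 * math.pi * freq_hz) ** 2 * capacitance_f)

def resonant_frequency(inductance_h, capacitance_f):
    """Ideal LC resonant frequency f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2 * math.pi * math.sqrt(inductance_h * capacitance_f))

# Hypothetical measurement: 100 ohm resistor, 167 ps time constant -> 1.67 pF
c_s0 = equivalent_capacitance(167e-12, 100.0)
# Inductance for a 50 MHz supply clock on a 2.80 pF node (ideal model)
l_ext = inductance_for_frequency(50e6, 2.80e-12)
```

Under this ideal model the required inductance for a few picofarads at tens of megahertz comes out in the microhenry range, which is consistent with the use of an external inductor.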

Since the equivalent capacitance of each supply clock node is different, small external capacitors are added to match the equivalent capacitances of the supply clock nodes. The amplitude of the supply clock grows as the DC power supplies current through the MOS switches.
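As a worked example of the matching step, the external capacitor for each node can be estimated from the Table 1 values. The sketch assumes that every node is padded up to the largest equivalent capacitance (2.80 pF at S1); the text does not state the matching target, so this choice and the function name are assumptions.

```python
C_EQ_PF = {"S0": 1.67, "S1": 2.80, "S2": 1.16, "S3": 2.26}  # Table 1 values

def padding_capacitors_pf(c_eq_pf):
    """External capacitance per node so that all nodes equal the largest C_eq (assumed target)."""
    target = max(c_eq_pf.values())
    return {node: round(target - c, 2) for node, c in c_eq_pf.items()}

padding = padding_capacitors_pf(C_EQ_PF)
```

Under this assumption the added capacitors are all below 2 pF, in line with the description of them as small external capacitors.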

## V. Simulation Result and Layout

The energy dissipation of an ECRL inverter circuit is shown in Fig. 11. This figure shows that energy is recovered as the voltage of the supply clock goes down. The energy consumption is calculated by integrating the voltage and current product value.
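The integration of the voltage and current product can be sketched for sampled waveforms such as those exported from a transient simulation. The snippet uses the trapezoidal rule; the function name and the sample data are hypothetical, and the sign convention (positive current flows from the supply clock into the circuit) is an assumption.

```python
def energy_joules(times, voltages, currents):
    """Trapezoidal integral of v(t) * i(t) over sampled waveforms.

    Assumed sign convention: positive current flows from the supply into the
    circuit, so intervals of negative current subtract (recover) energy.
    """
    if not (len(times) == len(voltages) == len(currents)):
        raise ValueError("waveforms must have the same number of samples")
    energy = 0.0
    for k in range(1, len(times)):
        p_prev = voltages[k - 1] * currents[k - 1]   # instantaneous power at sample k-1
        p_curr = voltages[k] * currents[k]           # instantaneous power at sample k
        energy += 0.5 * (p_prev + p_curr) * (times[k] - times[k - 1])
    return energy
```

When the current reverses while the supply clock ramps down, the running integral decreases, which corresponds to the recovery portion of the curve in Fig. 11. The net value at the end of a cycle is the energy actually dissipated.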

The 16-bit adiabatic microprocessor is designed using a 0.35 µm CMOS technology, and SPICE simulations are carried out using layout extracted net-lists. A conventional CMOS microprocessor with the single-ended signal scheme is also designed to compare the power consumption. The CMOS microprocessor is designed by removing the ECRL registers and adding static CMOS registers in the adiabatic microprocessor.

The efficiency of the supply clock generator varies according to the size of the switching transistors, and is simulated by changing the combination of the four binary-weighted switching transistors. The required size of the switching transistor increases as the operation frequency goes up. This relation indicates that a larger current is necessary for a higher frequency operation. When the width of an NMOS switch transistor is 40 $\mu$m, the efficiency is optimal at the operating frequency of 50 MHz.



Fig. 11. Energy recovery graph of an ECRL inverter.

Figure 12 shows the lower 9-bit operation results of two registers, R1 and R2, in the register file with test instruction codes that execute logical operations of AND, OR, XOR, NOR, and the arithmetic operations of addition and subtraction. R1 has (0000 0100 0000 0011)<sub>2</sub>, while R2 has (0000 0001 0000 1101)<sub>2</sub>.
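The expected lower 9-bit results for these operand values can be worked out with a short calculation. The sketch below assumes 16-bit wrap-around arithmetic and that subtraction computes R1 minus R2; the operand order and the dictionary keys are assumptions, since the test instruction codes are not listed.

```python
R1 = 0b0000_0100_0000_0011  # 0x0403
R2 = 0b0000_0001_0000_1101  # 0x010D
MASK16 = 0xFFFF
LOW9 = 0x1FF                # lower 9 bits, as plotted in Fig. 12

def lower9_results(a, b):
    """Lower 9 bits of each 16-bit operation result (SUB assumed to be a - b)."""
    full = {
        "AND": a & b,
        "OR": a | b,
        "XOR": a ^ b,
        "NOR": ~(a | b) & MASK16,
        "ADD": (a + b) & MASK16,
        "SUB": (a - b) & MASK16,
    }
    return {name: value & LOW9 for name, value in full.items()}

results = lower9_results(R1, R2)
```

For these operands the lower 9 bits evaluate to 0x001 (AND), 0x10F (OR), 0x10E (XOR), 0x0F0 (NOR), 0x110 (ADD), and 0x0F6 (SUB).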

The energy consumption of the ECRL and conventional CMOS microprocessors is compared in Fig. 13 [13], [14]. The energy of the microprocessors is compared for various operation frequencies without changing instruction codes. The energy consumption of the adiabatic microprocessor is about 1/3 of that of the static CMOS microprocessor.

Fig. 12. Simulation results of an adiabatic microprocessor.

Fig. 13. Energy consumption versus operation frequencies for CMOS and adiabatic microprocessors.

The proposed 16-bit adiabatic RISC microprocessor is designed by a full-custom method using a 0.35 µm one-poly, three-metal CMOS process. The layout of the proposed adiabatic processor is shown in Fig. 14. The size of the chip is $4 \times 4$ mm.

## VI. Conclusions

An adiabatic RISC microprocessor is designed using a 0.35 µm CMOS technology. It includes ECRL registers, which are the first reported ECRL storage elements. A conventional static CMOS microprocessor with the same structure is also designed for an energy comparison. The proposed adiabatic RISC microprocessor is designed based on ECRL circuits except for the storage cell of the register file, and it includes an efficient power supply clock generator so that it works with an external DC power supply and four-phase square waves. We perform the energy and functional simulations using the net-lists extracted from the layout, and verify that the microprocessor works correctly. The energy consumption of the adiabatic microprocessor is lower than that of the conventional static CMOS microprocessor by a factor of about 3. The prototype design of a 16-bit RISC microprocessor with ECRL circuits shows the feasibility of adiabatic circuits in low power applications.

Fig. 14. The chip layout of the 16-bit RISC adiabatic microprocessor.

## References

- [1] J.S. Denker, "A Review of Adiabatic Computing," *IEEE Symp. Low Power Electronics*, 1994, pp. 94-97.
- [2] A. Kramer, J.S. Denker, S.C. Avery, A.G. Dickinson, and T.R. Wik, "Adiabatic Computing with the 2N-2N2D Logic Family," *Symp. VLSI Circuits Dig. of Tech. Papers*, 1994, pp. 25-26.
- [3] R.T. Hinman and M.F. Schlecht, "Power Dissipation Measurements on Recovered Energy Logic," *Symp. VLSI Circuits*, 1994, pp. 19-20.
- [4] A.G. Dickinson and J.S. Denker, "Adiabatic Dynamic Logic," *IEEE J. Solid State Circuits*, vol. 30, 1995, pp. 311-315.
- [5] C.W. Kim, S.M. Yoo, and M.S. Kang, "Low-Power Adiabatic Computing with NMOS Energy Recovery Logic," *Electronics Lett.*, vol. 36, no. 16, 2000, pp. 1349-1350.
- [6] H. Mahmoodi-Meimand, A. Afzali-Kusha, and M. Nourani, "Adiabatic Carry Look-Ahead Adder with Efficient Power Clock Generator," *IEEE Proc.*, vol. 148, 2001, pp. 229-234.
- [7] L. Varga, F. Kovacs, and G. Hosszu, "An Efficient Adiabatic Charge-Recovery Logic," *IEEE Proc.* Southeastcon, 2001, pp. 17-20.
- [8] Y. Moon and D.K. Jeong, "An Efficient Charge Recovery Logic Circuit," *IEEE J. Solid State Circuits*, vol. 31, no. 4, 1996, pp. 514-522.
- [9] R. Brent and H.T. Kung, "A Regular Layout for Parallel Adders," *IEEE Trans. Computers*, vol. C-31, no. 3, 1982, pp. 260-264.
- [10] Jan M. Rabaey, *Digital Integrated Circuits*, Prentice Hall, 1996, pp. 359-362, p. 209.
- [11] Joonho Lim, Kipaek Kwon, and Soo-Ik Chae, "Reversible Energy Recovery Logic Circuit without Non-adiabatic Energy Loss," *Electronics Lett.*, vol. 34, no. 4, 1998, pp. 344-346.
- [12] H.S. Lee, I.H. Na, C. Leem, and Y. Moon, "A 16-Bit Adiabatic Macro Blocks with Supply Clock Generator for Micro-Power RISC Datapath," *ITC-CSCC 2002*, 2002, pp. 1563-1566.
- [13] Hyun-Gyu Kim, Dae-Young Jung, Hyun-Sup Jung, Young-Min Chio, Jung-Su Han, Byung-Gueon Min, and Hyeong-Cheol Oh, "AE32000B: a Fully Synthesizable 32-Bit Embedded Microprocessor Core," *ETRI J.*, vol. 25, no. 5, 2003, pp. 337-344.
- [14] Kyoung Park, Sung-Hoon Choi, Yongwha Chung, Woo-Jong Hahn, and Suk-Han Yoon, "On-chip Multiprocessor with Simultaneous Multithreading," *ETRI J.*, vol. 22, no. 4, 2000, pp. 13-24.



Youngjoon Shin received the BS and MS degrees in electronic engineering from Soongsil University in 2002 and 2004. Since 2004, he has worked for BOE HYDIS. His research interests are in low power systems and microprocessor architecture.



**Chanho Lee** received the BS and the MS degrees in electronic engineering from Seoul National University, Seoul, Korea, in 1987 and in 1989, and the PhD degree from the University of California, Los Angeles, in 1994. In 1994, he joined the Semiconductor R&D Center of Samsung Electronics, Kiheung, Korea.

Since 1995, he has been a faculty member of the School of Electronic Engineering, Soongsil University, Seoul, Korea, and he is currently an Associate Professor. His research interests are in the design of a channel codec and security processor, MPEG codec, low power design, and on-chip network. He is a senior member of IEEE.



Yong Moon received the BS, MS, and PhD degrees in electronic engineering from Seoul National University in 1990, 1992, and 1997. From 1997 to 1999, he was a Senior Research Engineer at LG Semicon, Inc. (currently HYNIX). Since 1999, he has been on the electronic engineering faculty at Soongsil University. His research interests are in low power circuits, PLL and CMOS RF circuits.