# Power-Delay Product Optimization of Heterogeneous Adder Using Integer Linear Programming

Sanghoon Kwak\*, Jeong-Gun Lee\*\*, Jeong-A Lee\*\*\*

#### Abstract

In this paper, we propose a methodology that optimizes the power-delay product (PDP) of a binary adder based on the heterogeneous adder architecture. We formulate the power-delay product of the heterogeneous adder using integer linear programming (ILP). To make ILP optimization applicable, we adopt a transformation technique that converts the initial non-linear expression for the power-delay product into a linear expression. The experimental results show that the proposed method outperforms designs that use only a single conventional adder type.

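#### Summary

This paper proposes a methodology for optimizing the power-delay product of a binary adder based on the heterogeneous adder architecture. The power-delay product of the heterogeneous adder is formulated with integer linear programming. To make integer linear programming applicable, we adopt a technique that transforms the initial non-linear expression of the power-delay product into a linear expression. In addition, experimental results confirm that the proposed method is superior to conventional adders in terms of the power-delay product (PDP).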

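► Keywords: heterogeneous adder, power-delay product, integer linear programming

<sup>•</sup>First author: Sanghoon Kwak, Corresponding author: Jeong-A Lee

<sup>•</sup> Received: 2010. 06. 17, Reviewed: 2010. 07. 06, Accepted: 2010. 07. 26.

<sup>\*</sup>Postdoctoral Researcher, School of Electrical Engineering, Seoul National University \*\* Assistant Professor, Department of Computer Engineering, Hallym University \*\*\* Professor, Department of Computer Engineering, Chosun University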

#### I. Introduction

The power-delay product (PDP) measures how power-effective a digital circuit is by considering its delay performance together with its power consumption. For modern digital circuits, a binary adder should be optimized for PDP rather than for power and delay independently [1]. Mixing multiple adder implementation types, such as the ripple-carry adder (RCA), carry-lookahead adder (CLA), and carry-skip adder (CSKA), to optimize delay, area, and power was introduced in past research [2][3].

In this paper, we formulate an ILP model for the PDP based on the heterogeneous adder architecture. The delay and power formulations of the heterogeneous adder are taken from [4]. Simply multiplying the two expressions for the power and delay of the heterogeneous adder invalidates the ILP formulation, because the product contains terms that multiply two integer variables. Having more than one integer variable in any single term of an ILP model removes the linearity of the model. This prevents us from applying ILP directly to optimize the PDP of the heterogeneous adder. Thus, the non-linear expression of the PDP must be transformed into a linear expression.

By using the heterogeneous adder architecture and the non-linear-to-linear transformation scheme, a lower PDP can be achieved than with a conventional adder architecture. This is because the heterogeneous adder exposes an expanded PDP design space, as we observed for area-constrained delay optimization and vice versa [5]. The PDP of the heterogeneous adder is optimized with a linear program solver [6]. The experimental results show the reduction in PDP of the heterogeneous adder relative to conventional adders.

#### II. Backgrounds

The generalized architecture and delay model of the heterogeneous adder are illustrated in Fig. 1 [3]. $SA_i(n_i)$ denotes a sub-adder with carry propagation scheme $SA_i$ and bit-width $n_i$. With $I$ available sub-adders, the $n$-bit heterogeneous adder is defined as the concatenation of the sub-adders $SA_i(n_i)$, where $\sum n_i = n$. The carry-out signal of $SA_i$, $C_{out}(SA_i)$, is used as the carry-in signal of $SA_{i+1}$, $C_{in}(SA_{i+1})$. Combining the sub-adders $SA_i(n_i)$ and varying $n_i$ for each $SA_i$ lets us explore a more fine-grained design space for performance metrics such as delay, area, and power than a conventional adder allows [4], [5]. Deciding the proper $n_i$ of each sub-adder $SA_i(n_i)$ for PDP optimization in the heterogeneous architecture is the main goal of the approach presented in this paper.
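The carry chaining between sub-adders can be illustrated with a purely behavioural model. The following Python sketch is only an illustration: the function name `add_heterogeneous` and the convention that widths are listed from the least significant sub-adder upward are assumptions made here, and the model ignores delay, area, and power, which are the only properties in which RCA, CSKA, and CLA actually differ.

```python
def add_heterogeneous(a, b, widths, carry_in=0):
    """Add two unsigned integers with a chain of sub-adders.

    widths lists the bit-width n_i of each sub-adder, starting from the
    least significant bits; a width of 0 means the sub-adder is unused.
    """
    result, shift, carry = 0, 0, carry_in
    for width in widths:
        if width == 0:
            continue
        mask = (1 << width) - 1
        # Slice out the operand bits handled by this sub-adder.
        part = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (part & mask) << shift
        carry = part >> width  # C_out(SA_i) becomes C_in(SA_{i+1})
        shift += width
    return result, carry


# A 16-bit adder split into 4-bit, 8-bit and 4-bit sub-adders.
total, carry_out = add_heterogeneous(0xFFFF, 1, [4, 8, 4])
```

Whatever the split, the concatenated result equals the ordinary sum as long as the widths add up to the operand width, which is why the optimization can choose the widths freely.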

The PDP metric is especially meaningful in digital signal processing applications and mobile systems, because such systems require not only low power consumption but also high-speed operation [7].



Fig. 1 Generalized architecture and delay modeling of a heterogeneous adder



A smaller PDP means a more power-effective system: the system consumes less power at the same operating speed.

For a specific binary adder type, the PDP varies with the type of implementation, the degree of optimization, and the process technology. Generally, the carry-lookahead adder (CLA) is known to be superior in terms of PDP [8]. Figure 2 shows the PDP of binary adders implemented with a 0.18 µm CMOS library for varying bit-widths. At a bit-width of 128, the PDP decreases in the order of ripple-carry adder (RCA), carry-skip adder (CSKA), and CLA. This comparison implies that, although the CLA consumes more power than the CSKA and RCA, the delay reduction from the carry-lookahead architecture compensates for the increase in power consumption. In other words, the CLA is the most power-efficient when its delay is taken into account.

Thus, for an application that requires low power consumption together with high speed, the CLA is the most appropriate among the adder types shown in Fig. 2. As the bit-width of each sub-adder increases, its PDP increases too. The RCA always has a larger PDP than the CLA and CSKA. However, for the CLA the PDP is smaller when its bit-width is lower than 64. At a bit-width of 128, the CLA has the smallest PDP.

By using the heterogeneous adder architecture, we can exploit heterogeneous adder designs in the design space represented by the area between the PDP curves.

#### III. ILP Formulation for PDP Optimization of Heterogeneous Adder

The heterogeneous adder architecture is presented in [5] and is applied to delay-constrained power optimization using ILP in [4]. In this section, we propose the ILP formulation for the PDP optimization of the heterogeneous adder. Specifically, we present the ILP formulation for area-constrained PDP optimization, since the delay and power of a digital circuit usually trade off against its area.

The ILP formulation requires transforming a non-linear expression into a linear expression, since the original PDP expression, obtained by multiplying the delay and power of a heterogeneous adder, is non-linear. The average power consumption and the delay of the heterogeneous adder can each be represented as an integer linear expression. The PDP of the heterogeneous adder is then the product of these two integer linear expressions.

As presented in [4], POWER(Heterogeneous Adder) and AREA(Heterogeneous Adder) can be expressed as follows:

$$\text{POWER(Heterogeneous Adder)} = \sum_{i=1}^{I}\sum_{n_i=0}^{n}P_{n_i}^{SA_i} \times x_{n_i}^{SA_i} \tag{1}$$

$$\text{AREA(Heterogeneous Adder)} = \sum_{i=1}^{I}\sum_{n_i=0}^{n}A_{n_i}^{SA_i} \times x_{n_i}^{SA_i} \tag{2}$$

Equations (1) and (2) are subject to $\sum_{n_i=0}^{n} x_{n_i}^{SA_i} \leq 1$ for each sub-adder type $SA_i$.

In Equations (1) and (2), $P_{n_i}^{SA_i}$ and $A_{n_i}^{SA_i}$ are the power consumption and area of the $i$-th type of sub-adder with bit-width $n_i$, respectively, and $x_{n_i}^{SA_i}$ is a binary integer variable taking the value 0 or 1. The inequality constraint $\sum_{n_i=0}^{n} x_{n_i}^{SA_i} \leq 1$ means that at most one bit-width is selected for each type of sub-adder.
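To make the role of the binary selection variables concrete, the following Python sketch evaluates a double sum of this form for one choice of bit-widths. The cost tables contain made-up numbers in arbitrary units, only two sub-adder types and three widths are listed, and the helper names `one_hot` and `linear_cost` are hypothetical.

```python
# Hypothetical per-width cost tables (arbitrary units), indexed by
# sub-adder type and bit-width; a zero width costs nothing.
POWER = {"CLA": {0: 0.0, 4: 2.0, 8: 4.5}, "RCA": {0: 0.0, 4: 1.0, 8: 2.0}}
AREA = {"CLA": {0: 0, 4: 60, 8: 130}, "RCA": {0: 0, 4: 30, 8: 60}}


def one_hot(choice):
    """Build x[type][width] in {0, 1} from a {type: width} choice."""
    return {t: {w: int(w == choice.get(t, 0)) for w in POWER[t]} for t in POWER}


def linear_cost(table, x):
    """Evaluate sum_i sum_{n_i} table[i][n_i] * x[i][n_i]."""
    return sum(table[t][w] * x[t][w] for t in table for w in table[t])


x = one_hot({"CLA": 4, "RCA": 8})  # a 4-bit CLA followed by an 8-bit RCA
power = linear_cost(POWER, x)
area = linear_cost(AREA, x)
```

Because each type has at most one variable set to 1, the double sum simply picks one table entry per sub-adder type, while remaining linear in the variables.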

The order of the sub-adders affects the delay of a heterogeneous adder. Depending on the order, the carry generation of sub-adders located in the most significant bit (MSB) part can overlap the sum generation of sub-adders located in the least significant bit (LSB) part, as shown in Fig. 1. The order of the sub-adders is fixed such that $SA_1$ = CLA, $SA_2$ = CSKA, and $SA_3$ = RCA. Fixing the order reduces the design space of the ILP for PDP optimization, since this order minimizes the delay of the heterogeneous adder for a given combination of sub-adders.

Therefore, the delay of the heterogeneous adder is defined as follows:

DELAY(Heterogeneous Adder) = max{ $D_{1}, D_{2}, ..., D_{I}$ }

Here, $D_{1}$ and $D_{i}$ are defined as follows:

$$D_{1} = \sum_{n_{i}=0}^{n} D_{n_{i},S}^{SA_{i}} \times x_{n_{i}}^{SA_{i}}, \quad (i = 1) \tag{3}$$

$$D_{i} = \sum_{k=2}^{i}\sum_{n_{k-1}=0}^{n} D_{n_{k-1},C}^{SA_{k-1}} \times x_{n_{k-1}}^{SA_{k-1}} + \sum_{n_{i}=0}^{n} D_{n_{i},S}^{SA_{i}} \times x_{n_{i}}^{SA_{i}}, \quad (1 < i \leq I) \tag{4}$$

In (3) and (4),  $D_{n_i,S}^{SA_i}$  and  $D_{n_i,C}^{SA_i}$  indicate the sum delay and the carry delay of i-th type of sub-adder, respectively. Also,  $x_{n_i}^{SA_i}$  is a binary integer variable as in the case of power and area modeling.
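The structure of (3) and (4) can be sketched in a few lines of Python once the bit-widths have been chosen: each candidate delay is the accumulated carry delay of the preceding sub-adders plus the sum delay of the current one. The function name and the sample delay values below are hypothetical and serve only as an illustration.

```python
def heterogeneous_delay(sum_delay, carry_delay):
    """Return max{D_1, ..., D_I} for already selected sub-adders.

    sum_delay[i] and carry_delay[i] are the sum and carry delays of the
    (i+1)-th sub-adder at its chosen bit-width (0.0 if it is unused).
    """
    candidates = []
    carry_so_far = 0.0
    for d_sum, d_carry in zip(sum_delay, carry_delay):
        # D_i = carry delays of all preceding sub-adders + own sum delay.
        candidates.append(carry_so_far + d_sum)
        carry_so_far += d_carry
    return max(candidates)


# Hypothetical delays for three sub-adders in the order SA_1, SA_2, SA_3.
delay = heterogeneous_delay([3.0, 2.0, 4.0], [1.0, 1.5, 4.0])
```

In this example the candidates are 3.0, 3.0, and 6.5, so the last sub-adder determines the overall delay.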

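Thus, area-constrained PDP optimization is formulated as follows:

$$\underset{(n_1,n_2,...,n_I)}{\operatorname{arg}}\min\{v_{PDP}\}$$

under the constraints

$$\begin{aligned} \text{PDP(Heterogeneous Adder)} &\leq v_{PDP} \\ \text{AREA(Heterogeneous Adder)} &\leq \Theta_{AREA} \\ \sum_{i=1}^{I} n_i &= n \end{aligned} \tag{5}$$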

In the above expressions, $\Theta_{AREA}$ denotes the upper bound on the area allowed for PDP optimization of the heterogeneous adder instance. $v_{PDP}$ is a variable indicating the upper bound of the PDP, and it also serves as the minimax objective in the ILP formulation for area-constrained PDP optimization [9]. Thus, the PDP of the heterogeneous adder can be modeled as follows:

PDP(Heterogeneous Adder) = max{ $PDP_1, ..., PDP_I$ }

$$\begin{aligned} PDP_{1} &= \Big(\sum_{n_{i}=0}^{n} P_{n_{i}}^{SA_{i}} \times x_{n_{i}}^{SA_{i}}\Big)\Big(\sum_{n_{j}=0}^{n} D_{n_{j},S}^{SA_{j}} \times x_{n_{j}}^{SA_{j}}\Big) \\ &= \sum_{n_{i}=0}^{n}\sum_{n_{j}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{j},S}^{SA_{j}} \times x_{n_{i,j}}^{SA_{i,j}}, \quad (j = 1,\ SA_i = SA_j) \end{aligned} \tag{6}$$

$$\begin{aligned} PDP_{j} &= \Big(\sum_{i=1}^{I} \sum_{n_{i}=0}^{n} P_{n_{i}}^{SA_{i}} \times x_{n_{i}}^{SA_{i}}\Big) \Big(\sum_{k=2}^{j}\sum_{n_{k-1}=0}^{n} D_{n_{k-1},C}^{SA_{k-1}} \times x_{n_{k-1}}^{SA_{k-1}} + \sum_{n_{j}=0}^{n} D_{n_{j},S}^{SA_{j}} \times x_{n_{j}}^{SA_{j}}\Big) \\ &= \sum_{i=1}^{I}\sum_{n_{i}=0}^{n} \Big(\sum_{k=2}^{j}\sum_{n_{k-1}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{k-1},C}^{SA_{k-1}} \times x_{n_{i,k-1}}^{SA_{i,k-1}} + \sum_{n_{j}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{j},S}^{SA_{j}} \times x_{n_{i,j}}^{SA_{i,j}}\Big), \quad (1 < j \leq I) \end{aligned} \tag{7}$$

where $x_{n_{i,j}}^{SA_{i,j}} = x_{n_{i}}^{SA_{i}} \cdot x_{n_{j}}^{SA_{j}}$ in Equations (6) and (7).

The index $n_j$ is used in the term $D_{n_j,S}^{SA_j}$ to distinguish it from the index $n_i$ of the preceding term $P_{n_i}^{SA_i} \times x_{n_i}^{SA_i}$ in Equation (6). However, both $SA_i$ and $SA_j$ indicate the same sub-adder instance assigned to a heterogeneous adder instance, since Equation (6) covers the case in which only one type of sub-adder is assigned.

In Equations (6) and (7), we define a new variable $x_{n_{i,j}}^{SA_{i,j}}$ to remove the non-linearity introduced by multiplying two linear expressions in the ILP formulation for PDP optimization. The original formulation contains terms of the form $x_{n_i}^{SA_i} \cdot x_{n_j}^{SA_j}$, each of which multiplies two variables of the ILP formulation. Thus, by substituting $x_{n_i}^{SA_i} \cdot x_{n_j}^{SA_j}$ with a new variable $x_{n_{i,j}}^{SA_{i,j}}$ and introducing additional constraints, we obtain a proper ILP formulation for the PDP optimization of the heterogeneous adder.
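As a small worked illustration of this substitution, consider a power expression and a delay expression with two candidate bit-widths each. The shorthand symbols $x_a, x_b, y_c, y_d$ and $z_{a,c}, \ldots, z_{b,d}$ are hypothetical and are used only to keep the example short:

$$\begin{aligned} (P_a x_a + P_b x_b)(D_c y_c + D_d y_d) &= P_a D_c\, x_a y_c + P_a D_d\, x_a y_d + P_b D_c\, x_b y_c + P_b D_d\, x_b y_d \\ &= P_a D_c\, z_{a,c} + P_a D_d\, z_{a,d} + P_b D_c\, z_{b,c} + P_b D_d\, z_{b,d} \end{aligned}$$

Each product of two binary variables, such as $x_a y_c$, is replaced by a single binary variable $z_{a,c}$. The last line is linear in the $z$ variables, provided that every $z$ is tied to its product by additional linear constraints.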

In other words, the following condition should be satisfied:

$$x_{n_{i,j}}^{SA_{i,j}} = 1 \Leftrightarrow (x_{n_i}^{SA_i} = 1 \ \mathrm{and}\ x_{n_j}^{SA_j} = 1),$$

where $x_{n_i}^{SA_i}$ and $x_{n_j}^{SA_j}$ are binary variables.

To satisfy the above condition, the following additional constraints are required [9]:

$$\begin{aligned} -x_{n_i}^{SA_i}+x_{n_{i,j}}^{SA_{i,j}} &\leq 0 \\ -x_{n_j}^{SA_j}+x_{n_{i,j}}^{SA_{i,j}} &\leq 0 \\ x_{n_i}^{SA_i}+x_{n_j}^{SA_j}-x_{n_{i,j}}^{SA_{i,j}} &\leq 1 \end{aligned}$$
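The effect of these three constraints can be checked by brute force. The following minimal Python sketch enumerates all binary assignments and keeps those that satisfy the constraints; the function name and the short variable names `x`, `y`, `z` (standing for the two original variables and the new product variable) are illustrative and not taken from the paper.

```python
from itertools import product


def satisfies_linearization(x, y, z):
    """Check the three linear constraints that tie z to the product x*y."""
    return (-x + z <= 0) and (-y + z <= 0) and (x + y - z <= 1)


# Keep every binary assignment (x, y, z) that passes all three constraints.
feasible = [(x, y, z) for x, y, z in product((0, 1), repeat=3)
            if satisfies_linearization(x, y, z)]
```

Only four of the eight assignments survive, and in each of them `z` equals `x * y`, so the new variable behaves exactly like the product it replaces.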

By incorporating the newly defined variable and the additional constraints, we obtain the following ILP formulation of the PDP of the heterogeneous adder:

$$\underset{(n_1,n_2,...,n_I)}{\operatorname{arg}}\min\{v_{PDP}\}$$

subject to

$$1: \sum_{n_{i}=0}^{n} \sum_{n_{j}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{j},S}^{SA_{j}} \times x_{n_{i,j}}^{SA_{i,j}} \le v_{PDP}, \text{ for all } SA_{j}, (1 \le i \le I, SA_{i} = SA_{j})$$

$$2: \sum_{i=1}^{I} \sum_{n_{i}=0}^{n} \Big(\sum_{k=2}^{j}\sum_{n_{k-1}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{k-1},C}^{SA_{k-1}} \times x_{n_{i,k-1}}^{SA_{i,k-1}} + \sum_{n_{j}=0}^{n} P_{n_{i}}^{SA_{i}} \times D_{n_{j},S}^{SA_{j}} \times x_{n_{i,j}}^{SA_{i,j}}\Big) \le v_{PDP}, (1 < j \le I)$$

$$3: \sum_{i=1}^{I} \sum_{n_{i}=0}^{n} A_{n_{i}}^{SA_{i}} \times x_{n_{i}}^{SA_{i}} \le \Theta_{AREA}$$

$$4: \sum_{n_{i}=0}^{n} x_{n_{i}}^{SA_{i}} \le 1, \text{ for all } SA_{i}$$

$$5: \sum_{i=1}^{I} \sum_{n_{i}=0}^{n} n_{i} \times x_{n_{i}}^{SA_{i}} = n$$

and, for all $SA_i$, $SA_j$, $n_i$, and $n_j$ ($1 \le i \le I$, $1 \le j \le I$, $1 \le n_i \le n$, and $1 \le n_j \le n$),

$$\begin{aligned} 6: -x_{n_i}^{SA_i} + x_{n_{i,j}}^{SA_{i,j}} &\leq 0 \\ 7: -x_{n_j}^{SA_j} + x_{n_{i,j}}^{SA_{i,j}} &\leq 0 \\ 8: x_{n_i}^{SA_i} + x_{n_j}^{SA_j} - x_{n_{i,j}}^{SA_{i,j}} &\leq 1 \end{aligned}$$
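For small instances, a formulation of this kind can be cross-checked by exhaustive search instead of an ILP solver. The Python sketch below is such a check and not the solver setup used in this paper: the cost tables are made-up numbers in arbitrary units for an 8-bit adder, and the function names are hypothetical. Because the total power is non-negative, multiplying it by the maximum delay gives the same value as the maximum over the per-stage PDP terms.

```python
from itertools import product

TYPES = ("CLA", "CSKA", "RCA")  # fixed order SA_1, SA_2, SA_3

# Hypothetical cost tables indexed by bit-width (arbitrary units).
POWER = {"CLA": {0: 0, 4: 8, 8: 16}, "CSKA": {0: 0, 4: 6, 8: 12},
         "RCA": {0: 0, 4: 4, 8: 8}}
AREA = {"CLA": {0: 0, 4: 60, 8: 120}, "CSKA": {0: 0, 4: 40, 8: 80},
        "RCA": {0: 0, 4: 20, 8: 40}}
SUM_DELAY = {"CLA": {0: 0, 4: 2, 8: 3}, "CSKA": {0: 0, 4: 3, 8: 5},
             "RCA": {0: 0, 4: 4, 8: 8}}
CARRY_DELAY = {"CLA": {0: 0, 4: 1, 8: 2}, "CSKA": {0: 0, 4: 2, 8: 4},
               "RCA": {0: 0, 4: 4, 8: 8}}


def evaluate(widths):
    """Return (PDP, area) of one bit-width assignment (n_1, n_2, n_3)."""
    power = sum(POWER[t][w] for t, w in zip(TYPES, widths))
    area = sum(AREA[t][w] for t, w in zip(TYPES, widths))
    carry, delay = 0, 0
    for t, w in zip(TYPES, widths):
        delay = max(delay, carry + SUM_DELAY[t][w])  # candidate D_j
        carry += CARRY_DELAY[t][w]
    return power * delay, area


def best_under_area(n, area_bound):
    """Exhaustively minimise PDP subject to sum(n_i) = n and the area bound."""
    best = None
    for widths in product(sorted(POWER["CLA"]), repeat=len(TYPES)):
        if sum(widths) != n:
            continue
        pdp, area = evaluate(widths)
        if area <= area_bound and (best is None or pdp < best[0]):
            best = (pdp, area, widths)
    return best
```

With these made-up tables, `best_under_area(8, 100)` selects a 4-bit CLA followed by a 4-bit CSKA, a mixed design whose PDP is lower than that of any single-type adder that fits within the same area bound.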

#### IV. Experimental Results

To show the effectiveness of the proposed method, we performed PDP optimization experiments with the derived ILP models. Three types of sub-adders, CLA ($=SA_1$), CSKA ($=SA_2$), and RCA ($=SA_3$), were used, and their sizes varied from 4 bits to 128 bits in increments of 4 bits. All sub-adder instances were implemented by a Synopsys tool with the ANAM 0.18 µm CMOS library [10]. The delay and the average power consumption were obtained from the timing and power simulation results of the tool.

In Fig. 3, the PDP design space generated by all possible combinations of sub-adder bit-widths (here, $I = 3$) is depicted as a three-dimensional surface. The X-axis and the Y-axis indicate the bit-widths of the CLA and CSKA assigned to a heterogeneous adder, respectively. The Z-axis gives the PDP value at the point designated by each sub-adder combination. For example, when $X = n_1 = 128$ and $Y = n_2 = 0$, the remaining $n_3$ becomes 0 and the corresponding PDP value is 8.911 pJ. Finding a solution of the ILP formulation for PDP optimization amounts to seeking the lowest point in this three-dimensional graph. As shown in Fig. 3, the PDP of the 128-bit CLA is the lowest in the whole design space.

Fig. 3. PDP design space of the heterogeneous adder covered by the area upper bound, 1270. (Unit : # of NAND gates)

Figure 4 shows the result of PDP optimization as the area upper bound is increased in steps of 25. The unit of PDP is pJ, since the product of delay (ns) and average power consumption (µW) has the unit of energy. Without any area upper bound, the optimized PDP is obtained with a single 128-bit CLA. This means that the CLA is the most beneficial of the three sub-adder types in terms of PDP.

Also, in Fig. 4, each combination of sub-adder bit-widths found by ILP optimization is given with a pair consisting of the optimized PDP and the actual area at that point. For example, at the area upper bound 2300, the pair of the optimized PDP value and the area is shown in parentheses as (11.381, 2290) with the sub-adder combination 'CLA112+CSKA12+RCA4'. Here, the operator '+' denotes the concatenation of sub-adders. In the interval $\Theta_{AREA} < 1200$, only combinations of CSKA and RCA (without CLA) are used for the optimized PDP. At the area upper bound 1200, the optimized PDP of the heterogeneous adder is 13.649 pJ with an actual area of 1164. In the interval $1175 < \Theta_{AREA} \leq 2100$, the heterogeneous adder is configured as 'CSKA124+RCA4', as shown in Fig. 4.

Figure 3 explains why a flat curve appears in the interval $1175 < \Theta_{AREA} \le 2100$ of the X-axis in Fig. 4. A cutting plane, $CP_i$, is created from the sub-adder combinations of heterogeneous adders with the same area. $CP_1$ and $CP_2$ in Fig. 3 indicate the sub-adder combinations with areas 1175 and 2100, respectively. Wherever a new $CP_i$ indicating an area upper bound is placed between $CP_1$ and $CP_2$, the lowest PDP point in $PDS_2$ will remain at the sub-adder combination 'CSKA124+RCA4' in $PDS_3$, as shown in Fig. 3.
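The flat segments can be reproduced with a few lines of Python: raising the area bound changes the optimum only when a design with a lower PDP becomes feasible. The design points and the function name below are hypothetical and are not the measured values of this paper.

```python
def sweep_area_bound(designs, bounds):
    """For each area bound, return the lowest PDP among feasible designs."""
    curve = []
    for bound in bounds:
        feasible = [pdp for pdp, area in designs if area <= bound]
        curve.append(min(feasible) if feasible else None)
    return curve


# Hypothetical (PDP, area) design points in arbitrary units.
designs = [(64, 40), (60, 60), (56, 100), (48, 120)]
curve = sweep_area_bound(designs, range(40, 141, 20))
```

Here the bounds 60 and 80 give the same optimum, as do 120 and 140, because no better design fits inside the additional area.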



Fig. 5. Reduction of PDP by the heterogeneous adder in area-constrained optimization. (X-axis: area upper bound in # of NAND gates, step 25)

Figure 5 displays the reduction of PDP, that is, the ratio by which the PDP is reduced in area-constrained optimization when heterogeneous adders are used instead of conventional adders. In the interval $675 < \Theta_{AREA} \leq 1200$, up to 57% PDP reduction is achieved; in the interval $1200 \leq \Theta_{AREA} \leq 2125$, about 3% PDP reduction is obtained; and in the interval $2125 < \Theta_{AREA} \leq 2500$, up to 35% PDP reduction is achieved. These improvement figures are not absolute, since they depend on the areas, delays, and powers of the specific sub-adder implementations. The improvement would change if other circuit-level optimizations such as transistor sizing were applied to the sub-adder types, or if different design libraries were used to implement the sub-adder components.
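The reduction ratio can be computed as in the short Python sketch below. It assumes that the reduction is measured relative to the best conventional (single-type) adder under the same area bound, which is one plausible reading of the definition above; the function name and the sample values are hypothetical.

```python
def pdp_reduction_percent(conventional_pdp, heterogeneous_pdp):
    """Percentage by which the heterogeneous PDP undercuts the conventional one."""
    return 100.0 * (conventional_pdp - heterogeneous_pdp) / conventional_pdp


# Hypothetical PDP values in the same unit for one area bound.
reduction = pdp_reduction_percent(20.0, 15.0)
```

A value of 0 means that the heterogeneous adder offers no advantage at that area bound, for example when the optimum is itself a single-type adder.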

#### V. Conclusions

In this paper, we presented an ILP formulation for the PDP of the heterogeneous adder and provided experimental results of optimizing it. A technique that transforms a non-linear expression into a linear expression was adopted for the ILP-based PDP formulation of the heterogeneous adder. Without that transformation, the PDP of the heterogeneous adder cannot be modeled in ILP form because the original PDP formulation is non-linear.

The experimental results showed the optimized PDP values of the heterogeneous adders under area constraints. With the proposed methodology, the compromised design space of the heterogeneous adder can also be exploited for PDP optimization.

In future research, we plan to extend the proposed method to handle different input arrival times for each input bit of a binary adder.

#### Acknowledgement

This study was supported by a research fund from Chosun University, 2004.

#### References

- [1] Nêve, A., Schettler, H., Ludwig, T., and Flandre, D.: "Power-Delay Product Minimization in High-Performance 64-bit Carry-Select Adders," IEEE Transactions on VLSI Systems, Vol. 12, pp. 235–244, Mar. 2004.
- [2] Zhu, Y., Liu, J., Zhu, H., and Cheng, C. K.: "Timing-Power Optimization for Mixed-Radix Ling Adders by Integer Linear Programming," IEEE Transactions on VLSI Systems, Vol. 12, pp. 235–244, Mar. 2004.
- [3] Das, S. and Khatri, S. P.: "Generation of the Optimal Bit-Width Topology of the Fast Hybrid Adder in a Parallel Multiplier," Proc. of the 13th International Conference of Integrated Circuit Design and Technology, pp. 49–54, May 2007.
- [4] Kwak, S., Har, D., Lee, J., and Lee, J.: "Design of Heterogeneous Adders Based on Power-Delay Tradeoffs," Proc. of the 5th IEEE International Symposium of Embedded Computing, pp. 223–226, Beijing, China, Oct. 2008.
- [5] Lee, J., Lee, J., Kim, S., and Kim, K.: "Design of Mutated Adder and Its Optimization Using ILP Formulation," IEICE Transactions on Information and Systems, Vol. E88–D, No. 7, pp. 1506–1508, Jul. 2007.
- [6] http://lpsolve.sourceforge.net/5.0/index.htm
- [7] Nagendra, C., Irwin, M., and Owens, R.: "Power-Delay Characteristics of CMOS Adders," IEEE Transactions on VLSI Systems, Vol. 2, No. 3, pp. 377–381, Sept. 1994.
- [8] Nagendra, C., Irwin, M., and Owens, R.: "Area-time-power tradeoffs in parallel adders," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 43, pp. 689–702, Oct. 1996.
- [9] Williams, H.: "Model Building in Mathematical Programming," 4th Ed., John Wiley, 1999.
- [10] Synopsys Corporation: "Datasheet : ANAM 0.18 micron, 1.8 volt Optimum Silicon SC Library," Aug.

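### About the Authors

#### Sanghoon Kwak

- Feb. 2000: M.S. in Engineering, Gwangju Institute of Science and Technology
- Feb. 2009: Ph.D. in Engineering, Gwangju Institute of Science and Technology
- Aug. 2009 to present: Postdoctoral Researcher, School of Electrical Engineering, Seoul National University
- Research interests: VLSI/CAD, asynchronous circuit design methodology, multiprocessor SoC

#### Jeong-Gun Lee

- Feb. 2005: Ph.D. in Engineering, Gwangju Institute of Science and Technology
- May 2005 to Jun. 2007: Postdoctoral Researcher, Computer Laboratory, University of Cambridge
- Jul. 2007 to Feb. 2008: Research Professor, Department of Information and Communications, Gwangju Institute of Science and Technology
- Mar. 2008 to present: Assistant Professor, Department of Computer Engineering, Hallym University
- Research interests: embedded system design, asynchronous circuit design, computer architecture

#### Jeong-A Lee

- 1982: B.S. in Engineering, Seoul National University
- 1985: M.S. in Engineering, Indiana University
- 1990: Ph.D. in Engineering, University of California, Los Angeles (UCLA)
- 1990 to 1995: Assistant Professor, University of Houston, USA
- 1995 to present: Professor, Department of Computer Engineering, Chosun University
- Research interests: computer architecture, high-speed digital arithmetic units, special-purpose VLSI architectures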