# Efficient Test Data Compression and Low Power Scan Testing for System-On-a-Chip (SOC)

Jun-Mo Jung (jmjung@hycu.ac.kr)\*, Byoung-Soo Park (bpark@smu.ac.kr)\*\*

Department of Computer Science, Hanyang Cyber University\*; Department of Computer System Engineering, Sangmyung University\*\*

Submission No.: #050112-001. Received: January 12, 2005. Review completed: February 4, 2005. Corresponding author: Jun-Mo Jung, e-mail: jmjung@hycu.ac.kr

#### Summary

The test time and power consumption required while testing a System-On-a-Chip (SOC) have become very important as the number of IP cores in an SOC increases. This paper proposes a new algorithm that reduces scan-in power consumption and test data volume by using a modified scan latch reordering. Scan latch reordering is applied so as to minimize the Hamming distance within the scan vectors, and the don't care inputs in the scan vectors are assigned during the reordering to achieve low power and test data compression. Applied to the ISCAS 89 benchmark circuits, the method compressed the test data and achieved low power scan testing in all cases.

■ Keywords: | SoC Test | Low Power | Scan Test |

#### Abstract

Testing time and power consumption during the testing of a System-On-a-Chip (SOC) are becoming increasingly important as the number of IP cores in an SOC increases. We present a new algorithm that reduces the scan-in power and the test data volume using a modified scan latch reordering. We apply the scan latch reordering technique to minimize the Hamming distance in scan vectors. In addition, during scan latch reordering, the don't care inputs in the scan vectors are assigned for low power and high compression. Experimental results for the ISCAS 89 benchmark circuits show that reduced test data and low power scan testing are achieved in all cases.

■ Keywords: | SoC Test | Low Power | Scan Testing |

# I. Introduction

An SOC typically consists of user-defined logic and Intellectual Property (IP) cores, such as memory and processor cores. This core-based design greatly increases design productivity and shortens the time to market, but it also makes testing an increasingly challenging task. One of the challenges in testing an SOC is dealing with the large amount of test data that must be transferred between the tester and the chip.
The Input/Output (I/O) channel capacity, speed and accuracy, and data memory of automatic test equipment (ATE) are limited. Therefore, it is becoming increasingly difficult to apply the enormous volume of test data to the SOC. Test data stored in the tester is delivered to the relevant IP core through a demultiplexer in the SOC, and the test response must be observed at the SOC outputs. Testing time is determined by the data transmission speed and the transmission width between the tester memory and the SOC. If the test data volume of an IP core increases, the testing time increases accordingly.

One solution to this problem is to use built-in self-test (BIST), where on-chip hardware is used to test the cores. However, for pre-designed logic cores this is only practical if the core was originally designed to achieve high fault coverage with BIST. Since most IP cores that are currently available from core vendors are not "BISTable", considerable redesign is necessary to incorporate BIST.

Another method for reducing the test data volume is based on data compression techniques. The test set for an IP core is compressed to a much smaller test set, which is stored in ATE memory. A decoder on the chip decodes the compressed data to recover the original test data during test application. Test set compression for non-scan circuits [1] and for full-scan circuits [2] has been presented. The techniques in [1,2] were intended for sequential circuits with large numbers of flip-flops and relatively few primary inputs. The technique in [3] exploited the fact that successive test patterns in a test sequence often differ in only a small number of bits. Instead of compressing the original vectors, the "difference vector" sequence determined from the original vectors was compressed using run-length coding. This method used variable-to-fixed-length codes, which are less efficient than the more general variable-to-variable-length codes [4,5].
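
As a rough illustration of the difference-vector idea, the following Python sketch forms the difference vectors of a short pattern sequence and lists their runs of 0s, which are the quantities a run-length code encodes. The function names, the all-zero reference used for the first pattern, and the sample patterns are assumptions made for this sketch; they are not taken from [3].

```python
from typing import List


def difference_vectors(patterns: List[str]) -> List[str]:
    """XOR each pattern with its predecessor (assumed: all-zero vector before the first)."""
    previous = "0" * len(patterns[0])
    result = []
    for pattern in patterns:
        diff = "".join("1" if a != b else "0" for a, b in zip(previous, pattern))
        result.append(diff)
        previous = pattern
    return result


def zero_runs(bits: str) -> List[int]:
    """Lengths of the runs of 0s; each run ends at a 1, and a trailing run is kept too."""
    runs, run = [], 0
    for bit in bits:
        if bit == "0":
            run += 1
        else:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs


# Hypothetical test patterns that differ in only a few bits from one to the next.
patterns = ["10110", "10100", "00100"]
diffs = difference_vectors(patterns)
print(diffs)                           # difference vectors of the sequence
print([zero_runs(d) for d in diffs])   # run lengths a run-length coder would encode
```

Because consecutive patterns differ in few bits, the later difference vectors consist mostly of 0s, which is what makes run-length coding of the difference sequence attractive.
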
Another method, presented in [6], used Golomb codes that map variable-length runs of 0s in the difference vectors to variable-length codewords, which achieved greater compression. A new class of variable-to-variable-length compression codes, designed using the distribution of the runs of 0s in typical test sequences, was presented in [7]; these are the frequency-directed run-length (FDR) codes. It was shown that FDR codes outperform Golomb codes for test data compression.

Power consumption is another problem in SOC testing, since the power consumption of a digital system is considerably higher in test mode than in normal mode. The reason is that test patterns cause as many nodes as possible to switch, whereas normal mode activates only a few modules at the same time. Special care must be taken to ensure that the allowable power consumption is not exceeded during test application. A new ATPG tool [9] was proposed to overcome the low correlation between consecutive test vectors during test application. Methods for low power mapping and test compression of unspecified scan vectors were proposed in [6,10]. A compression and low power technique using Scan Latch Reordering (SLR) was also proposed [8,11]; in [8], the don't care inputs were mapped for low power and SLR was then performed.

This paper presents an efficient low power and high compression method for full-scan sequential circuits. In previous methods, the don't care inputs in unspecified scan vectors were assigned either to 0 (0 mapping) or to the adjacent input value (Minimum Transition Count mapping, MTC mapping) for low power and higher compression. After mapping, various techniques for compression and power reduction were applied to the mapped scan vectors. In this paper, in contrast, the power consumption during scan-in and the test data volume are reduced by assigning neighboring input values to the don't care inputs during scan latch reordering.
Also, the new scan latch reordering is applied to minimize the Hamming distance between adjacent scan cells.

The paper is organized as follows. Section II describes the power consumption model of digital CMOS circuits and previous work. Section III presents a new scan latch reordering that considers low power mapping. Section IV discusses the results for the ISCAS89 benchmark circuits, and Section V provides the conclusion.

# II. Power Consumption Model and Previous Work

#### 1. Power consumption model of digital CMOS circuits

Power dissipation in CMOS circuits can be classified into static, short-circuit, leakage and dynamic power dissipation. The static power is negligible. The short-circuit power dissipation, caused by short-circuit current, and the power dissipated by leakage currents contribute up to 20% of the total power dissipation. The remaining 80% is attributed to dynamic power dissipation caused by switching of the gate outputs. If the gate is part of a synchronous digital circuit controlled by a global clock, the dynamic power $P_d$ required to charge and discharge the output capacitance load of every gate is:

$$P_{d} = 0.5C_{load}V_{dd}^{2}F_{c}N_{g} \tag{1}$$

where $C_{load}$ is the load capacitance, $V_{dd}$ is the supply voltage, $F_c$ is the global clock frequency, and $N_g$ is the total number of gate output transitions (0->1 or 1->0). These transitions are the major factor in power dissipation. The power dissipation during full-scan testing is due to the dynamic power of the transitions that occur when the scan vectors are shifted into the scan chain.

#### 2. Scan-in power model

The power consumed in testing a sequential circuit with a single scan chain includes the scan-in power consumed during the scan-in of scan vectors, the scan-out power consumed during the scan-out of test responses, and the power consumed in the combinational logic of the sequential circuit.
It is difficult to estimate the scan-out power directly from the scan vector set, since the test response must be determined from the function of the core under test. Therefore, as in [12], we consider the scan-in power only and measure it in terms of the Weighted Transition Metric (WTM). The scan-in power of a scan vector depends not only on the number of transitions in it but also on their relative positions. For example, consider the scan vector $S_1S_2S_3S_4S_5=10101$ with scan length 5. If the leftmost bit $(S_1)$ is shifted into the scan chain first, the transition of $(S_1, S_2)$ causes four (scan length - 1) transitions during scan-in, and the transition of $(S_2, S_3)$ causes three (scan length - 2) transitions. In general, the transition of $(S_j, S_{j+1})$ causes (scan length - j) transitions. Let each scan vector SV with scan length k be $S_1S_2 \ldots S_k$. The scan-in power for SV is given by

$$P_{sv} = WTM(SV) = \sum_{j=1}^{k-1} (S_j \oplus S_{j+1})(k-j) \tag{2}$$

The term $S_j \oplus S_{j+1}$ is 1 if a transition occurs between $S_j$ and $S_{j+1}$. A transition at the jth bit position therefore causes $k-j$ transitions during scan-in into the scan chain. For the example vector 10101, all four adjacent pairs differ, so WTM = 4 + 3 + 2 + 1 = 10. Assuming that the set of scan vectors used for testing is $SV_{set} = \{SV_1, SV_2, \ldots, SV_n\}$, the power consumed during scan-in of $SV_{set}$ is

$$WTM(SV_{set}) = \sum_{i=1}^{n} WTM(SV_i) \tag{3}$$

The average and peak power consumption can therefore be expressed as in (4).

$$P_{avg} = \frac{WTM(SV_{set})}{n}, \qquad P_{peak} = \max_{1 \le i \le n} WTM(SV_i) \tag{4}$$

#### 3. Previous work

For unspecified scan vectors, the don't care inputs in the scan vectors must be assigned values. In [6], the logic value 0 was assigned to the don't care inputs for higher compression, and the 0-mapped scan vectors were compressed by Golomb codes based on runs of 0s. A low power mapping method for reducing scan-in power was proposed in [10], in which each don't care input is mapped to the adjacent input value to reduce the number of transitions.
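
The weighted transition metric of (2) to (4) and the two don't care mappings described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, vectors are strings over 0, 1 and X with the leftmost bit shifted in first, and the MTC-style mapping is assumed to copy the nearest specified bit to the left of each X (or the first specified bit for leading X's), which is one reasonable reading of "adjacent input value".

```python
from typing import List, Tuple


def wtm(vector: str) -> int:
    """Weighted transition metric of Eq. (2) for a fully specified vector."""
    k = len(vector)
    # A transition between S_j and S_{j+1} (1-based j) is weighted by k - j.
    return sum(k - j for j in range(1, k) if vector[j - 1] != vector[j])


def zero_map(vector: str) -> str:
    """0 mapping: every don't care (X) becomes 0."""
    return vector.replace("X", "0")


def mtc_map(vector: str) -> str:
    """MTC-style mapping (assumed rule): each X copies the nearest specified bit to its left."""
    last = next((b for b in vector if b != "X"), "0")  # value used for leading X's
    mapped = []
    for b in vector:
        if b != "X":
            last = b
        mapped.append(last)
    return "".join(mapped)


def average_and_peak(vectors: List[str]) -> Tuple[float, int]:
    """P_avg and P_peak of Eq. (4) for a set of fully specified vectors."""
    values = [wtm(v) for v in vectors]
    return sum(values) / len(values), max(values)


print(wtm("10101"))                      # 4 + 3 + 2 + 1 = 10 for the example vector
sample = "1XX0X1"                        # hypothetical partially specified vector
print(zero_map(sample), wtm(zero_map(sample)))
print(mtc_map(sample), wtm(mtc_map(sample)))
print(average_and_peak(["10101", "11100"]))
```

For this sample vector the MTC-style mapping gives a WTM of 4 against 6 for 0 mapping, which is the kind of scan-in power reduction the low power mapping aims for.
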
However, with the low power mapping of [10] the compression ratio fell remarkably, because the encoding exploits only runs of 0s. The low power technique proposed in [11] reorders the positions of the scan vector inputs (scan latch reordering). In [8], low power and compression were improved by additionally applying scan latch reordering to scan vectors that had already been mapped for low power. This paper presents a new algorithm that does not apply scan latch reordering to scan vectors already mapped for low power, but performs the low power mapping and the scan latch reordering simultaneously. A new method is also proposed that reorders the scan latches using the Hamming distance as the cost function.

# III. Scan Latch Reordering Considering Low Power Mapping

As explained in the previous section, the methods proposed so far first assign binary logic values to the don't care inputs within each scan vector and then apply low power and compression techniques. This paper instead proposes not to assign values to the don't care inputs in advance, but to assign them while applying scan latch reordering. Scan latch reordering for low power reorders the positions of the inputs of the scan vectors. Therefore, low power and a high compression ratio can be obtained if the don't care inputs are assigned during scan latch reordering, so as to keep the number of transitions small, rather than in advance.

#### 1. The cost function for scan latch reordering

Let $SV_{set}$ be represented as a two-dimensional array SV[r][c], where r is the number of scan vectors and c is the number of inputs of each scan vector. Each element SV[i][j] is the jth scan input of the ith scan vector. The WTM of $SV_{set}$ is given by (5).
$$
\begin{aligned}
WTM(SV_{set}) &= \sum_{i=1}^{r} \sum_{j=1}^{c-1} (SV[i][j] \oplus SV[i][j+1])(c-j) \\
&= \sum_{j=1}^{c-1} \sum_{i=1}^{r} (SV[i][j] \oplus SV[i][j+1])(c-j) \\
&= \sum_{j=1}^{c-1} HD_{col}(j,j+1)(c-j)
\end{aligned} \tag{5}
$$

In (5), $HD_{col}(j,j+1)$ is defined as the Hamming distance between the jth column and the (j+1)th column over all rows. For example, consider the $SV_{set}$ in Fig. 1, which has four scan vectors of three bits each (scan chain length = 3).

$$SV_{set} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$$

Figure 1. Example for HD\_col

In this example, the column Hamming distances are $HD_{col}(1,2) = 3$, $HD_{col}(1,3) = 1$ and $HD_{col}(2,3) = 4$. For $WTM(SV_{set})$ to be small, the $HD_{col}$ values of adjacent columns must be small. Scan latch reordering reorders the column positions in the scan vectors. The column Hamming distance is therefore a suitable cost function for scan latch reordering.

#### 2. Proposed Algorithm

#### 2.1 Calculation of Hamming Distance

$HD_{col}(j,j+1)$ is the Hamming distance between columns j and j+1, calculated taking the don't care inputs into account. Given $SV[i][j] \in \{0, 1, X\}$, where X is a don't care input, the Hamming distance between SV[i][j] and SV[i][j+1] is 1 only if (SV[i][j], SV[i][j+1]) is (0,1) or (1,0). In the cases (0,X) and (1,X), the Hamming distance is 0 because the value of SV[i][j] can be assigned to the X in SV[i][j+1].

Because scan latch reordering is an NP-hard problem, it is very difficult to find an optimal solution. We therefore propose the following heuristic algorithm.

- Define SV[\*][j] as the jth column of the two-dimensional array of scan vectors.
- Choose an arbitrary column SV[\*][j] as the initial SV[\*][1], and assign 0 to all X within SV[\*][1].
- Calculate the Hamming distance between SV[\*][1] and every other column SV[\*][j], choose the j that minimizes it, and exchange SV[\*][2] with SV[\*][j]. The number of transitions between SV[\*][1] and SV[\*][2] is then minimal.
- Assign the value of SV[\*][1] in the same row to each X within SV[\*][2], so that no transition occurs at these positions.

In this way, reordering is carried out over all SV[\*][j] so as to minimize the Hamming distance between adjacent columns. The proposed algorithm is described by the pseudo code in Figure 2.

```
SV_set = SV[r][c];    /* two-dimensional representation */
Initialize SV[*][1];  /* choose an arbitrary column as SV[*][1], assign 0 to its X */
for (j = 1; j < c; j++) {
    for (k = j + 1; k < c + 1; k++)
        HD_col(j, k);                          /* calculate HD_sum(k) */
    Search the index k with minimum HD_sum(k);
    Exchange column j+1 with column k;
    Assign SV[i][j] to every X in SV[i][j+1];  /* don't care mapping during reordering */
}

HD_col(j, k) {
    HD_sum(k) = 0;
    for (i = 1; i < r + 1; i++)
        /* only (0,1) and (1,0) count; X matches either value */
        if ((SV[i][j] == 0 && SV[i][k] == 1) ||
            (SV[i][j] == 1 && SV[i][k] == 0))
            HD_sum(k)++;
}
```

Figure 2. Proposed Algorithm

# IV. Experimental Results

In this section, we evaluate the effect of the proposed method on test data volume and power consumption during scan testing for the ISCAS89 benchmark circuits. The experiments were conducted on a Sun Ultra 10 workstation. We considered full-scan sequential circuits and assumed a single scan chain for each circuit. We used partially specified scan vector sets generated by the MINTEST Automatic Test Pattern Generation (ATPG) program with dynamic compaction [13].

Table 1 presents the experimental results on scan vector compression. Under the column "Compression Ratios", the sub-column "Golomb coding" shows the compression ratios obtained by Golomb encoding after mapping all don't care inputs to 0 [6]. "MTC Map & SLR" is the method proposed in [8]. The compression ratio obtained was computed as follows:

Table 1.
Experimental Results on Test Data Compression (compression ratios in %)

| Circuits | Golomb coding (0 Mapping) | MTC Map & SLR | Proposed |
|----------|---------------------------|---------------|----------|
| S5378    | 37.13 | 46.10 | 57.02 |
| S9234    | 45.27 | 47.20 | 63.24 |
| S13207   | 79.75 | 81.07 | 90.10 |
| S15850   | 62.83 | 64.59 | 81.12 |
| S38417   | 28.38 | 58.56 | 81.21 |
| S38584   | 57.17 | 63.41 | 76.32 |
| Average  | 51.76 | 60.16 | 74.83 |

As shown in Table 1, for s5378 the compression ratio is about 37.13% with Golomb coding, about 46% with MTC Map & SLR, and about 57% with the proposed method. On average, the compression ratio is about 51.76% with Golomb coding, about 60.16% with MTC Map & SLR, and about 74.83% with the proposed method. Our method thus improves the average compression ratio by about 23 percentage points over Golomb coding and by about 15 percentage points over MTC Map & SLR.

Table 2 shows the reduction ratios for peak power. For s5378, the reduction ratio of our method is about 47 percentage points higher than that of 0 Mapping and about 13 percentage points higher than that of MTC Map & SLR. The average reduction ratio is about 21% for 0 Mapping, about 66% for MTC Map & SLR, and about 81.68% for our method. The experimental results show that the proposed method achieves a better reduction ratio than the previous methods.

Table 2. Reduction Ratios for Peak Power

| Circuit | 0 Mapping (Peak) | 0 Mapping (%) | MTC Map & SLR (Peak) | MTC Map & SLR (%) | Proposed (Peak) | Proposed (%) |
|---------|--------|-------|--------|-------|--------|-------|
| S5378   | 10127  | 24.55 | 5556   | 58.61 | 3849   | 71.3  |
| S9234   | 12994  | 25.72 | 7400   | 57.70 | 5331   | 69.5  |
| S13207  | 101127 | 25.42 | 35486  | 73.83 | 12585  | 90.7  |
| S15850  | 81832  | 18.35 | 33207  | 66.87 | 13479  | 86.6  |
| S38417  | 505295 | 26.10 | 181436 | 73.46 | 54034  | 92.1  |
| S38584  | 531321 | 7.21  | 187379 | 67.28 | 111466 | 80.1  |
| Average |        | 21.23 |        | 66.29 |        | 81.68 |

In Table 3, we report the reduction ratios for average power.

Table 3. Reduction Ratios for Average Power

| Circuit | 0 Mapping (Avg) | 0 Mapping (%) | MTC Map & SLR (Avg) | MTC Map & SLR (%) | Proposed (Avg) | Proposed (%) |
|---------|--------|-------|--------|-------|-------|-------|
| S5378   | 3336   | 68.89 | 2435   | 82.89 | 860   | 92.2  |
| S9234   | 5692   | 61.09 | 3466   | 82.81 | 1021  | 93.0  |
| S13207  | 12416  | 89.82 | 7703   | 95.75 | 766   | 99.3  |
| S15850  | 20742  | 77.18 | 13381  | 90.16 | 2162  | 97.7  |
| S38417  | 172665 | 71.31 | 112198 | 85.72 | 15960 | 97.3  |
| S38584  | 136634 | 74.50 | 88298  | 90.77 | 18235 | 96.6  |
| Average |        | 73.79 |        | 88.02 |       | 96.02 |

For s13207, the reduction ratio is about 89.8% with 0 Mapping and about 96% with MTC Map & SLR, whereas the proposed method achieves 99.3%. On average, the power reduction ratio is about 74% for 0 Mapping, about 88% for MTC Map & SLR, and about 96% for the proposed method. As these experimental results show, the proposed method achieves higher compression ratios and power reduction ratios.

# V. Conclusion

As the number of IP cores in an SOC increases, test data volume and testing time also increase. As a result, testing and chip costs go up and productivity goes down.
Also, power consumption in test mode is much larger than in normal mode, and excessive power consumption can damage the chip. This paper proposes a new algorithm that achieves low power and a high compression ratio for unspecified scan vectors. It maps the don't care inputs at the same time as it performs scan latch reordering, using the Hamming distance as the cost function. On the ISCAS89 benchmark circuits, the proposed method achieves an average compression ratio of 74.83% and an average-power reduction ratio of 96.02%, both higher than those of the previous methods.

### References

- [1] V. Iyengar, K. Chakrabarty, and B.T. Murray, "Built-in self testing of sequential circuits using precomputed test sets," Proc. IEEE VLSI Test Symp., pp. 418-423, 1998.
- [2] V. Iyengar, K. Chakrabarty, and B.T. Murray, "Deterministic built-in pattern generation for sequential circuits," J. Electron. Testing: Theory Applicat., Vol.15, pp. 97-115, 1999.
- [3] A. Jas and N.A. Touba, "Test vector decompression via cyclical scan chains and its application to testing core-based design," Proc. Int. Test Conf., pp. 458-464, 1998.
- [4] S.W. Golomb, "Run-length encoding," IEEE Transactions on Information Theory, Vol. IT-12, pp. 399-401, 1966.
- [5] H. Kobayashi and L.R. Bahl, "Image data compression by predictive coding, Part I: Prediction algorithm," IBM Journal of Research & Development, Vol.18, pp. 164, 1974.
- [6] A. Chandra and K. Chakrabarty, "System-on-a-chip test data compression and decompression architectures based on Golomb codes," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol.20, No.3, pp. 355-368, 2001.
- [7] A. Chandra and K. Chakrabarty, "Frequency-directed run-length (FDR) codes with application to system-on-a-chip test data compression," Proc. IEEE VLSI Test Symposium, pp. 42-47, 2001.
- [8] P. Rosinger, P.T. Gonciari, B.M. Al-Hashimi and N. Nicolici, "Simultaneous reduction in volume of test data and power dissipation for systems-on-a-chip," Electronics Letters, Vol.37, No.24, pp. 1434-1436, 2001.
- [9] S. Wang and S.K. Gupta, "ATPG for heat dissipation minimization during test application," IEEE Transactions on Computers, pp. 256-262, 1998.
- [10] A. Chandra and K. Chakrabarty, "Combining low-power scan testing and test data compression for system-on-a-chip," Proc. IEEE/ACM Design Automation Conference (DAC), pp. 166-169, 2001.
- [11] V. Dabholkar, S. Chakravarty, I. Pomeranz and S.M. Reddy, "Techniques for minimizing power dissipation in scan and combinational circuits during test application," IEEE Trans. Comput.-Aided Des., pp. 1325-1333, 1998.
- [12] R. Sankaralingam, R.R. Oruganti and N.A. Touba, "Static compaction techniques to control scan vector power dissipation," Proc. VTS, pp. 35-40, 2000.
- [13] I. Hamzaoglu and J.H. Patel, "Test set compaction algorithms for combinational circuits," Proc. Int. Conf. Computer-Aided Design, pp. 283-289, 1998.

#### About the Authors

#### Jun-Mo Jung

Regular Member

- 1987: B.S., Electronic Engineering, Hanyang University
- 1989: M.S., Electronic Engineering, Hanyang University
- 2004: Ph.D., Electronic Engineering, Hanyang University
- 1989-1996: ASIC Center, Samsung Electronics
- 1996-2004: Assistant Professor, Division of Electronics and Information, Kimpo College
- 2004-present: Assistant Professor, Department of Computer Science, Hanyang Cyber University

Research interests: VLSI design, SoC Test, Low Power Design

#### Byoung-Soo Park

Regular Member

- 1986: B.S., Electronic Engineering, Hanyang University
- 1989: M.S., Electronic Engineering, Hanyang University
- 1994: Ph.D., Texas A&M University
- 1995-present: Professor, Department of Computer System Engineering, Sangmyung University

Research interests: embedded systems, optimization algorithms