# A SSN-Reduced 5Gb/s Parallel Transmitter

Seon-Kyoo Lee, Young-Sang Kim, Hong-June Park, and Jae-Yoon Sim

Abstract—A current-balancing segmented group-inverting transmitter is presented for multi-Gb/s single-ended parallel links. With four additional pins, 16-bit data is encoded onto 20 pins to balance the driving current and eliminate the simultaneous switching noise. Since the proposed coding is a simple inversion-or-not transformation of pre-defined groups of binary data, it can be implemented with simple logic circuits. The transmitter is designed in a 0.18µm CMOS technology, and simulated eye diagrams at 5Gb/s show dramatic improvements in signal integrity.

Index Terms—Simultaneous switching noise, inversion coding, parallel link, memory interface

#### I. Introduction

The performance of a digital system is determined by the data rate of inter-chip communication as well as by the on-chip operating speed. The rapid increase in on-chip operating frequency has driven extensive research on circuit and packaging solutions for high-speed off-chip interfaces. Serial links have achieved data rates of over 10 Gb/s using differential signaling through a well-defined channel. For parallel links, however, single-ended signaling is still essential for low-cost PCB solutions. To achieve higher throughput in parallel links, parallelism has been widely adopted to alleviate circuit complexity and tightened timing constraints. The demand for more parallelism, however, presents challenges that must be overcome.

One of the most serious factors limiting the performance of single-ended parallel links such as memory interfaces is simultaneous switching noise[1-3] on the internal power for output drivers, as shown in Fig. 1. Since the allowable number of power pins is limited, the net inductance cannot be reduced sufficiently. To reduce the simultaneous switching noise, bus inversion coding schemes have been proposed and analyzed[4-8].
In the bus inversion coding schemes, one extra pin is allocated for a flag that indicates the status of the inversion. If more than half of the bits would make a transition, the transmitter inverts all the parallel data and sets the flag to 1. The number of bit transitions is thus kept to less than half, which reduces the simultaneous switching noise. As parallelism increases for higher performance, however, reducing the switching noise by half is not sufficient. Therefore the performance of a parallel link cannot be further improved even with the inversion coding schemes.

Fig. 1. Conventional parallel link transceiver system.

Manuscript received Nov. 3, 2007; revised Nov. 30, 2007. Department of Electrical Engineering, Pohang University of Science and Technology. E-mail: leesk@postech.ac.kr

In gigabit parallel links, parallel termination is used[9,10], and the total driving current is determined by the number of ZEROs and ONEs, not by the number of bit transitions. The switching noise can therefore be reduced more efficiently by current balancing than by reducing the number of transitions. Recently, a segmented group-inversion coding[11] was proposed to achieve current balancing with a minimal increase in the number of pins. The difference between the number of ZEROs and ONEs after encoding is only 0 or 2. Since the encoding is a group-based inversion-or-not transformation, it can be implemented by simple logic circuits.

This paper presents a 16-bit transmitter design example based on the segmented group-inversion coding[11]. Section II describes the coding. The circuit implementation is shown in Section III. Section IV shows simulation results for the designed transmitter, and Section V concludes this work.

#### II. SEGMENTED GROUP-INVERSION CODING

Fig. 2 shows the segmented group-inversion transmission system for 16-bit-to-20-bit encoding.
The 16 bits are partitioned into five groups, $G_1$, $G_2$, $G_3$, $G_4$, and $G_5$, containing 2, 2, 4, 4, and 4 data bits, respectively. Except for $G_5$, a flag bit is inserted in each group. $f_1$, $f_2$, $f_3$, and $f_4$ are the flag bits for $G_1$, $G_2$, $G_3$, and $G_4$, respectively. All the flag bits are initialized to 0. The encoding is a group-based procedure that either inverts or does not invert all the bits in a group, and the flag indicates the inversion status of the group.

Fig. 2. Proposed parallel link transceiver system.

The encoding is performed step by step from $G_5$ down to $G_1$.

- 1) $G_5$: Transmit the original data without encoding.
- 2) $G_4$: Apply the inversion-or-not encoding, choosing the option whose accumulated disparity with $G_5$ is not larger than that of the other option. If $G_5$ has excess ZEROs, $G_4$ is encoded to contain excess ONEs, and vice versa.
- 3) Apply the inversion encoding to the other groups, from $G_3$ down to $G_1$, according to the rule described in step 2.

Fig. 3 shows all the cases of the encoding procedure.
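The three steps above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the authors' circuit: the function names, the bit ordering inside the 16-bit word, and the placement of each flag directly after its group are hypothetical choices, and disparity is expressed as a signed number (ONEs minus ZEROs) for convenience. A group is inverted only when its disparity, with the flag counted as 0, has the same sign as the accumulated disparity, so a tie (accumulated disparity of 0) leaves the group unchanged.

```python
from typing import List

# Data bits per group, listed in encoding order G5, G4, G3, G2, G1.
GROUP_SIZES = [4, 4, 4, 2, 2]


def disparity(bits: List[int]) -> int:
    """Signed disparity: number of ONEs minus number of ZEROs."""
    ones = sum(bits)
    return ones - (len(bits) - ones)


def encode(data: List[int]) -> List[int]:
    """Encode 16 data bits into 20 bits (assumed order: G5, G4+f4, G3+f3, G2+f2, G1+f1)."""
    if len(data) != 16:
        raise ValueError("expected 16 data bits")
    out: List[int] = []
    acc = 0  # accumulated disparity of everything encoded so far
    pos = 0
    for index, size in enumerate(GROUP_SIZES):
        group = list(data[pos:pos + size])
        pos += size
        if index == 0:
            coded = group  # step 1: G5 is sent as-is, without a flag
        else:
            candidate = group + [0]  # flag initialised to 0 and counted
            if acc * disparity(candidate) > 0:
                # Same majority as the accumulated bits: invert the group.
                # Inverting the appended 0 sets the flag to 1.
                coded = [1 - b for b in candidate]
            else:
                coded = candidate
        acc += disparity(coded)
        out.extend(coded)
    return out
```

For example, under these assumptions an all-ZERO input word is encoded into 11 ONEs and 9 ZEROs, i.e. a final difference of 2.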
To represent data in terms of disparity, a notation (a, b) is defined: a denotes the majority bit, and b is the number by which the majority bit exceeds the other. For example, (0, 3) means that there are three excess ZEROs.

(a) Encoding of $G_4$

| $G_5$ | Before encoding $n_4$ | Before encoding $f_4$ | After encoding $n_4$ | After encoding $f_4$ | Accumulated difference (5 cases) |
|---|---|---|---|---|---|
| 0000<br>(0, 4) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 1111<br>1110<br>1100<br>0111<br>1111 | 1<br>1<br>1<br>0<br>0 | (1, 1)<br>(0, 1)<br>(0, 3)<br>(0, 3)<br>(0, 1) |
| 0001<br>(0, 2) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 1111<br>1110<br>1100<br>0111<br>1111 | 1<br>1<br>1<br>0<br>0 | (1, 3)<br>(1, 1)<br>(0, 1)<br>(0, 1)<br>(1, 1) |
| 0011<br>(0, 0) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 0000<br>0001<br>0011<br>0111<br>1111 | 0<br>0<br>0<br>0<br>0 | (0, 5)<br>(0, 3)<br>(0, 1)<br>(1, 1)<br>(1, 3) |
| 0111<br>(1, 2) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 0000<br>0001<br>0011<br>1000<br>0000 | 0<br>0<br>0<br>1<br>1 | (0, 3)<br>(0, 1)<br>(1, 1)<br>(1, 1)<br>(0, 1) |
| 1111<br>(1, 4) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 0000<br>0001<br>0011<br>1000<br>0000 | 0<br>0<br>0<br>1<br>1 | (0, 1)<br>(1, 1)<br>(1, 3)<br>(1, 3)<br>(1, 1) |

(b) Encoding of $G_3$

| $G_5$, $G_4$ | Before encoding $n_3$ | Before encoding $f_3$ | After encoding $n_3$ | After encoding $f_3$ | Accumulated difference (5 cases) |
|---|---|---|---|---|---|
| (0, 5) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 1111<br>1110<br>1100<br>0111<br>1111 | 1<br>1<br>1<br>0<br>0 | (0, 0)<br>(0, 2)<br>(0, 4)<br>(0, 4)<br>(0, 2) |
| (0, 3) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 1111<br>1110<br>1100<br>0111<br>1111 | 1<br>1<br>1<br>0<br>0 | (1, 2)<br>(0, 0)<br>(0, 2)<br>(0, 2)<br>(0, 0) |
| (0, 1) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 1111<br>1110<br>1100<br>0111<br>1111 | 1<br>1<br>1<br>0<br>0 | (1, 4)<br>(1, 2)<br>(0, 0)<br>(0, 0)<br>(1, 2) |
| (1, 1) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 0000<br>0001<br>0011<br>1000<br>0000 | 0<br>0<br>0<br>1<br>1 | (0, 4)<br>(0, 2)<br>(0, 0)<br>(0, 0)<br>(0, 2) |
| (1, 3) | 0000<br>0001<br>0011<br>0111<br>1111 | 0 | 0000<br>0001<br>0011<br>1000<br>0000 | 0<br>0<br>0<br>1<br>1 | (0, 2)<br>(0, 0)<br>(1, 2)<br>(1, 2)<br>(0, 0) |

(c) Encoding of $G_2$

| $G_5$ - $G_3$ | Before encoding $n_2$ | Before encoding $f_2$ | After encoding $n_2$ | After encoding $f_2$ | Accumulated difference (4 cases) |
|---|---|---|---|---|---|
| (0, 4) | 00<br>01<br>11 | 0 | 11<br>10<br>11 | 1<br>1<br>0 | (0, 1)<br>(0, 3)<br>(0, 3) |
| (0, 2) | 00<br>01<br>11 | 0 | 11<br>10<br>11 | 1<br>1<br>0 | (1, 1)<br>(0, 1)<br>(0, 1) |
| (0, 0) | 00<br>01<br>11 | 0 | 00<br>01<br>11 | 0<br>0<br>0 | (0, 3)<br>(0, 1)<br>(1, 1) |
| (1, 2) | 00<br>01<br>11 | 0 | 00<br>01<br>00 | 0<br>0<br>1 | (0, 1)<br>(1, 1)<br>(1, 1) |
| (1, 4) | 00<br>01<br>11 | 0 | 00<br>01<br>00 | 0<br>0<br>1 | (1, 1)<br>(1, 3)<br>(1, 3) |

(d) Encoding of $G_1$

| $G_5$ - $G_2$ | Before encoding $n_1$ | Before encoding $f_1$ | After encoding $n_1$ | After encoding $f_1$ | Accumulated difference (3 cases) |
|---|---|---|---|---|---|
| (0, 3) | 00<br>01<br>11 | 0 | 11<br>10<br>11 | 1<br>1<br>0 | (0, 0)<br>(0, 2)<br>(0, 2) |
| (0, 1) | 00<br>01<br>11 | 0 | 11<br>10<br>11 | 1<br>1<br>0 | (1, 2)<br>(0, 0)<br>(0, 0) |
| (1, 1) | 00<br>01<br>11 | 0 | 00<br>01<br>00 | 0<br>0<br>1 | (0, 2)<br>(0, 0)<br>(0, 0) |
| (1, 3) | 00<br>01<br>11 | 0 | 00<br>01<br>00 | 0<br>0<br>1 | (0, 0)<br>(1, 2)<br>(1, 2) |

**Fig. 3.** Encoding procedure: encoding of $G_4$ (a), $G_3$ (b), $G_2$ (c), and $G_1$ (d).
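The (a, b) notation defined above can be computed mechanically. The helper below is a minimal sketch with a hypothetical function name; it assumes that a tie is written as (0, 0), which is how Fig. 3 denotes zero disparity.

```python
from typing import List, Tuple


def notation(bits: List[int]) -> Tuple[int, int]:
    """Return (a, b): a is the majority bit, b is the excess of the majority bit."""
    ones = sum(bits)
    zeros = len(bits) - ones
    if ones > zeros:
        return (1, ones - zeros)
    # Excess ZEROs, or a tie, which is written as (0, 0).
    return (0, zeros - ones)


# G4 data 0001 with its flag still at 0: four ZEROs against one ONE.
print(notation([0, 0, 0, 1, 0]))

# G5 = 0000 followed by the encoded G4 = 1110 with f4 = 1,
# one of the cases listed in Fig. 3(a).
print(notation([0, 0, 0, 0, 1, 1, 1, 0, 1]))
```

The first call reproduces the (0, 3) example from the text, and the second gives the accumulated difference for one entry of Fig. 3(a).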
Since only the disparity is of interest for the inversion-or-not decision, the order of ZEROs and ONEs is neglected. Encoding is performed in the order shown in Fig. 3(a), (b), (c), and (d). Completing the encoding reduces the difference between the number of ZEROs and ONEs to only 0 or 2, as shown in Fig. 3(d). Decoding is straightforward and can be implemented with only one XOR gate for each bit.

Fig. 4 shows the block diagram of the encoder for the 16-bit case described above. $D_{15}$ – $D_0$ are the 16-bit input data and $E_{15}$ – $E_0$ are the encoded outputs, accompanied by four flag bits, $f_1$ – $f_4$. Classification blocks ($C_1$ – $C_5$) compute the disparity in each block before the actual encoding. In $C_1$ – $C_4$, the initial value of the flag (=0) is counted in the disparity computation. As shown in Fig. 3(a), $G_5$ can be classified into one of the five cases of even disparities, $\{(0, 4), (0, 2), (0, 0), (1, 2), (1, 4)\}$. $G_3$ and $G_4$ are initially classified into one of the five cases of odd disparities, $\{(0, 5), (0, 3), (0, 1), (1, 1), (1, 3)\}$, before encoding. As 5 is less than $8(=2^3)$, the disparity information of $G_3$, $G_4$, and $G_5$ can be translated to a 3-bit binary word for simpler circuit implementation. For $G_1$ and $G_2$, since they belong to one of the three cases of odd disparities, $\{(0, 3), (0, 1), (1, 1)\}$, a 2-bit translation can be used. One convenient translation, used in this work, assigns the MSB to indicate the majority bit, i.e. 0 for excess ZEROs and 1 for excess ONEs. The other bits encode the magnitude of the difference.

Fig. 4. Block diagram of encoder.

$F_1$ – $F_4$ are flag-computing blocks. The flag is set to 1 only if the majority bits of two inputs are different. $F_2$ – $F_4$ also accumulate disparity information as shown in Fig. 3. The encoding block (E) consists only of XOR gates and DFFs, which perform the actual inversion according to the flag.
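Because each group is either sent as-is or fully inverted, the receiver only needs to XOR every data bit with the flag of its group. The sketch below illustrates that one-XOR-per-bit decoding in software; the function name and the assumed pin order ($G_5$ first, then each remaining group followed directly by its flag) are hypothetical and chosen only for illustration.

```python
from typing import List

# Data bits per group, listed in the assumed transmission order G5, G4, G3, G2, G1.
GROUP_SIZES = [4, 4, 4, 2, 2]


def decode(code: List[int]) -> List[int]:
    """Recover 16 data bits from 20 encoded bits (assumed order: G5, G4+f4, ..., G1+f1)."""
    if len(code) != 20:
        raise ValueError("expected 20 encoded bits")
    data = list(code[:GROUP_SIZES[0]])  # G5 carries no flag
    pos = GROUP_SIZES[0]
    for size in GROUP_SIZES[1:]:
        flag = code[pos + size]  # flag assumed to follow its group
        # One XOR per bit: a flag of 1 undoes the inversion, 0 passes the bit.
        data.extend(bit ^ flag for bit in code[pos:pos + size])
        pos += size + 1
    return data
```

With this layout, a code word whose flags are all 0 decodes to its own data bits, and a group received with its flag set is restored by inverting it once more.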
Since the proposed coding concerns only the difference, without consideration of run length, it greatly simplifies the circuit implementation compared with the conventional 8B10B coding schemes[12] generally used in serial links for time-domain DC balancing.

#### III. CIRCUIT DESCRIPTION

A transmitter is designed in a 0.18µm CMOS technology. Fig. 5 shows the overall circuit diagram of the transmitter. A 4-way time-interleaved architecture is adopted for high-speed transmission. 4-to-1 serializing multiplexers provide inputs to the output buffers. An open-drain current-mode output driver is used with an on-chip termination resistor for parallel termination. The voltage swing is designed to be 0.5V on a 50 $\Omega$-terminated line, so one output driver consumes 20mA when the output is driven low.

Fig. 5. Circuit diagram of transmitter.

Although the final difference after the encoding is 0 or 2, ideal balancing of the pull-down current can be achieved by flowing additional dummy current through two sets of replica multiplexers and output drivers. Then the number of ONEs in the 22 multiplexer outputs is always 11. Since the outputs of the 22 multiplexers are pseudo-differential, node A becomes a virtual ground when used with a shared pull-down current source of 11×20mA. The voltage swing at the input of the output driver does not have to be rail-to-rail, because the virtual ground level is set automatically so that sufficient switching is performed even with a reduced input swing, which is additionally advantageous for high-speed operation. The shared pull-down current scheme is also used for the outputs of the 22 multiplexers.

#### IV. SIMULATION RESULTS

To model a severe noise environment, simulations were performed assuming that all 20 output drivers share only two VDD pins and two VSS pins. The net inductance of the power line was then 3nH. For the data pin parasitics, a lumped model with 2.5pF and 6nH loading was used. Fig. 6 shows simulated eye diagrams at the far-end node of a 10-cm transmission line at 5 Gb/s, without and with encoding. As shown in the figure, the encoded transmitter dramatically improves signal integrity. The maximum data rate of 5Gb/s is set by the speed limitation of the digital circuits.

Fig. 6. Simulated eye diagram. (a) conventional, (b) proposed.

Fig. 7 shows the simulated voltage and current transients of the internal power for the output drivers at a data rate of 5 Gb/s. The proposed coding scheme regulates the driving current through current balancing and reduces the fluctuations of the internal supply voltage to a negligible level. The conventional binary signaling, however, suffers from fluctuations of as much as $\pm$200mV with a strong dependency on the data pattern.

Fig. 7. Simulated voltage and current transients at 5 Gb/s. (a) internal power for output drivers, (b) driving current.

Fig. 8 shows the summary of simulated eye openings for different data rates. The improvement in signal integrity provided by the proposed transmitter grows as the data rate increases, owing to the elimination of the simultaneous switching noise. At a supply voltage of 1.8V, excluding the PRBS, the total power consumption is 880mW at 5Gb/s, or 55mW/channel when translated equivalently to 16 channels. The encoder consumes approximately 30% of the total power.

Fig. 8. Summary of horizontal (a) and vertical (b) eye openings for different data rates.

#### V. Conclusions

A 5 Gb/s 16-to-20 bit encoded transmitter was designed in a 0.18µm CMOS technology. To minimize the simultaneous switching noise, an efficient current-balancing segmented group-inversion coding is adopted. With four additional pins, the proposed coding limits the difference between the number of ZEROs and ONEs to only 0 or 2. Since the encoding is a group-based inversion-or-not transformation, it can be implemented by simple logic circuits.
Simulation shows dramatic improvements in signal integrity, and the proposed coding scheme is suitable for low-cost gigabit parallel links such as memory or processor-to-processor interfaces.

#### ACKNOWLEDGMENTS

This work was supported by the IDEC, BK21, and IT-SoC program.

#### REFERENCES

- [1] S. Jou, S. Kuo, J. Chiu, and T. Lin, "Low switching noise and load-adaptive output buffer design techniques," *IEEE J. Solid-State Circuits*, vol. 36, pp. 1239-1249, Aug. 2001.
- [2] R. Senthinathan and J. L. Prince, "Application specific CMOS output driver circuit design techniques to reduce simultaneous switching noise," *IEEE J. Solid-State Circuits*, vol. 28, pp. 1383-1388, Dec. 1993.
- [3] C. S. Choy, M. H. Ku, and C. F. Chan, "A low power-noise output driver with an adaptive characteristic applicable to a wide range of loading conditions," *IEEE J. Solid-State Circuits*, vol. 32, pp. 913-917, Jun. 1997.
- [4] M. Stan and W. Burleson, "Bus-invert coding for low-power I/O," *IEEE Trans. VLSI Systems*, vol. 3, pp. 49-58, Mar. 1995.
- [5] M. Stan and W. Burleson, "Low-power encodings for global communication in CMOS VLSI," *IEEE Trans. VLSI Systems*, vol. 5, pp. 49-58, Dec. 1997.
- [6] C. Chen and B. Curran, "Switching Codes for Delta-I Noise Reduction," *IEEE Trans. on Computers*, vol. 45, pp. 1017-1021, Sep. 1996.
- [7] P. Heydari and M. Pedram, "Ground Bounce in Digital VLSI Circuits," *IEEE Trans. on VLSI Systems*, vol. 11, pp. 180-193, Apr. 2003.
- [8] A. Nieuwland, A. Katoch, D. Rossi, and C. Metra, "Coding Techniques for Low Switching Noise in Fault Tolerant Busses," *IEEE International On-Line Testing Symp.*, pp. 183-189, 2005.
- [9] B. Lau, Y. Chan, A. Moncayo, J. Ho, M. Allen, J. Salmon, J. Liu, M. Muthal, C. Lee, T. Nguyen, B. Horine, M. Leddige, K. Huang, J. Wei, L. Yu, R. Tarver, Y. Hsia, R. Vu, E. Tsern, H. Liaw, J. Hudson, D. Nguyen, K. Donnelly, and R. Crisp, "A 2.6-Gbyte/s multipurpose chip-to-chip interface," *IEEE J. Solid-State Circuits*, vol. 33, pp. 1617-1626, Nov. 1998.
- [10] T. Sato, Y. Nishio, T. Sugano, and Y. Nakagome, "5Gbyte/s data transfer scheme with bit-to-bit skew control for synchronous DRAM," *Symposium on VLSI Cir. Dig. Tech Papers*, pp. 64-65, 1998.
- [11] J. Y. Sim, "Segmented group-inversion coding for parallel links," *IEEE Trans. on Circuits and Systems-II*, vol. 54, pp. 328-332, Apr. 2007.
- [12] A. Widmer and P. Franaszek, "A dc-balanced, partitioned-block, 8B/10B transmission code," *IBM J. Res. and Develop.*, vol. 27, pp. 440-451, Sep. 1983.

Seon-Kyoo Lee received the B.S. degree in Electronic and Electrical Engineering from Hanyang University, Seoul, Korea, in 2006. He is currently pursuing the Ph.D. degree in Electronic and Electrical Engineering at Pohang University of Science and Technology (POSTECH), Korea. His interests include high-speed links, PLL/DLL circuits, data converters, and low-power analog circuits.

Young-Sang Kim received the B.S. degree in Electronic Engineering from Sogang University, Seoul, Korea, in 2005, and he is currently pursuing the Ph.D. degree in Electronic and Electrical Engineering at Pohang University of Science and Technology (POSTECH), Kyungbuk, Korea. His research interests include PLL/DLL, high-speed links, and ultra low-power analog circuits.

Hong-June Park received the B.S. degree from the Department of Electronic Engineering, Seoul National University, Seoul, Korea, in 1979, the M.S. degree from the Korea Advanced Institute of Science and Technology, Taejon, in 1981, and the Ph.D. degree from the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, in 1989. He was a CAD engineer with ETRI, Korea, from 1981 to 1984 and a Senior Engineer in the TCAD Department of Intel from 1989 to 1991. In 1991, he joined the Faculty of Electronic and Electrical Engineering, Pohang University of Science and Technology (POSTECH), Kyungbuk, Korea, where he is currently Professor.
His research interests include high-speed CMOS interface circuit design, signal integrity, and device and interconnect modeling. Prof. Park is a member of IEEK, IEEE and IEICE.

Jae-Yoon Sim received the B.S., M.S., and Ph.D. degrees in Electronic and Electrical Engineering from Pohang University of Science and Technology, Korea, in 1993, 1995, and 1999, respectively. From 1999 to 2005, he was a Senior Engineer at Samsung Electronics, Korea. From 2003 to 2005, he was a post-doctoral student with the University of Southern California, Los Angeles. In 2005, he joined the Faculty of Electronic and Electrical Engineering, Pohang University of Science and Technology (POSTECH), Korea, where he is currently an Assistant Professor. His research interests include PLL/DLL, high-speed links, memory circuits, data converters, and ultra low-power analog circuits.