Paper 2006-43SD-10-4

# 3.125Gbps Reference-less Clock/Data Recovery using 4X Oversampling

Sung-Sop Lee\* and Jin-Ku Kang\*\*

\* Regular member, Hynix Semiconductor

\*\* Regular member, School of Electronic and Electrical Engineering, Inha University

This work was supported by KOSEF. The authors also thank the IDEC and IT\_SOC programs for hardware and software assistance with the layout and simulation.

Received: November 11, 2005; revised: September 15, 2006.

## Summary

This paper proposes a half-rate clock and data recovery circuit for serial links that requires no reference clock and uses a phase and frequency detector structure based on 4x oversampling. The phase detector detects phase error by generating four up/down signals, and the frequency detector detects frequency error using the up/down signals produced from the phase detector output. The six signals of the phase detector and frequency detector control the amount of current flowing into the charge pump. A VCO composed of four differential buffers generates the 8 clocks needed for 4x oversampling. A 0.18um CMOS process was used, and experimental results show that the proposed circuit can recover clock and data at 3.125Gbps. With the proposed PD and FD structure, the circuit has a wide tracking frequency range of 24%. The measured clock jitter (p-p) was about 14ps. The CDR uses a single 1.8V power supply, and the power consumption is about 140mW.

## Abstract

An integrated 3.125Gbps clock and data recovery (CDR) circuit is presented. The circuit does not need a reference clock. It has a phase and frequency detector (PFD), which incorporates a bang-bang type 4X oversampling PD and a rotational frequency detector (FD). It also has a ring-oscillator-type VCO with four delay stages and three zero-offset charge pumps. With the proposed PD and FD, a tracking range of 24% can be achieved. Experimental results show that the circuit is capable of recovering clock and data at a rate of 3.125Gbps in 0.18um CMOS technology. The measured recovered clock jitter (p-p) is about 14ps. The CDR uses a single 1.8V power supply. The power dissipation is about 140mW.

Keywords: clock and data recovery (CDR), frequency detector, phase detector, 4X oversampling, charge pump

## I. Introduction

In recent years, the increase of data transmission over the internet has led to the demand for high-speed serial-data communication networks.
Several optical communication standards have been applied to high-speed and long-distance communications. Considerable design effort has been focused on low-cost, low-power integrated fiber-optic transmitters and receivers. In receivers, a CDR circuit generates clocks synchronized with the received data<sup>[1,2,3,4]</sup>. One of the challenges in designing a CDR is a phase-frequency detector (PFD) that can deal with missing data transitions in the NRZ data format. Conventionally, Hogge-type phase detectors (PDs), Alexander-type PDs, tri-wave-type PDs, or quadricorrelator structures are used in CDR circuits. A three-state PFD cannot be used in CDR applications because missing data transitions cause erroneous pulses. Other CDR architectures apply an oversampling algorithm. These usually adopt 1:N demultiplexing architectures to relax the processing speed after heavily oversampling the input<sup>[2,3]</sup>.

This paper proposes a clock and data recovery circuit that has a half-rate PD and FD using the 4x oversampling method without a reference clock<sup>[6]</sup>. The proposed circuit adopts a tracking PLL structure for phase and frequency locking. The 3.125Gbps reference-less CDR is implemented with a half-rate bang-bang phase detector and a rotational frequency detector using the 4x oversampling method.

## II. Main Subject

### 1. Architecture

The proposed architecture is shown in Fig. 1. The input data are sampled by 8 clock signals generated at the output of a four-stage differential VCO, and the phase and frequency detectors then extract phase and frequency error information from the 8 data samples. Because the capture range of the phase detector is small, a frequency detector is added to aid frequency acquisition. The outputs of the phase and frequency detectors control three charge pumps and the loop filter, which provide the control voltage for the VCO. Each charge pump supplies a different amount of current.
Then, the VCO, controlled by the voltage at the loop filter, generates 8 clock signals spaced 45 degrees apart.

Fig. 1. Block diagram of the proposed system.

Finally, a MUX controlled by the clk2 signal produces the retimed data from Dout0 and Dout4, so that serial data are obtained from the half-rate clock signal.

### 2. Phase Detector

The block diagram of the half-rate bang-bang phase detector is shown in Fig. 2. The phase detector uses eight clock signals (clk0, clk0b ~ clk3, clk3b) to detect data transitions across two consecutive incoming data bits. Eight samplers take eight samples across the two consecutive bits. Through eight XOR gates, these samples produce four output signals that control two charge pumps (CP1, CP2) and the loop filter to generate the VCO control voltage corresponding to the phase error.

Fig. 2. Block diagram of the phase detector.

Fig. 3 shows the operation of the PD up/down signals. The Pup1 and Pdn1 signals are generated for a small phase error, and the Pup2 and Pdn2 signals are generated for a large phase error. The numerals between the clock signals denote the regions between adjacent clock phases. When the PD is locked, the data transitions occur at the clk2 and clk2b signals. Using four phase-adjustment steps with 4x oversampling, instead of the conventional two steps, improves the linearity of the bang-bang phase detector and results in both low jitter generation and a wide lock-in range.

Fig. 3. Operation of the PD up/down signal with 4X oversampling.

Fig. 4. Timing diagram of the PD up/down signal.

Fig. 4 shows the timing diagram of the four PD up/down signals. Since the phase detector samples two consecutive input data bits, the phase adjustment can be made within one clock period. A data transition between clk1 and clk2 (region II in Fig. 3) causes the Pup1-2 signal to be generated.
In a similar manner, a data transition between clk1b and clk2b (region VI in Fig. 3) causes the Pup1-1 signal to be generated. The Pup1-1 and Pup1-2 signals are then multiplexed with the clk1 signal to obtain the complete Pup1 signal. The Pdn1, Pup2, and Pdn2 signals of the PD are generated in the same manner. The hatched area illustrates the region covered by the pull-up and pull-down control signals, depending on the data transition during one clock period. In reference [2], four PD outputs are generated using 5 different clock phases. In this paper, 8 phases are used, so the phase step control is finer. In addition, lock can be achieved with a coarse and fine phase control method by controlling four different regions (region I/V, region II/VI, region III/VII, and region IV/VIII, as shown in Fig. 3). This improves the bandwidth of the PD, the locking range, and the locking time.

### 3. Frequency Detector

In this paper, the FD has a rotational frequency detection architecture. Fig. 5 shows the block diagram of the frequency detector. The FD uses the Pup1, Pup2, and Pdn2 signals from the PD to detect frequency error. The FD is disabled by the Pup1 signal when the frequency of the CDR is locked.

Fig. 5. Block diagram of the frequency detector.

Fig. 6 shows the operation of the FD. Because the FD uses the rotational method, if consecutive data transitions move from I to IV or from V to VII, the FD generates a frequency up signal. If consecutive data transitions move from IV to I or from VII to V, the FD generates a frequency down signal.

Fig. 6. Operation of the FD.

### 4. Circuit Design

Fig. 7 shows the schematic of the phase detector with four up/down signals. The eight data samples are taken not with D flip-flops but with samplers, at the positive edge of each of the eight clocks (clk0, clk0b ~ clk3, clk3b). Eight XOR gates are used to detect data transitions. Four switches then generate the four up/down signals: Pup1, Pdn1, Pup2, and Pdn2.
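
The sampling and XOR stage described above can be sketched behaviorally. This is a minimal illustration, not the authors' circuit: the function names, the `next_sample` input used to close the eighth region, and the region-to-signal table are assumptions introduced here. Only the mapping of region II to Pup1 and the pairing of regions 180 degrees apart come from the text.

```python
# Behavioral sketch only; names and the mapping table are illustrative assumptions.
# Sampling clocks in order of arrival over one clock period (45 degrees apart).
CLOCKS = ["clk0", "clk1", "clk2", "clk3", "clk0b", "clk1b", "clk2b", "clk3b"]

# ASSUMPTION: region k (1..8) lies between CLOCKS[k-1] and the next clock,
# and regions 180 degrees apart share one signal (I/V, II/VI, III/VII, IV/VIII).
# Only "region II -> Pup1" is stated in the text; the other rows are assumed,
# chosen so that the lock point falls on clk2/clk2b.
REGION_TO_SIGNAL = {
    1: "Pup2", 2: "Pup1", 3: "Pdn1", 4: "Pdn2",
    5: "Pup2", 6: "Pup1", 7: "Pdn1", 8: "Pdn2",
}


def transition_regions(samples, next_sample):
    """XOR each sample with its successor; return the regions (1..8) holding an edge.

    samples: eight 0/1 values taken at CLOCKS, in order.
    next_sample: value at clk0 of the following period (hypothetical input,
    used here to evaluate the eighth region).
    """
    if len(samples) != 8:
        raise ValueError("expected eight samples")
    extended = list(samples) + [next_sample]
    return [k + 1 for k in range(8) if extended[k] ^ extended[k + 1]]


def pd_outputs(samples, next_sample):
    """Return the four up/down flags for one clock period."""
    active = {REGION_TO_SIGNAL[r] for r in transition_regions(samples, next_sample)}
    return {name: name in active for name in ("Pup1", "Pdn1", "Pup2", "Pdn2")}


# Edge between clk1 and clk2 (region II): a small phase error.
small_error = pd_outputs([0, 0, 1, 1, 1, 1, 1, 1], 1)
# Edge between clk3 and clk0b (region IV): a large phase error under the assumed table.
large_error = pd_outputs([0, 0, 0, 0, 1, 1, 1, 1], 1)
```

Under these assumptions, an edge just before clk2 raises only the fine signal Pup1, while an edge farther from clk2 raises one of the coarse signals, which mirrors the coarse and fine control over the four region pairs.
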
The four switches are controlled individually by the clk0, clk1, clk2, and clk3 signals.

Fig. 7. Schematic of the PD.

Fig. 8 shows the schematic of the frequency detector. The frequency detector logic uses the Pup1, Pup2, and Pdn2 signals generated by the phase detector. The Pup1 signal is used to disable the frequency detector.

Fig. 8. Schematic of the FD.

The delay buffer used in the VCO is shown in Fig. 9. Because the effective resistance of the load elements changes with the bias voltage, the buffer delay varies with the bias voltage Vbp. This bias voltage is generated by the feedback network of the phase frequency detector, charge pump, and loop filter. These load elements provide good control over delay and high dynamic supply noise rejection<sup>[5]</sup>. The four delay stages in the VCO are locked at a 2π phase shift. For a data rate of 3.125Gb/s with half-rate PFD operation, the VCO generates a 1.5625GHz clock signal, so the delay of each stage is nominally set to 80ps. Because the voltage swing of the core VCO cell is small, a full-swing circuit was added to drive the PFD circuit. The whole VCO circuit diagram is shown in Fig. 10.

Fig. 9. Schematic of the delay buffer.

Fig. 10. VCO and multiple-phase clock generation.

## III. Experimental Results

The design was simulated and implemented in TSMC 0.18um CMOS technology. The microphotograph of the chip is shown in Fig. 11. The design target was an input data rate of 3.125Gbps with a 1.5625GHz recovered clock. The multiphase clocks are generated by the VCO with 4 delay cells. Fig. 12 shows the simulated CDR control signals as the CDR locks to 3.125Gbps random input data, and Fig. 13 shows the simulated recovered clock, input data, and recovered output data.
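
The design-target figures can be cross-checked with a short calculation. This is a worked check under the assumption of ideal, equally spaced clock phases and an ideal four-stage ring oscillator; the symbols $T_{\text{bit}}$, $T_{\text{clk}}$, $\Delta t$, $N$, and $t_d$ are introduced here for illustration only.

$$
\begin{aligned}
T_{\text{bit}} &= \frac{1}{3.125\ \text{Gb/s}} = 320\ \text{ps} \\
f_{\text{clk}} &= \frac{3.125\ \text{GHz}}{2} = 1.5625\ \text{GHz}, \qquad T_{\text{clk}} = 640\ \text{ps} = 2\,T_{\text{bit}} \\
\Delta t &= \frac{T_{\text{clk}}}{8} = 80\ \text{ps} = \frac{T_{\text{bit}}}{4} \\
T_{\text{clk}} &= 2\,N\,t_d = 2 \cdot 4 \cdot 80\ \text{ps} = 640\ \text{ps}
\end{aligned}
$$

Each bit is therefore sampled four times, and the 80ps spacing between adjacent clock phases equals the nominal delay of one VCO stage.
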
Since the FD up/down signals turn off the FD charge pump once the loop is locked to the frequency of the input data, the FD charge pump then consumes no further power. The two PD charge pumps, however, consume power continuously because the PD uses bang-bang operation.

Fig. 11. Microphotograph of the CDR.

Fig. 12. Simulated CDR control signals: (a) VCO control voltage, (b) Pup1, (c) Pdn1, (d) Pup2, (e) Pdn2, (f) Fup, (g) Fdn.

Fig. 13. Simulated CDR output signals: (a) recovered clock, (b) random input data, (c) recovered input data.

The VCO tuning range is from 1.2GHz to 1.9GHz. A stable VCO gain was obtained in the range of 1.37GHz to 1.75GHz. The simulated tracking range is 24% around 3.125Gbps. This wide tracking range compares with the 4–5% reported in other references<sup>[2,3]</sup>. The CDR uses a single 1.8V power supply. The power dissipation is about 140mW.

Fig. 14 shows the measured output eye diagram for a 3.125Gb/s data rate with a 2<sup>31</sup>-1 PRBS, together with the recovered 1.56GHz clock. The measured clock jitter (p-p) is about 15ps. The bit error rate is less than 10<sup>-11</sup>. The core chip area without pads is 0.6mm x 0.4mm. The implemented CDR is suitable for SERDES for optical communications and 10G Ethernet.

Fig. 14. Measured output data (upper) and recovered clock signal (lower) (one horizontal grid division = 400ps).

## IV. Conclusion

We described a clock and data recovery circuit with a half-rate 4X oversampling PD and FD without a reference clock. The CDR circuit, using the 4X oversampling PD and FD technique, finds the synchronization between input data and clock by performing digital logic operations, and recovers consecutive data without an additional circuit or the insertion of predefined signals.
Since the circuit uses digital logic for the PD and FD functions, it has better portability to different […]

Experimental results show that the circuit is capable of recovering clock and data at a rate of 3.125Gbps in 0.18um CMOS technology. The measured clock jitter (p-p) is about 14ps. The CDR uses a single 1.8V power supply. The power dissipation is about 140mW.

## References

- [1] Jun-Young Park and Jin-Ku Kang, "A 1.0Gbps CMOS Oversampling Data Recovery Circuit with Fine Delay Generation Method," IEICE Transactions on Fundamentals, vol. E83-A, no. 6, June 2000.
- [2] Hee-sop Song, Hyung-wook Jang, Sung-sop Lee, and Jin-ku Kang, "A 1.25Gb/s clock recovery circuit using half-rate 4X-Oversampling PFD," ISOCC '04, Oct. 26, 2004.
- [3] M. Ramezani and C. Andre T. Salama, "A 10Gb/s CDR with a Half-rate Bang-Bang Phase Detector," Proceedings of the 2003 International Symposium on Circuits and Systems (ISCAS '03), vol. 2, May 25-28, 2003.
- [4] A. Rezayee and K. Martin, "A 9-16Gb/s clock and data recovery circuit with three-state phase detector and dual-path loop architecture," European Solid-State Circuits Conference (ESSCIRC '03), Sept. 16-18, 2003.
- [5] John G. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," IEEE Journal of Solid-State Circuits, Nov. 1996.
- [6] Hyung-wook Jang and Jin-ku Kang, "A 3.125Gb/s reference-less clock recovery circuit using 4X-Oversampling," 전기전자학회논문지, vol. 10, no. 1, 2006.

## About the Authors

Sung-Sop Lee (regular member)

- 1997: B.S. in Electronic Engineering, Inha University
- 2004: M.S. in Electronic Engineering, Inha University
- 2006 to present: design researcher, Hynix Semiconductor
- Research interests: high-speed interfaces, PLL, clock and data recovery

Jin-Ku Kang (regular member)

- 1983: B.S. in Engineering, Seoul National University
- 1990: M.S. in Electrical and Computer Engineering, New Jersey Institute of Technology
- 1996: Ph.D. in Electrical and Computer Engineering, North Carolina State University
- 1983 to 1988: Samsung Electronics (Semiconductor)
- 1996 to 1997: Senior Design Engineer, INTEL, USA
- March 1997 to present: Professor, School of Electronic and Electrical Engineering, Inha University
- Research interests: high-speed CMOS circuit design, mixed-mode circuit design, PLL, CDR, high-speed interface circuit design, display IC design