## 10.4 A 12Gb/s 0.9mW/Gb/s Wide-Bandwidth Injection-Type CDR in 28nm CMOS with Reference-Free Frequency Capture

Takashi Masuda<sup>1</sup>, Ryota Shinoda<sup>1</sup>, Jeremy Chatwin<sup>2</sup>, Jacob Wysocki<sup>2</sup>, Koki Uchino<sup>1</sup>, Yoshifumi Miyajima<sup>3</sup>, Yosuke Ueno<sup>1</sup>, Kenichi Maruko<sup>1</sup>, Zhiwei Zhou<sup>1</sup>, Hideyuki Matsumoto<sup>1</sup>, Hideyuki Suzuki<sup>1</sup>, Norio Shoji<sup>1</sup>

<sup>1</sup>Sony, Tokyo, Japan, <sup>2</sup>Mixed Signal Systems, Scotts Valley, CA, <sup>3</sup>Sony LSI Design, Kanagawa, Japan

The consumer electronics market demands high-speed, low-power serial data interfaces. The injection-locked oscillator (ILO) based clock and data recovery (CDR) circuit [1-2] is a well-known solution for these demands. The typical solution has at least two oscillators: a master and one or more slaves. The master, a replica of the data-path ILO, is part of a phase-locked loop (PLL) used to correct the oscillator free-running frequency (FRF). The slave ILO phase locks to the incoming data but uses the frequency control from the master. Any FRF difference between the master and slave, such as that caused by PVT variation or mismatch, reduces the receiver performance. One solution to the reduced performance [3] uses burst data and corrects the FRF between bursts. However, for continuous data, injection forces the recovered clock frequency to match the incoming data rate, masking any FRF error from the frequency detector. Existing solutions [4-5] use a phase detector (PD) to measure the FRF. However, any static phase offset between the PD lock point and the ILO lock point causes the frequency control algorithm to converge incorrectly. Static phase offset can be caused by mismatch, PVT variation, or layout.

This paper describes an ILO-type CDR, called the frequency-capturing ILO (FC-ILO), that eliminates the master oscillator and combines the ILO-type and PLL-type [6] CDRs, realizing the benefits of both. The ILO gives wide bandwidth and fast locking, while the PLL gives wide frequency capture range. The CDR architecture, shown in Fig. 10.4.2, has a half-rate ILO, data and edge samplers making a bang-bang phase detector (BBPD), two 2:10 demuxes, and independent digital phase and frequency control. The ILO is made from current-starved inverters and driven by an edge detector. The ILO has coarse and fine frequency tuning. The strength of the unit inverter of the oscillator is adjusted for coarse tuning, keeping the normalized gain and delay constant over a wide range of frequencies. A current DAC is used for fine tuning. The edge detector shorts the ILO differential nodes together to align clock and data transitions. The BBPD outputs are used by the digital phase and frequency control to determine if ILO edges are early or late with respect to the incoming data and to correct the ILO FRF. A variable delay circuit controls the timing between data and clock inputs to the BBPD, correcting the static phase offset between the PD and ILO lock points.
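The early/late decision of a bang-bang phase detector can be illustrated with a minimal sketch. It assumes the textbook arrangement in which one edge sample is taken between two consecutive data samples; the function name `bbpd_decision` and the sign convention (+1 for early, -1 for late) are assumptions made for illustration and are not taken from the paper.

```python
def bbpd_decision(prev_data: int, edge: int, curr_data: int) -> int:
    """Return +1 if the clock is early, -1 if it is late, 0 if undecided.

    prev_data and curr_data are consecutive data-sampler outputs; edge is
    the edge-sampler output taken between them. The sign convention is an
    assumption of this sketch.
    """
    if prev_data == curr_data:
        # No data transition, so the samples carry no timing information.
        return 0
    # With a transition present, an edge sample that still equals the
    # previous bit was taken before the data changed: the clock is early.
    return 1 if edge == prev_data else -1


# A 0 -> 1 transition with the edge sample still at 0 reads as "early".
print(bbpd_decision(0, 0, 1))
```

Returning 0 when there is no transition keeps the downstream digital control from acting on samples that contain no phase information.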

The phase/frequency control algorithm receives information only about the relative phase of clock and data edges but must distinguish whether the timing error is caused by static phase offset or incorrect ILO FRF in order to determine the type of correction to apply. To make such a determination, the digital algorithm considers two consecutive BBPD outputs, as shown in Fig. 10.4.3. If the ILO has no FRF error, then consecutive outputs will have error due only to static phase offset and, hence, the same sign. A change in the sign of consecutive outputs must be due to FRF error, because frequency is the derivative of phase. The table in Fig. 10.4.3 shows the algorithm that distinguishes FRF error and static phase offset to guarantee correct convergence. Further, the phase/frequency correction algorithm must function for a variety of data patterns. For instance, if the data consists of alternating 1s and 0s and the ILO injects on both rising and falling edges of the input data, then phase error due to FRF error is constantly corrected by the injection. The residual phase error is due only to static phase offset, so FRF error is undetectable. However, FRF error can be detected if some phase error due to FRF error is passed to the BBPD by skipping injection on alternating data edges. This algorithm provides a key benefit for the FC-ILO. In a conventional ILO-type CDR, the relative timing of the clock and data paths is controlled with matched loads and dummy cells to maximize timing margin given PVT variation and mismatch. With the FC-ILO, the clock of the edge sampler is aligned to the data edge by the digital control algorithm, optimizing timing margin for the eye center sampler and, thus, improving jitter tolerance (JTOL).
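The rule described above, same sign for static phase offset and a sign change for FRF error, can be sketched as a small decision function. This is only an illustration under stated assumptions: the actual decision table is in Fig. 10.4.3 and is not reproduced here, and the function name `decide`, the +1 (early) / -1 (late) encoding, and the choice to steer against the second output are all hypothetical.

```python
def decide(first: int, second: int) -> tuple:
    """Map two consecutive BBPD outputs to (target, direction).

    Inputs use +1 for "clock early", -1 for "clock late", 0 for "no
    decision" (an assumed encoding). target names the loop to adjust;
    direction is the assumed sign of the correction step.
    """
    if first == 0 or second == 0:
        # Without two valid decisions the comparison is not possible.
        return ("hold", 0)
    if first == second:
        # Same sign twice: attributed to static phase offset, so the
        # variable delay is adjusted.
        return ("delay", -second)
    # Sign change: attributed to FRF error, so the oscillator frequency
    # is adjusted. Steering against the second output is an assumption.
    return ("frequency", -second)


for pair in [(1, 1), (-1, -1), (1, -1), (-1, 1), (0, 1)]:
    print(pair, decide(*pair))
```

Keeping the two corrections in separate branches mirrors the independent phase and frequency control described in the text.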

A key benefit of the digital frequency control algorithm is wide capture range. To quantify the impact of the algorithm, the capture range is measured with and without the BBPD loop. When the BBPD loop is disabled, the ILO FRF can be set manually by a fixed DAC code. With 10Gb/s input data, the CDR is error free if the FRF is set within -2.6% and +3.2% of 5GHz (half rate). With the BBPD, the loop will correct to 5GHz if the initial FRF is set within -26% to +15% of 5GHz. Figure 10.4.4(a) shows the evaluated waveform in which the BBPD loop corrects an initial FRF of 3.68GHz to 5GHz for 10Gb/s operation. The loop response time is set by the loop gain, which can be adjusted digitally. For very fast locking, the oscillator FRF can be calibrated to the optimum code before receiving data. After calibration, the CDR can lock within 2UI, as shown in Fig. 10.4.4(b).
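To make the quoted percentages concrete, the following short calculation converts them to absolute frequencies around the 5GHz half-rate clock. The percentages and the 3.68GHz starting point come from the text; the helper name `band` and the width comparison are additions for illustration only.

```python
TARGET_GHZ = 5.0  # half-rate clock for 10Gb/s data


def band(center_ghz: float, low_pct: float, high_pct: float) -> tuple:
    """Convert a percentage window around center_ghz to absolute GHz."""
    return (center_ghz * (1 + low_pct / 100),
            center_ghz * (1 + high_pct / 100))


# Error-free FRF window with the BBPD loop disabled.
lock_lo, lock_hi = band(TARGET_GHZ, -2.6, 3.2)
# Initial-FRF window from which the BBPD loop converges to 5GHz.
cap_lo, cap_hi = band(TARGET_GHZ, -26.0, 15.0)

# Offset of the 3.68GHz starting point shown in Fig. 10.4.4(a).
initial_offset_pct = (3.68 / TARGET_GHZ - 1) * 100

# Ratio of the two window widths, in percent of the target frequency.
width_ratio = (15.0 + 26.0) / (3.2 + 2.6)

print(f"loop off: {lock_lo:.2f} to {lock_hi:.2f} GHz")
print(f"loop on:  {cap_lo:.2f} to {cap_hi:.2f} GHz")
print(f"3.68GHz start: {initial_offset_pct:.1f}% from target")
print(f"window widens by about {width_ratio:.1f}x")
```

Under these numbers the loop-off window spans roughly 4.87GHz to 5.16GHz, while the loop-on window spans roughly 3.70GHz to 5.75GHz, and the 3.68GHz starting point sits at about -26.4%, consistent with the rounded -26% limit.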

Figure 10.4.1 shows the block diagram of the test chip. It consists of 4 lanes, each with a receiver (an AFE, a CDR, and two demuxes), digital control logic, and a test block. The AFE compensates 5dB of loss at 6GHz to support 12Gb/s data over a target channel with a PCB and 30cm (AWG40) of fine coaxial cable. The digital block detects and controls the ILO FRF and the BBPD input phase. The test block serializes the sampled data using the half-rate clock of the ILO and drives the output channel. The bit error rate can be measured using an on-chip internal checker or an external BERT.

The test chip is implemented in a 28nm CMOS process. Its size is 3.2mm<sup>2</sup>, including pads, as shown in Fig. 10.4.7. The receiver is 0.11mm<sup>2</sup> including clock buffer and test circuits. The receiver operates from 1 to 12Gb/s with the coarse and fine tuning. At 12Gb/s, each receiver consumes 22.9mW and the CDR consumes 11mW. Including the power of frequency and phase control loops, removal of the master PLL reduces CDR power consumption by around 20% for a 4-channel chip. The JTOL, shown in Fig. 10.4.5, for a PRBS9 8B10B pattern is evaluated after transmission across the target channel using an N4903B as a jitter generator and error detector. The JTOL performance exceeded measurement limits for this test equipment for all frequencies except 300MHz (shown in the dashed-diamond curve). Therefore, the internal error checker was used (shown in the solid-circle curve). From 60MHz to 300MHz the jitter generator created jitter exceeding the CDR JTOL, allowing direct evaluation of JTOL. Below 60MHz, the jitter generator was unable to generate sufficient jitter amplitude to evaluate JTOL, so the JTOL at lower frequencies is estimated by extrapolation (dashed line). The JTOL reaches $0.56UI_{\text{pp}}$ at 300MHz including an $R_{j}$ of $0.18UI_{\text{pp}}$ at a BER of 1E-9 and excluding ISI jitter generated by the channel. The JTOL floor is not shown because the ILO bandwidth exceeds the measurement equipment bandwidth, but simulations indicate that 0.56UI is close to the minimum JTOL. Thus the CDR shows the merits of both ILO-type and PLL-type CDRs: fast locking and wide capture range, shown in Fig. 10.4.4, and wide bandwidth, shown in Fig. 10.4.5. The receiver performance is summarized in the table shown in Fig. 10.4.6.
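As a quick check on the headline efficiency figure, the power numbers above can be normalized to the 12Gb/s data rate. The inputs are taken from the text; the helper name `energy_per_bit` and the CDR share of receiver power are illustrative additions.

```python
DATA_RATE_GBPS = 12.0
CDR_POWER_MW = 11.0
RECEIVER_POWER_MW = 22.9


def energy_per_bit(power_mw: float, rate_gbps: float) -> float:
    """Return efficiency in mW/(Gb/s), numerically equal to pJ/bit."""
    return power_mw / rate_gbps


cdr_eff = energy_per_bit(CDR_POWER_MW, DATA_RATE_GBPS)
rx_eff = energy_per_bit(RECEIVER_POWER_MW, DATA_RATE_GBPS)
# Fraction of each receiver's power that is spent in the CDR.
cdr_share = CDR_POWER_MW / RECEIVER_POWER_MW

print(f"CDR:      {cdr_eff:.2f} mW/(Gb/s)")
print(f"Receiver: {rx_eff:.2f} mW/(Gb/s)")
print(f"CDR share of receiver power: {cdr_share:.0%}")
```

The CDR figure works out to about 0.92mW/(Gb/s), which rounds to the 0.9mW/Gb/s in the title, and the full receiver comes to about 1.91mW/(Gb/s).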

## References:

[1] J. Lee et al., "A 20Gb/s Burst-Mode CDR Circuit Using Injection Locking Technique," *ISSCC Dig. Tech. Papers*, pp. 46-47, Feb. 2007.

[2] K. Maruko et al., "A 1.296-to-5.184Gb/s Transceiver with 2.4mW/(Gb/s) Burst mode CDR using Dual-Edge Injection-Locked Oscillator," *ISSCC Dig. Tech. Papers*, pp. 364-365, Feb. 2010.

[3] J. Terada et al., "A 10.3125Gb/s Burst-Mode CDR Circuit using a Delta-Sigma DAC," *ISSCC Dig. Tech. Papers*, pp. 226-227, Feb. 2008.

[4] C.-F. Liang et al., "A Reference-Free, Digital Background Calibration Technique for Gated-Oscillator-Based CDR/PLL," *IEEE Symp. VLSI Circuits*, pp. 14-15, Jun. 2009.

[5] M. Raj et al., "A 4-to-11GHz injection-locked quarter-rate clocking for an adaptive 153fJ/b optical receiver in 28nm FDSOI CMOS," *ISSCC Dig. Tech. Papers*, pp. 404-405, Feb. 2015.

[6] T. Masuda et al., "A 250mW Full-Rate 10Gb/s Transceiver Core in 90nm CMOS using a Tri-State Binary PD with 100ps Gated Digital Output," *ISSCC Dig. Tech. Papers*, pp. 438-439, Feb. 2007.



## **ISSCC 2016 PAPER CONTINUATIONS**

