# A 26.5 Gb/s Optical Receiver With All-Digital Clock and Data Recovery in 65nm CMOS Process

Sang-Hyeok Chu<sup>1</sup>, Woorham Bae<sup>1</sup>, Gyu-Seob Jeong<sup>1</sup>, Jiho Joo<sup>2</sup>, Gyungock Kim<sup>2</sup> and Deog-Kyoon Jeong<sup>1</sup>

<sup>1</sup>Department of Electrical Engineering, Seoul National University, Seoul, Korea <sup>2</sup>Electronics and Telecommunications Research Institute, Daejeon, Korea

E-mail : dkjeong@snu.ac.kr

*Abstract*—This paper presents a 26.5 Gb/s optical receiver with an all-digital CDR (ADCDR) fabricated in a 65 nm CMOS process. The receiver consists of a transimpedance amplifier (TIA), a limiting amplifier (LA), and a half-rate ADCDR. The TIA and LA are based on an inverter-based amplifier for low power consumption. The ADCDR adopts an LC quadrature digitally controlled oscillator (LC-QDCO) for the quadrature sampling. The recovered clock jitter is 1.28 psrms and the measured jitter tolerance exceeds the tolerance mask specified in IEEE 802.3ba. The receiver sensitivity is measured to be -9 dBm and -6.6 dBm for the data rate of 25 Gb/s and 26.5 Gb/s, respectively. The whole receiver chip occupies an active area of 0.75 mm<sup>2</sup> and consumes 254 mW at the data rate of 26.5 Gb/s.

Keywords—optical, receiver, transimpedance amplifier (TIA), all-digital clock and data recovery (ADCDR), LC oscillator, quadrature digitally controlled oscillator (QDCO)

## I. INTRODUCTION

As the required per-pin data rate steadily increases, copper-based interconnects have become a speed bottleneck because of their limited bandwidth. Optical interconnects are therefore attracting growing interest as candidates for next-generation board-to-board and chip-to-chip links, owing to their much higher available bandwidth. Although chip-level optical interconnects are not yet widely deployed because of practical limits, research on CMOS-compatible optical devices and on the hybrid integration of optical and electrical devices from different processes is actively underway [1]. Reflecting this trend, various optical specifications such as 40/100 GbE, SONET/SDH, SFI, and QSFP have been established for high-speed inter-rack and inter-board interconnections, and optical specifications for chip-to-chip applications are expected to appear in the near future.

On the other hand, as the technology continues to scale down, it is becoming more difficult to achieve high performance with analog techniques because of the non-linear characteristics of the transistors and the reduced voltage headroom. Advanced deep-submicron CMOS processes are instead well suited to digital circuits, and many digital approaches to PLLs and CDRs have been proposed, which obviate the need for large loop-filter capacitors and exhibit PVT-invariant loop characteristics [2], [3]. However, for optical interfaces, the PLL and CDR have to operate up to several tens of GHz, and there has been little research on such high-speed all-digital PLLs or CDRs. Thus, in this work, we propose a highly migratable receiver architecture for the optical interface with a low-power optical front-end and an all-digital CDR (ADCDR) operating up to 26.5 Gb/s, which targets applications requiring higher bandwidth than existing standards.

Fig. 1. The optical receiver architecture.

The rest of the paper is organized as follows. Section II briefly explains the overall architecture of the proposed optical receiver. Section III gives detailed descriptions of the circuit implementation of each component. Section IV presents the measurement results, and Section V summarizes the overall performance of the optical receiver.

## II. RECEIVER ARCHITECTURE

Fig. 1 shows a block diagram of the implemented optical receiver, which consists of a transimpedance amplifier (TIA), a limiting amplifier (LA), a half-rate ADCDR, and a PRBS verifier. The current signal received from a photodiode (PD) is converted to a voltage signal by the TIA and then passes through the LA to reach a signal amplitude sufficient for reliable sampling. Four parallel samplers following the LA perform the edge and data sampling of the incoming data stream, and these edge and data samples are demultiplexed for further processing. The timing of the samplers is controlled by four-phase clocks provided by a quadrature LC digitally controlled oscillator (LC-QDCO). The phases of the clocks are adjusted by a proportional path and an integral path, which are separated from each other to reduce loop latency [4]. The divided-by-32 QDCO clock is compared with the reference clock in the digital domain for initial frequency lock.

The authors are with the Department of Electrical and Computer Engineering and Inter-University Semiconductor Research Center (ISRC), Seoul National University, Gwanak-Gu, Seoul 151-747, Korea (e-mail: dkjeong@snu.ac.kr).
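For the half-rate architecture described in this section, the clock frequencies involved in the initial frequency lock follow from simple arithmetic. The sketch below assumes that the QDCO runs at exactly half the data rate and that the divided clock is the QDCO clock divided by 32; the function name is illustrative, and the reference clock frequency actually used on the chip is not stated in the text.

```python
def divided_clock_hz(data_rate_bps: float, divider: int = 32) -> float:
    """Frequency of the divided QDCO clock for a half-rate CDR.

    Assumption: the QDCO frequency is half the data rate, since each
    clock period spans two bits in a half-rate architecture.
    """
    qdco_hz = data_rate_bps / 2.0
    return qdco_hz / divider


for rate_bps in (25e9, 26.5e9):
    qdco_ghz = rate_bps / 2.0 / 1e9
    div_mhz = divided_clock_hz(rate_bps) / 1e6
    print(f"{rate_bps / 1e9:.1f} Gb/s: QDCO {qdco_ghz:.3f} GHz, divided clock {div_mhz:.4f} MHz")
```

Under these assumptions, the divided clock is about 391 MHz at 25 Gb/s and about 414 MHz at 26.5 Gb/s.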

## III. CIRCUIT IMPLEMENTATION

#### A. Optical Front-End

The optical receiver front-end is the first element at the optical-electrical interface, and thus the overall performance of the optical receiver depends mainly on that of the front-end. The overall architecture of the proposed front-end is shown in Fig. 2. The TIA converts the incoming photocurrent from the PD to a voltage output. A single-to-differential converter (S2D) then transforms the single-ended TIA output into differential outputs. Following the S2D, the LA provides additional amplification to raise the signal amplitude to a level detectable by the CDR. As shown in the figure, an offset cancellation circuit is inserted between the LA input and output to form a negative feedback loop, thereby preventing offset amplification, which could otherwise drive one output to $V_{DD}$ and the other to GND. After the LA, a CML-based buffer with inductive shunt peaking drives the samplers.

There have been various approaches to TIA design, and their main target has been to lower the input resistance of the TIA, thereby accommodating a very large PD capacitance. However, as the supply voltage scales down, conventional topologies such as common-gate (CG) and regulated-cascode (RGC) are becoming more difficult to design because of the reduced voltage headroom. Fortunately, thanks to advances in photonic devices, the PD capacitance has continuously shrunk, and a typical value is now several tens of fF. In this situation, the compact but powerful TIA topology shown in Fig. 2 was proposed in [5] and is now widely used for high-performance optical receivers. In this work, the TIA is designed with the same topology as in [5]. The simulated transimpedance gain and bandwidth are 40 dB$\Omega$ and 30 GHz, respectively. The simulated input-referred noise current is 3.5 $\mu A_{rms}$, which corresponds to a BER of $10^{-12}$ for an input optical power of -10 dBm.



Fig. 2. Overall receiver front-end architecture.



Fig. 3. Implementation of S2D.



Fig. 4. Inverter-based LA with resistive feedback.

For supply noise rejection, differential signaling is employed throughout the front-end except for the TIA. For single-to-differential conversion at very high speed, a differential amplifier with inductive shunt peaking is used, as shown in Fig. 3, which provides both the single-to-differential conversion and signal amplification. The DC level of the TIA output, $V_{DC}$, is extracted by a passive RC filter, and PMOS transistors operating in the linear region are used as resistive loads. A negative feedback loop with an inverter replica forces the S2D output DC level to be optimal for the following LA stage.

The LA is composed of four-stage pseudo-differential inverter-based amplifiers, as depicted in Fig. 4. Because the inverter utilizes the $g_m$ of both the NMOS and the PMOS, it can achieve a high $g_m$ while consuming less static power than a conventional differential amplifier with a tail current source. By inserting feedback resistors in every other inverter, the bandwidth of the inverter-based amplifier can be maximized at the cost of reduced voltage gain. To further mitigate the large capacitance seen at the S2D output, negative capacitances are also employed, as shown in the figure.

At an operating speed of 25 Gb/s, the combined TIA, S2D, and LA front-end exhibits a total gain of 71 dB$\Omega$, and the power consumptions of the TIA, S2D, and LA are 2.1 mW, 5.7 mW, and 9.9 mW under supply voltages of 1 V, 1.8 V, and 1 V, respectively.
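The gain and power figures quoted for the front-end can be put on a linear scale with a short calculation. The sketch below converts the dB$\Omega$ values to ohms and sums the three block powers. Attributing the 31 dB difference between the total gain and the TIA gain to the S2D and LA is an inference from the quoted numbers, not a value stated in the text, and the function names are illustrative.

```python
def dbohm_to_ohm(gain_dbohm: float) -> float:
    """Convert a transimpedance gain in dB-ohm to ohms (20*log10 convention)."""
    return 10.0 ** (gain_dbohm / 20.0)


TIA_GAIN_DBOHM = 40.0    # simulated TIA gain quoted in the text
TOTAL_GAIN_DBOHM = 71.0  # combined TIA + S2D + LA gain quoted in the text

tia_ohm = dbohm_to_ohm(TIA_GAIN_DBOHM)
total_ohm = dbohm_to_ohm(TOTAL_GAIN_DBOHM)

# Inferred, not stated in the text: gain contributed after the TIA.
post_tia_gain_db = TOTAL_GAIN_DBOHM - TIA_GAIN_DBOHM

# Block powers in mW as quoted: TIA, S2D, LA.
block_power_mw = {"TIA": 2.1, "S2D": 5.7, "LA": 9.9}
total_power_mw = round(sum(block_power_mw.values()), 1)

print(f"TIA gain: {tia_ohm:.0f} ohm")
print(f"Total gain: {total_ohm:.0f} ohm")
print(f"Gain after the TIA: {post_tia_gain_db:.0f} dB")
print(f"TIA + S2D + LA power: {total_power_mw} mW")
```

With the quoted figures this gives about 100 $\Omega$ for the TIA alone, about 3.5 k$\Omega$ for the whole chain, and 17.7 mW for the three blocks.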

#### B. ADCDR

The implemented half-rate ADCDR is composed of four samplers, two 2:64 DMUXs, a phase detection logic, a digital logic block including the loop filter, and an LC-QDCO, as shown in Fig. 1.

Fig. 5. Samplers and phase detection logic.

Fig. 6. Digital loop filter.

The implementation of the samplers and the phase detection logic is shown in Fig. 5. The XOR operations between consecutive edge and data samples generate up[1:0] and dn[1:0], which adjust the timing of the QDCO. Two buffers are added on the earlier input path of each XOR gate to align the edge and data samples, as presented in [6]. The output of each XOR gate is latched and delivered to the proportional capacitor bank of the QDCO. All the circuits for the samplers and the phase detection logic are implemented in current-mode logic.
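The XOR-based phase detection described above can be illustrated behaviorally. The following is a minimal sketch of a generic bang-bang (Alexander-type) detector, assuming each edge sample is taken between two consecutive data samples. The function names and the `early`/`late` labels are illustrative; which of the two drives up and which drives dn depends on the DCO tuning polarity, which the text does not specify.

```python
from typing import List, Tuple


def bang_bang_pd(prev_data: int, edge: int, data: int) -> Tuple[int, int]:
    """Return (early, late) for one data/edge/data sample triplet.

    The edge sample is assumed to lie between prev_data and data.
    """
    # Edge still equals the previous bit but differs from the next one:
    # the edge clock fired before the data transition (clock early).
    early = edge ^ data
    # Edge already equals the next bit but differs from the previous one:
    # the edge clock fired after the data transition (clock late).
    late = prev_data ^ edge
    return early, late


def detect(data: List[int], edges: List[int]) -> List[Tuple[int, int]]:
    """edges[i] is the sample taken between data[i] and data[i + 1]."""
    return [bang_bang_pd(data[i], edges[i], data[i + 1]) for i in range(len(edges))]


data_samples = [0, 1, 1, 0, 1]
early_edges = [0, 1, 1, 0]  # each edge sample repeats the previous data bit
late_edges = [1, 1, 0, 1]   # each edge sample repeats the next data bit

print(detect(data_samples, early_edges))  # [(1, 0), (0, 0), (1, 0), (1, 0)]
print(detect(data_samples, late_edges))   # [(0, 1), (0, 0), (0, 1), (0, 1)]
```

When there is no data transition, both outputs are 0, so the detector produces no correction for runs of identical bits.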

A digital loop filter performs XOR operations on the incoming deserialized samples and integrates the difference between the up and down bits, as presented in Fig. 6. In contrast with its analog counterpart, the integral gain of the loop filter can be adjusted flexibly without compromising the chip area.
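The integrating behavior of the loop filter can be sketched as a saturating accumulator. This is a behavioral illustration under stated assumptions, not the implemented logic: the class name, the power-of-two gain set by `gain_shift`, the mid-range starting value, and the choice that up votes increase the control word are all hypothetical. The 10-bit output width matches the frequency control word described for the QDCO.

```python
from typing import Sequence


class IntegralPath:
    """Accumulates (up - dn) votes and emits an integer frequency control word."""

    def __init__(self, gain_shift: int = 4, fcw_bits: int = 10) -> None:
        # gain_shift fractional bits: each net vote moves the FCW by 2**-gain_shift LSB.
        self.gain_shift = gain_shift
        self.acc_max = ((1 << fcw_bits) - 1) << gain_shift
        # Assumption: start from the middle of the tuning range.
        self.acc = (1 << (fcw_bits - 1)) << gain_shift

    def update(self, ups: Sequence[int], dns: Sequence[int]) -> int:
        """Integrate one deserialized word of up/dn bits and return the FCW."""
        net = sum(ups) - sum(dns)
        # Saturate so the control word never wraps around.
        self.acc = min(max(self.acc + net, 0), self.acc_max)
        return self.acc >> self.gain_shift


loop = IntegralPath(gain_shift=2)
print(loop.update([1, 1, 1, 1], [0, 0, 0, 0]))  # 513
```

Changing `gain_shift` changes the integral gain by a factor of two per step at no cost beyond a wider accumulator, which is the kind of flexibility the paragraph above refers to.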

In a half-rate CDR, quadrature clocks are required to sample the data four times during one clock period. There are several ways to obtain clocks in quadrature. Ring oscillators can directly generate quadrature clocks without additional circuitry or power consumption. However, their poor phase-noise FOM at tens of GHz disqualifies this choice [7]. A more attractive method is to couple two symmetrical LC oscillators so that they oscillate in quadrature [8], [9]. In the first implementation relying on this principle, the coupling transistors are placed in parallel with the switch transistors, as described in [8]. However, it exhibits poor phase-noise performance because of the reduced effective Q factor of the LC tank. Although the phase noise can be improved by changing the coupling factor or by introducing additional phase shifters, these measures may degrade the immunity of the phase error to component mismatches or increase power consumption. The implementation proposed in [9] improves phase noise without these penalties. In this work, the four oscillation nodes of two symmetrical LC-DCOs are coupled via NMOS transistors placed at the bottom of the oscillator, as presented in [9], for low phase noise.



Fig. 7. LC QDCO.

Fig. 7 shows the circuit diagram of the implemented LC-QDCO. Two center-tapped 495 pH inductors are used. The oscillation frequency is tuned via two capacitor banks. The integral capacitor bank is composed of fine and coarse tuning cells [10]. A fine tuning cell consists of a single PMOS capacitor and a coarse tuning cell consists of four PMOS capacitors in parallel, and each cell is accompanied by a local decoder. The 8 MSBs of the 10-bit frequency control word (FCW) from the digital loop filter (DLF) are thermometer-decoded in the coarse tuning cells, and the 2 LSBs are thermometer-decoded in the fine tuning cells. Compared with thermometer-decoding the whole FCW, this greatly reduces parasitic effects at the expense of slightly decreased tuning linearity. The proportional capacitor bank consists of binary-weighted PMOS capacitors in parallel, which are selectively activated for proportional gain control. It is driven by the outputs of the phase detection logic, up[1:0] and dn[1:0]. The center frequency of the QDCO can be adjusted by switching MIM capacitors. Because the current source of the oscillator is implemented with PMOS transistors, the common-mode voltage of the output clock can be lowered sufficiently for proper operation of the subsequent CML clock buffers.
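The segmented decoding of the FCW described above can be sketched as follows. The code assumes 255 coarse cells and 3 fine cells (the minimum needed to thermometer-decode 8 and 2 bits, respectively) and counts capacitance in units of one PMOS capacitor. The cell counts, the function names, and whether an active cell raises or lowers the tank capacitance are assumptions that the text does not specify.

```python
from typing import List, Tuple


def thermometer(value: int, width: int) -> List[int]:
    """Thermometer code: the first `value` of `width` cells are active."""
    return [1 if i < value else 0 for i in range(width)]


def decode_fcw(fcw: int) -> Tuple[List[int], List[int], int]:
    """Split a 10-bit FCW into coarse and fine thermometer codes.

    Returns (coarse_cells, fine_cells, active_unit_caps).
    """
    if not 0 <= fcw < 1024:
        raise ValueError("FCW must fit in 10 bits")
    coarse = fcw >> 2  # 8 MSBs select the coarse cells
    fine = fcw & 0b11  # 2 LSBs select the fine cells
    coarse_cells = thermometer(coarse, 255)
    fine_cells = thermometer(fine, 3)
    # A coarse cell holds four unit capacitors, a fine cell holds one.
    active_unit_caps = 4 * sum(coarse_cells) + sum(fine_cells)
    return coarse_cells, fine_cells, active_unit_caps


coarse_cells, fine_cells, units = decode_fcw(518)
print(sum(coarse_cells), sum(fine_cells), units)  # 129 2 518
```

Under these assumptions the number of active unit capacitors equals the FCW, while only 258 cells need local decoders instead of the 1023 that a fully thermometer-decoded 10-bit word would require.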

At an operating speed of 25 Gb/s, the whole ADCDR consumes 218 mW under a supply voltage of 1.2 V.

## IV. EXPERIMENTAL RESULTS

The optical receiver prototype is fabricated in a 65 nm CMOS technology. The active area of the receiver chip is 1 × 0.75 mm<sup>2</sup>, and its chip micrograph is shown in Fig. 8. Instead of using a photodiode (PD), a series resistor is placed at the TIA input so that the input voltage is converted to a current, which emulates the optical-electrical interface. The total measured power of the receiver chip is 254 mW, including the power dissipation of all the clock and data buffers. The rms and peak-to-peak jitter of the recovered clock are measured to be 1.28 ps<sub>rms</sub> and 8.9 ps<sub>pk-pk</sub>, respectively, as shown in Fig. 9. Because of test-equipment limitations, the jitter tolerance is measured with a 12.5 Gb/s PRBS $2^7-1$ pattern, as in [6]. As shown in Fig. 10, the measured jitter tolerance well exceeds the tolerance mask (dashed line) specified in IEEE 802.3ba for 40/100 GbE. By detecting errors in the recovered data, the optical sensitivity of the entire receiver is also measured under the assumptions that the extinction ratio and the PD responsivity are 6 dB and 0.7 A/W, respectively. Fig. 11 shows that, at 25 Gb/s, -9 dBm of optical power is sufficient for a BER of $10^{-12}$, and at the maximum data rate of 26.5 Gb/s, the sensitivity is measured to be -6.6 dBm. The performance of the front-end is summarized and compared with other works in Table I, which indicates that the proposed front-end is competitive with other state-of-the-art works.

Fig. 8. Chip micrograph.

Fig. 9. Measured jitter histogram of recovered clock (divided by 32).

Fig. 10. Measured jitter tolerance.

Fig. 11. Measured receiver sensitivity.

TABLE I. PERFORMANCE COMPARISON OF FRONT END.

|                                                  | This work           | [5]             | [11]              | [12]            |
|--------------------------------------------------|---------------------|-----------------|-------------------|-----------------|
| Data rate [Gb/s]                                 | 26.5                | 25              | 25                | 28              |
| Power / ch [mW]                                  | 35.7                | 44.4            | 69                | 28.8            |
| Energy efficiency<br>[pJ/bit]                    | 1.35                | 1.78            | 2.76              | 1.03            |
| Sensitivity at BER 10<sup>-12</sup><br>[dBm]     | -6.6<br>@ 26.5 Gb/s | -4<br>@ 22 Gb/s | -6.8<br>@ 25 Gb/s | -7<br>@ 28 Gb/s |
| CMOS technology                                  | 65 nm               | 90 nm           | 65 nm             | 28 nm           |
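The energy-efficiency row of Table I follows from dividing the per-channel power by the data rate, since 1 mW divided by 1 Gb/s equals 1 pJ/bit. The short check below reproduces that row from the power and data-rate rows; the dictionary and function names are illustrative.

```python
# (power per channel in mW, data rate in Gb/s), taken from Table I.
FRONT_ENDS = {
    "This work": (35.7, 26.5),
    "[5]": (44.4, 25.0),
    "[11]": (69.0, 25.0),
    "[12]": (28.8, 28.0),
}


def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy per bit in pJ: mW / (Gb/s) = 1e-3 W / 1e9 b/s = 1e-12 J/bit."""
    return power_mw / data_rate_gbps


for name, (power_mw, rate_gbps) in FRONT_ENDS.items():
    print(f"{name:10s} {energy_per_bit_pj(power_mw, rate_gbps):.2f} pJ/bit")
```

The computed values round to 1.35, 1.78, 2.76, and 1.03 pJ/bit, which agrees with the table.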

## V. CONCLUSION

In this work, an optical receiver operating up to 26.5 Gb/s is designed in a 65 nm CMOS technology. By adopting low-power inverter-based amplifiers throughout the front-end, the TIA and the LA achieve a competitive energy efficiency. With the LC-QDCO used as a quadrature clock generator, the ADCDR operates at high speed, which shows the feasibility of all-digital implementations of PLLs and CDRs for future optical interface circuits. In addition, the prototype satisfies the jitter tolerance specification for 40/100 GbE and exhibits a sensitivity of -9 dBm at a data rate of 25 Gb/s.

## REFERENCES

- [1] M. Hochberg et al., "Silicon photonics: the next fabless semiconductor industry," *IEEE Solid-State Circuits Mag.*, vol. 5, no. 1, pp. 48–58, 2013.
- [2] R. B. Staszewski et al., "All-digital TX frequency synthesizer and discrete-time receiver for Bluetooth radio in 130-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2278–2291, Dec. 2004.
- [3] D.-H. Oh et al., "A 2.8 Gb/s all-digital CDR with a 10 b monotonic DCO," *ISSCC Dig. Tech. Papers*, pp. 222–223, 2007.
- [4] R. C. Walker et al., "A two-chip 1.5-GBd serial link interface," *IEEE J. Solid-State Circuits*, vol. 27, no. 12, pp. 1805–1811, Dec. 1992.
- [5] J. Proesel et al., "25Gb/s 3.6pJ/b and 15Gb/s 1.37pJ/b VCSEL-based optical links in 90nm CMOS," *ISSCC Dig. Tech. Papers*, pp. 418–420, 2012.
- [6] J. Kim et al., "A Fully Integrated 0.13-um CMOS 40-Gb/s Serial Link Transceiver," *IEEE J. Solid-State Circuits*, vol. 44, no. 5, pp. 1510–1521, 2009.
- [7] B. Razavi, "A study of phase noise in CMOS oscillators," *IEEE J. Solid-State Circuits*, vol. 31, pp. 331–343, Mar. 1996.
- [8] A. Rofougaran et al., "A 900 MHz CMOS LC-oscillator with quadrature outputs," *ISSCC Dig. Tech. Papers*, pp. 392–393, 1996.
- [9] P. Andreani, "A 2 GHz, 17% tuning range quadrature CMOS VCO with high figure-of-merit and 0.6° phase error," in *Proc. ESSCIRC*, Sept. 2002.
- [10] N. Da Dalt et al., "A 10b 10GHz Digitally Controlled LC Oscillator in 65nm CMOS," *ISSCC Dig. Tech. Papers*, pp. 329–330, 2006.
- [11] J. Y. Jiang et al., "100Gb/s ethernet chipsets in 65nm CMOS technology," *ISSCC Dig. Tech. Papers*, pp. 120–122, 2013.
- [12] T. C. Huang et al., "A 28Gb/s 1pJ/b shared-inductor optical receiver with 56% chip-area reduction in 28nm CMOS," *ISSCC Dig. Tech. Papers*, pp. 143–145, 2014.