# A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in 0.13 $\mu$ m CMOS

Zhiheng Cao, Shouli Yan, Member, IEEE, and Yunchu Li, Member, IEEE

Abstract—A 1.25 GS/s 6b ADC is implemented in a 0.13  $\mu$ m digital CMOS process by time-interleaving two SAR ADCs with 2.5 GHz internal clock frequency that converts 6 bits in 3 cycles. 5.5b ENOB at 1.25 GS/s and 5.8b ENOB at 1 GS/s are achieved without any off-line calibration, error correction or post processing. The entire ADC consumes 32 mW at 1.25 GS/s including T/H and reference buffers, and occupies 0.09 mm<sup>2</sup>.

Index Terms—Analog-to-digital conversion, CMOS analog integrated circuits, low-power electronics, switched-capacitor circuits.

# I. INTRODUCTION

THERE has been a continued trend in telecommunication standards moving from narrow band to wide band, and from lower to higher carrier frequencies. While AM radio used only 10 kHz for each channel with a carrier frequency of several hundred kHz, UWB (ultra-wideband) would use 500 MHz per channel with a carrier frequency of  $3 \sim 10$  GHz, and 24 GHz or 60 GHz in the near future [1]. As bandwidth widens, more thermal noise falls in-band, and since transmission power is limited, the SNR inevitably decreases. The fact that the phase noise of local oscillators (LO) generally degrades as carrier frequency increases is another reason why the SNR of the transmitted or received signal continues to decrease for newer standards.

While high resolution ADCs are needed in conventional narrowband wireless standards to move more baseband filtering functions from the analog to the digital domain, it is predicted that the demand for high-speed, low-resolution ADCs will increase in the future. The flash architecture has been dominantly used for these ADCs. However, due to the requirement of many parallel comparators and preceding pre-amplifiers to reduce offset and input capacitance, the lowest reported power consumption for a 6-bit, > 1 GS/s flash ADC that does not require off-line calibration is about 160 mW [2]. It should be noted that much lower power consumption can be achieved if off-line calibration is used to eliminate comparator mismatch [3], [4].

This paper presents a new type of ADC that takes advantage of the high speed digital logic and highly matched small capacitors [5] in standard nanometer digital CMOS processes to

Z. Cao is with Qualcomm, San Diego, CA 92121 USA (e-mail: zcao@qualcomm.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2008.2012329

achieve 1.25 GS/s, 6-bit performance with much lower power consumption and smaller die area than flash ADCs [6]. Unlike many previously published low-power high-speed ADCs such as [3], [7] and [8], this ADC achieves 6-bit accuracy without digital post-processing or off-line calibration, making it a drop-in replacement for existing ADCs in many applications.

The rest of this paper is organized as follows. We start with the architecture of the ADC and the motivations behind it in Section II, and move on to circuit and implementation details in Section III. Section IV explains the approaches used to characterize the test chip. Measurement results are presented in Section V and conclusions are drawn in Section VI.

# II. ARCHITECTURE

It is known that the energy per conversion of SAR ADCs is approximately linearly proportional to the resolution (when the number of bits is small and neither mismatch nor thermal noise is a concern), while that of flash ADCs is exponentially proportional [9]. This implies that SAR is more energy efficient than flash only when the number of bits is larger than a certain threshold, below which flash becomes more efficient ([9, Fig. 4(a)]). It is also well known that when the clock frequency is close to the upper limit of a given process, energy per operation increases dramatically, and a small reduction in speed can be traded for large power savings ([10, Fig. 5]). When these factors are considered, for a small number of bits it may be less efficient to use SAR with a very high internal clock frequency than to use flash.

If we view one SAR conversion process as a cascade, in time domain, of multiple SAR conversion processes with a smaller number of bits, we can see the possibility of increasing sampling frequency and/or power efficiency by replacing each sub-conversion process with the flash architecture.

Fig. 1 illustrates an example conversion process of the proposed architecture that combines SAR and flash. Suppose analog input "39" was sampled on all capacitors. In the next phase, three reference levels for the flash are generated by applying (001), (011) and (111) to the capacitors sized "16" (and (110), (100), (000) on the negative side). All the other capacitors are connected to 0 (and 1 on the negative side). As a result, the top 2 bits are converted by the 3 comparators. In the second conversion step, the previous comparator outputs are connected to each of the "16" capacitors, converting the middle 2 bits. This continues, and finally the 6-bit ADC output is obtained by simply using 3 full adders to convert each step's 3 comparator outputs to 2-bit wide binary format, which is a much simpler process than in

0018-9200/\$25.00 © 2009 IEEE

Manuscript received April 02, 2008; revised August 30, 2008. Current version published February 25, 2009. This work was supported in part by SigmaTel and Silicon Labs.

S. Yan is with the Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712 USA (e-mail: slyan@ece.utexas.edu).

Y. Li is with Analog Devices, Wilmington, MA 01887 USA.



Fig. 1. An example conversion of analog input "39" by combined flash and SAR.

a traditional 6-bit flash ADC where 63 comparators' outputs need to be converted into binary format.
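The conversion of Fig. 1 can be mimicked with a short behavioral model. The following is a minimal sketch, not the circuit itself: it assumes ideal comparators, an input expressed in LSB units, and a comparator that outputs 1 when the input is at or above its threshold. The function name `convert_2b_per_step` is hypothetical.

```python
def convert_2b_per_step(vin_lsb, n_steps=3):
    """Behavioral 2b/step conversion (assumed ideal comparators).

    Returns the output code and the 3-bit thermometer output of each step.
    """
    code = 0
    thermo_per_step = []
    for step in range(n_steps):
        weight = 4 ** (n_steps - 1 - step)  # 16, 4, 1 for a 6-bit converter
        # Three reference levels placed above the code resolved so far.
        thresholds = [code + k * weight for k in (1, 2, 3)]
        thermo = [1 if vin_lsb >= t else 0 for t in thresholds]
        thermo_per_step.append(thermo)
        # Counting the ones is what a full adder does: 3 bits in, 2 bits out.
        code += sum(thermo) * weight
    return code, thermo_per_step


code, thermo = convert_2b_per_step(39)
print(code, thermo)
```

For the input "39" the three steps resolve the 2-bit groups 10, 01 and 11, i.e. binary 100111.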

Implemented in UMC digital 0.13  $\mu$ m CMOS, each SAR ADC is clocked at 2.5 GHz, and two bits are determined in each cycle. Including the sampling/autozeroing cycle, four cycles are required to convert 6 bits. Two SAR ADCs are time-interleaved to sample at 1.25 GHz.

Fig. 2 shows the block diagram of the proposed architecture. It achieves 6-bit resolution and 1.25 GS/s with only 6 comparators, whereas 63 comparators are required in the flash architecture. Even though the clock frequency of each comparator is doubled from 1.25 GHz to 2.5 GHz, the  $> 10 \times$  reduction in the number of comparators and preceding preamps, which consume static power, results in a large decrease in power consumption.
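The rate and comparator-count figures quoted above follow from simple arithmetic. The sketch below only restates that arithmetic; the function names are hypothetical and nothing here models the circuit.

```python
def sample_rate_hz(f_clk_hz, conversion_cycles, sampling_cycles, n_interleaved):
    """Aggregate sampling rate of n_interleaved SAR ADCs sharing one clock."""
    cycles_per_sample = conversion_cycles + sampling_cycles
    return n_interleaved * f_clk_hz / cycles_per_sample


def flash_comparators(bits):
    """Comparators in a full flash ADC of the given resolution."""
    return 2 ** bits - 1


def proposed_comparators(bits_per_step, n_interleaved):
    """Comparators when each interleaved SAR ADC uses one small flash per step."""
    return n_interleaved * (2 ** bits_per_step - 1)


fs = sample_rate_hz(2.5e9, conversion_cycles=3, sampling_cycles=1, n_interleaved=2)
print(fs, flash_comparators(6), proposed_comparators(2, 2))
```

With the numbers from the text this gives 1.25 GS/s, 63 comparators for flash and 6 for the proposed architecture, a ratio of 10.5.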

 $2 \times$  time-interleaving is used to double the sampling frequency. In theory more ADCs can be time-interleaved to obtain even higher sampling frequency [7]. In practice, offset, gain



Fig. 2. Overall block diagram of the proposed 1.25 GS/s 6-bit SAR ADC.

and sampling time mismatch create more tones and images of input signals, not to mention the increased area and input load. The tones caused by offset mismatch among the channels are the largest error source, limiting the SNDR to 27 dB in [7]. These tones can be nulled by digitally adding offset until each channel's output has equal averaged value. This, however, also removes any input signal components at these tone frequencies. But for  $2\times$  time-interleaving, offset mismatch creates a tone at fs/2, which is tolerable since no signal exists near this frequency after anti-aliasing filter. Gain and offset mismatch can be relatively easily calibrated out using digital techniques proposed in [11]. Sampling time mismatch is much more difficult to correct using digital techniques. Therefore, a single track and hold (T/H) is used so that the sampling instant is determined by a single 1.25 GHz clock.
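The location of the offset-mismatch tones can be checked numerically. The sketch below is an illustration under simplifying assumptions (zero input signal, static per-channel offsets, a 64-point DFT written with the standard library only); the offset values and the function name `offset_tone_bins` are hypothetical.

```python
import cmath


def offset_tone_bins(offsets, n_points=64, tol=1e-9):
    """Return the non-DC DFT bins excited by static channel offsets.

    offsets[i] is the offset of interleaved channel i; bin n_points/2 is fs/2.
    """
    m = len(offsets)
    x = [offsets[n % m] for n in range(n_points)]
    bins = []
    for k in range(1, n_points // 2 + 1):
        xk = sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
                 for n in range(n_points))
        if abs(xk) / n_points > tol:
            bins.append(k)
    return bins


print(offset_tone_bins([0.5, -0.5]))             # two channels
print(offset_tone_bins([0.3, -0.1, 0.2, -0.4]))  # four channels
```

With two channels the only tone lies at bin 32 of 64, i.e. fs/2, whereas four channels add a tone at fs/4 that falls inside the signal band.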

The prototype chip integrates a clock-multiplier PLL to generate the 2.5 GHz internal clock and the 1.25 GHz sampling clock. This PLL is a different research project [12] and consumes around 23 mW. Much lower power consumption is expected if the only requirement is to generate 2.5 GHz from a supplied 1.25 GHz clock.

# III. ENABLING CIRCUITS

The above architecture is not likely to actually achieve 1.25 GS/s with lower power consumption than flash without several circuit level innovations described in this section.

#### A. Fast Settling Capacitor-Network

Rather than separating the 2-bit quantizer from the capacitor network, this design uses three capacitor networks (C-nets), each of which is connected to a comparator (Fig. 2). This avoids the use of a resistor ladder, which draws considerable static power. Each reference level is generated by capacitive interpolation from two analog reference voltages  $+V_{ref}$  and  $-V_{ref}$  (Fig. 3).

Both in the flash and SAR architecture, to achieve high sampling speed the reference must provide low output impedance. In this design,  $+V_{ref}$  and  $-V_{ref}$  are generated by regulated



Fig. 3. Capacitor network.

source followers (Fig. 4). The output impedance of this buffer can be calculated as

$$Z_{out} \approx \frac{sC_{gs3} + g_{ds2} + g_{ds1}}{sC_{gs3}(g_{m2} + g_{ds2} + g_{ds1}) + g_{m3}(g_{m2} + g_{ds2} + g_{ds1})}. \tag{1}$$

We can see at DC this is approximately

$$Z_0 \approx \frac{1}{g_{m3}\left(\frac{g_{m2}}{g_{ds2}+g_{ds1}}+1\right)} \approx \frac{1}{I_{bias} \times 200}, \tag{2}$$

assuming a reasonable  $V_{dsat}$  of 150 mV and  $g_{m2}/(g_{ds2} + g_{ds1})$  of about 15. This output impedance is maintained up to a zero frequency

$$\omega_z \approx \frac{g_{ds2} + g_{ds1}}{C_{gs3}} \tag{3}$$

above which it increases at 20 dB/dec until it reaches  $1/g_{m2}$  at a pole frequency

$$\omega_p \approx \frac{g_{m3}}{C_{gs3}}.\tag{4}$$
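As a consistency check, and taking the expression in (1) as given, the denominator of (1) can be factored so that the zero and the pole are visible directly:

$$Z_{out} \approx \frac{sC_{gs3} + (g_{ds2} + g_{ds1})}{(sC_{gs3} + g_{m3})(g_{m2} + g_{ds2} + g_{ds1})}$$

Setting $s = 0$ gives $(g_{ds2} + g_{ds1}) / \big(g_{m3}(g_{m2} + g_{ds2} + g_{ds1})\big)$, which is the DC value in (2). The numerator vanishes at $s = -(g_{ds2} + g_{ds1})/C_{gs3}$ and the denominator at $s = -g_{m3}/C_{gs3}$, matching (3) and (4). For $s \to \infty$ the impedance tends to $1/(g_{m2} + g_{ds2} + g_{ds1}) \approx 1/g_{m2}$ when $g_{m2} \gg g_{ds2} + g_{ds1}$.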

In this design, each reference buffer draws 2 mA, so the output impedance at DC is about 2.5  $\Omega$ . A similar impedance is maintained at frequencies above  $\omega_z$  by large NMOS gate capacitors (with gate area several hundred times that of M3) connected from the output to ground.
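The DC impedance figure can be reproduced from (2). The sketch below assumes the square-law relation $g_{m3} = 2I_{bias}/V_{dsat}$, which the text implies but does not state, and uses the hypothetical function names `buffer_z0` and `buffer_z0_rule_of_thumb`.

```python
def buffer_z0(i_bias_a, v_dsat_v=0.15, gm2_over_gds=15.0):
    """DC output impedance from (2), assuming gm3 = 2 * I_bias / V_dsat."""
    gm3 = 2.0 * i_bias_a / v_dsat_v
    return 1.0 / (gm3 * (gm2_over_gds + 1.0))


def buffer_z0_rule_of_thumb(i_bias_a):
    """The rounded form used in the text: Z0 ~ 1 / (I_bias * 200)."""
    return 1.0 / (i_bias_a * 200.0)


print(buffer_z0(2e-3), buffer_z0_rule_of_thumb(2e-3))
```

Under these assumptions a 2 mA bias gives roughly 2.3 Ω from the unrounded expression and 2.5 Ω from the rounded one.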

It should be noted that if reference resistor ladders, which are used in conventional flash ADCs, were used to generate this low impedance for each reference level, then the bias current would be  $VDD/10 \ \Omega = 120 \text{ mA}$  for the worst case settling at mid-supply, approximately 60 times larger. To provide power supply rejection for the resistor ladder, an additional supply regulator or buffer would also be needed.
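The 10 Ω and 60× figures can be reproduced with a simple model. This sketch assumes a uniform ladder driven from ideal sources at both ends, so that the mid tap sees the two halves in parallel (one quarter of the total resistance); the function name `ladder_bias_current` is hypothetical.

```python
def ladder_bias_current(vdd_v, z_target_ohm):
    """Static ladder current needed for a given worst-case (mid-tap) impedance.

    Assumes the mid tap sees two R/2 halves in parallel, i.e. R_total / 4.
    """
    r_total = 4.0 * z_target_ohm
    return vdd_v / r_total


i_ladder = ladder_bias_current(1.2, 2.5)   # 1.2 V supply, 2.5 ohm target
i_buffer = 2e-3                            # reference buffer current from the text
print(i_ladder, i_ladder / i_buffer)
```

With a 1.2 V supply and a 2.5 Ω target this gives 120 mA, about 60 times the 2 mA drawn by one reference buffer.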

The use of this type of reference buffering is only possible because there are only two analog voltage levels to generate, since these NMOS gate capacitors consume a lot of silicon area. Essentially, the necessary 63 reference levels are generated by capacitor interpolation in this ADC rather than by a resistor ladder in conventional flash ADCs.



Fig. 4. Regulated source followers that generate  $+V_{ref}$  and  $-V_{ref}$ .

Another benefit of this design choice is that closed switches have high overdrive voltages during the comparison cycles, since  $\pm V_{ref}$  are set close to the supply rails. This enables much faster settling than an OTA-based switched-capacitor circuit.

The disadvantage of using three C-nets is the increased number of capacitors. Fortunately, as processes scale, the accuracy with which interconnect metal patterns can be defined improves, making it possible to realize highly matched, very small value capacitors using interconnect metals [5]. The 6-bit capacitor network uses  $\sim 5$  fF unit capacitors realized by a metal-4/5/6 sandwich with a 5  $\mu$ m  $\times$  5  $\mu$ m top plate (Fig. 3). The top plate is almost completely encaged by the bottom plates to minimize parasitic coupling. A single bridging capacitor has been used, resulting in an LSB capacitance of only 1.25 fF. The small but well-matched capacitance increases settling speed and reduces power consumption. To avoid unwanted coupling to the C-net top plate, the bottom plate of the bridging capacitor is shielded from signal wiring and from the bottom plates of the other capacitors. The parasitic bottom plate capacitance of the bridging capacitor, as long as it is connected to a clean voltage, will not affect the bit weight ratio, because it only results in proportional signal attenuation. The total capacitance driven by the front-end T/H is only 240 fF on each side<sup>1</sup>. The three capacitor networks occupy 100  $\mu$ m  $\times$  70  $\mu$ m which is about 52% of the layout area.

 $^{1}\mathrm{In}$  reality, the switches contribute a significant portion of the capacitance loading the T/H.



Fig. 5. Block and timing diagrams of conventional SA control logic.



Fig. 6. Block and timing diagrams of the proposed SA control logic.

#### B. Flip-Flop Bypass SAR Logic

To achieve the 400 ps cycle time, the delay of the SAR logic must be reduced so that as much time as possible can be allocated for C-net and quantizer preamp settling.

Fig. 5 shows block and timing diagrams of a conventional successive approximation control logic, which consists of a shift register and an array of DFFs forming the successive approximation register (SAR). The comparator starts regeneration at the falling edge of the clock, and the SAR latches the comparator output at the rising edge of the clock. It is assumed that in the worst case (i.e., when the input differential signal is very small), the comparator needs half a clock cycle to regenerate. This wastes a large part of the cycle in all other cases, when the comparator can regenerate more quickly.

Fig. 6 shows block and timing diagrams of the proposed "flip-flop bypass" control logic. The comparators in the quantizer start regeneration at the rising edge of  $\phi_{LOGIC}$ , which is the latch clock to the comparators as shown in Fig. 2. The 3-bit thermometer coded output of the quantizer is passed directly to the C-net through a multiplexer, without a D-FF in between. The sooner the comparator regenerates, the sooner the output is applied to the C-net and the settling of the next bit cycle starts. It is only at the falling edge of  $\phi_{LOGIC}$  that the comparator output is written to a D-flip-flop to be used by later SA cycles. In favorable cases, settling of the next bit cycle has already begun by this time. The C-net as shown in Fig. 3 enables the direct connection of

TABLE I MULTIPLEXER OPERATION

|               | State 4 and 1     | State 2               | State 3               |
|---------------|-------------------|-----------------------|-----------------------|
| $a1 \sim a3$  | (001)/(011)/(111) | Pass quantizer output | Pass D-FF output      |
| $b1 \sim b3$  | (000)             | (001)/(011)/(111)     | Pass quantizer output |
| $c1 \sim c3$  | (000)             | (000)                 | (001)/(011)/(111)     |

the quantizer output. Nine bits  $(a1 \sim a3, b1 \sim b3 \text{ and } c1 \sim c3)$  control the effective feedback voltage.
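The benefit of bypassing the flip-flop can be expressed as a simple timing budget. The sketch below is a deliberately simplified model, not a description of the actual circuit timing: it assumes the conventional logic always reserves half a cycle for regeneration, and the regeneration and logic delays used in the example are hypothetical values.

```python
def settling_time_conventional(t_cycle_ps, t_logic_ps):
    """Time left for C-net settling when half a cycle is reserved for regeneration."""
    return t_cycle_ps / 2 - t_logic_ps


def settling_time_bypass(t_cycle_ps, t_regen_ps, t_logic_ps):
    """Time left for C-net settling when settling starts right after regeneration."""
    return t_cycle_ps - t_regen_ps - t_logic_ps


T_CYCLE = 400   # ps, from the 2.5 GHz internal clock
T_LOGIC = 100   # ps, hypothetical logic delay
print(settling_time_conventional(T_CYCLE, T_LOGIC))        # fixed budget
print(settling_time_bypass(T_CYCLE, 80, T_LOGIC))          # fast regeneration (hypothetical 80 ps)
print(settling_time_bypass(T_CYCLE, T_CYCLE / 2, T_LOGIC)) # worst-case regeneration
```

In this model the two schemes coincide only when regeneration takes the full half cycle; for any faster decision the bypass scheme leaves more time for settling.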

All the additional logic such as binary encoding is placed outside of the digital feedback loop, where pipelining can be used to increase clock frequency. The critical path logic is shortened to a multiplexer, a NAND gate that drives the C-net switches, and buffers to achieve optimal fan-out. By contrast, previously proposed "self-timed SAR logic" schemes such as [13] and [8] are similar in principle but contain completion detection logic that generates a "DONE" signal which clocks a flip-flop. Table I summarizes the operation of the multiplexer.
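Table I can be read as a small lookup. The following sketch encodes the table as a function; the names `mux_outputs` and `REFERENCE_PATTERNS`, the zero-based C-net index, and the assignment of one reference pattern per C-net in the listed order are illustrative assumptions rather than details taken from the design.

```python
# One reference pattern per C-net, in the order listed in Table I (assumed).
REFERENCE_PATTERNS = [(0, 0, 1), (0, 1, 1), (1, 1, 1)]
ZERO = (0, 0, 0)


def mux_outputs(state, cnet, quantizer_out, dff_out):
    """Return the (a, b, c) control groups for one C-net according to Table I."""
    ref = REFERENCE_PATTERNS[cnet]
    if state in (4, 1):
        return ref, ZERO, ZERO
    if state == 2:
        # Top 2 bits come straight from the quantizer, bypassing the D-FF.
        return quantizer_out, ref, ZERO
    if state == 3:
        # Top 2 bits are now held in the D-FF; middle 2 bits come from the quantizer.
        return dff_out, quantizer_out, ref
    raise ValueError("state must be 1, 2, 3 or 4")


print(mux_outputs(2, 0, quantizer_out=(0, 1, 1), dff_out=ZERO))
```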

Fig. 7 shows the simulated waveforms at one of the capacitor arrays' top plates (input to one of the preamps and quantizer), the clock that latches the comparator (master\_clkb), the comparator outputs, the comparator slave latch's output (input to SAR logic) and the SAR logic output.

The SAR logic is 100% static CMOS and uses low-Vt transistors for critical paths. The measured digital supply current (SAR logic + clock buffers) is 8 mA at 2.5 GHz internal frequency.



Fig. 7. Simulated internal waveforms of the SAR converter.



Fig. 8. Schematic diagram of proposed 2-bit quantizer.

#### C. Digital Background Offset Calibration

Unlike in a pipeline ADC, there is no residue amplification, and range overlapping (redundancy) is not used for the sake of simplicity in this design, so the 2-bit flash ADC must be 6-bit linear. Because a reasonably sized dynamic latching comparator would have several tens of mV (up to 10 LSB of the 6-bit ADC) of offset, preamplifiers must precede the comparators such that the error is small enough when referred to the quantizer input, or offset reduction methods such as calibration or averaging must be used. Since each preamp must have very high bandwidth to obtain enough gain in a limited amount of settling time, its DC gain must be low  $(2 \sim 4)$  [14].<sup>2</sup> In conventional 6-bit flash ADCs, a cascade of many low gain preamps is used to provide a gain of more than 30 [2]. However, many preamp stages not only increase power and die area, but also increase delay, which is detrimental to the SAR scheme.

This problem is solved by reducing the number of preamp stages to only one and using offset self-calibration through digital feedback to compensate for the lack of preamp gain. Comparator offset calibration has become a popular technique in recent years for flash ADCs to reduce power consumption [3]. For example, if a 3-bit DAC is used to reduce the maximum possible offset by 8 times, then the comparator size can be shrunk to 1/64, leading to a  $64 \times$  power reduction.
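The 1/64 figure follows if offset scales inversely with the square root of device area. The sketch below makes that assumption explicit (a Pelgrom-type mismatch model, which the text does not name) and further assumes that power scales in proportion to area; the function name is hypothetical.

```python
def comparator_area_scale(dac_bits):
    """Relative comparator area allowed when a calibration DAC absorbs the offset.

    Assumes offset is proportional to 1 / sqrt(area), so tolerating an offset
    that is 2**dac_bits times larger allows the area to shrink by its square.
    """
    offset_relaxation = 2 ** dac_bits
    return 1 / offset_relaxation ** 2


print(comparator_area_scale(3))  # fraction of the original area
```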

However, the flash ADC needs to be put off-line during calibration, and offset that changes during operation cannot be corrected. Unlike flash, where the comparator is always working during operation, the SAR conversion scheme allows the comparator output during the auto-zeroing phase, when the C-net is sampling the input and the top plates are shorted, to be used to measure the comparator offset, which is then compensated during the next auto-zeroing phase (Fig. 8) while the ADC runs at full speed.

Since the positive and negative inputs are shorted in this phase, the offset of the preamp and the comparator will decide the digital output. The digital output is filtered by an FIR filter and applied to a passive switched-capacitor integrator. The resulting voltage controls an auxiliary amplifier to generate an opposite offset until the comparator outputs equal numbers of -1 and 1 in the auto-zeroing phase. This scheme is based on [15], where a similar scheme has been used for a two-step subranging ADC, in which two ADCs are time-interleaved so that each processes the input signal only half of the time, while the outputs of the

<sup>&</sup>lt;sup>2</sup>Theoretically an integrator gives highest possible gain for a given current consumption and settling time, but it requires a pre-charge phase to reset the load capacitor to 0 V, which is not compatible with the proposed flip-flop by-pass SAR architecture.



Fig. 9. Equivalent block diagram of proposed 2-bit quantizer.

other half are used to calibrate offsets. This results in  $2 \times$  area and power penalty whereas in the proposed SAR ADC the calibration is enabled without additional cost.

Due to hysteresis and other non-idealities of the comparator, in steady state the comparator output could fluctuate unnecessarily even though its average is zero (e.g., 1, -1, 1, -1, ...). Although the switched-capacitor integrator can filter out some of the fluctuations, further filtering is achieved by a simple two-tap FIR filter without affecting the stability of the loop. A block diagram of the loop is shown in Fig. 9. Cs/Cp is sized such that the compensation step size is about 0.02 LSB.

Due to finite hysteresis in the comparators, which makes their offset dependent on the previous decision, settling of the calibration loop is not perfect. Transient simulation shows that it is only when the comparator differential input reaches about 2.5 mV (1 mV referred to the ADC input) that the comparator makes a decision to move the correction in the opposite direction. However, since the LSB is about 10 mV, this error only amounts to 0.1 LSB.
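The behavior of the background loop can be illustrated with a discrete-time model. This is a sketch under stated assumptions, not the implemented circuit: the comparator is ideal (no hysteresis or noise), the two-tap FIR filter is assumed to be a moving average of the last two decisions, and the integrator is modeled as an ideal accumulator with a 0.02 LSB step. The function name and the 5 LSB starting offset are hypothetical.

```python
def simulate_offset_calibration(offset_lsb, n_cycles=2000, step_lsb=0.02):
    """Return the residual offset (in LSB) after each auto-zeroing phase."""
    correction = 0.0
    prev_decision = 0
    history = []
    for _ in range(n_cycles):
        residual = offset_lsb + correction
        decision = 1 if residual > 0 else -1
        # Assumed two-tap moving average: a 1, -1, 1, -1 pattern averages to zero.
        filtered = (decision + prev_decision) / 2
        prev_decision = decision
        # Ideal accumulator standing in for the switched-capacitor integrator.
        correction -= step_lsb * filtered
        history.append(offset_lsb + correction)
    return history


residuals = simulate_offset_calibration(5.0)
print(max(abs(r) for r in residuals[-100:]))
```

In this idealized model the residual offset settles into a small limit cycle bounded by the step size; the hysteresis effect described above is not modeled.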

#### D. High-Speed Low-Hysteresis Comparator

To enable the proposed architecture, the comparator must have high speed, low latency and very little hysteresis, i.e., its offset must be mostly static and not affected by the previous decision. Although the clock period at 2.5 GHz is 400 ps, the latency of the comparator must be far less than 400 ps to allow time for settling of the digital logic block and the C-net.

Fig. 10 shows the schematic diagram of the comparator, which consists of a master dynamic comparator and slave static CMOS latch (cross coupled NAND). To reduce hysteresis, extra reset switches have been added to ensure every node in the master dynamic comparator is completely precharged to VDD before the latch signal goes high.

Because of the way the flip-flop bypass SA logic works, the comparator must hold the decision until the next latch edge. Therefore a slave latch, typically realized by cross-coupled NAND gates for this type of comparator, must be used.

The outputs of the master dynamic comparator that go directly to the slave latch are connected to the bottom NMOS rather than the top NMOS, which is exposed to the NAND gate outputs. This is because, depending on whether the NAND gate output is high or low (i.e., on the previous comparator decision), the MOS channel of the top NMOS is either not completely formed (with drain = VDD and source = VDD - Vth) or formed (with drain/source = VSS), and hence presents a different load capacitance to the master dynamic comparator, resulting in hysteresis. On the other hand, if they are connected to the bottom NMOS, the channel is



Fig. 10. Schematic diagram of the 2.5 GHz low hysteresis comparator.

always formed, and the PMOS of the NAND gate is completely off. Therefore, there is much less data dependent load capacitance.

This kind of consideration was not necessary in conventional flash ADCs, since the comparator offset is overcome by brute-force signal preamplification. It may even be preferable in terms of regeneration speed to use the top NMOS rather than the bottom one as the slave latch input in such a case.

It may be argued that inverter buffer stages can be inserted between the comparator and slave latch to get rid of hysteresis. However, with a single inverter, either cross-coupled NOR gates or a PMOS input comparator must be used instead, increasing propagation delay and power consumption. If two inverters are used, to maintain optimal fan-out, the cross-coupled NAND gates must be much larger and have to drive a much larger load which would otherwise be driven much more efficiently by inverter buffers.

The comparator uses digital VDD/VSS. This is because the (single-ended) latch signal comes from the clock buffers and hence the charge current comes from digital VDD/VSS. If the comparator were connected to analog VDD/VSS, then every time the latch signal goes up or down, current must flow from digital VDD to analog VSS or from digital VSS to analog VDD. This means high frequency current must pass through either the package bond wires or the deep-n-well isolation, which consists of reverse-biased p-n junctions. The resulting voltage bounce on the AVSS line, although periodic and signal independent, may slow down the comparator since the package inductance and PCB trace will



Fig. 11. Photograph of the entire die (including PLL and SRAM) with the ADC portion enlarged.

come between the clock driver and the latch transistors (the AVSS will bounce up when the "latch" signal goes up, hence slowing down the turn-on of the tail NMOS switch).

On the other hand, this problem is much less severe if the comparator uses digital VDD/VSS. This time, the comparator receives signal Vin+ and Vin- from the preamp which uses analog VDD/VSS. Since they are differential, only common-mode current goes between analog and digital supply.

The disadvantage of using digital VDD/VSS for the comparator is that there is substantial signal-dependent disturbance on digital VDD/VSS unless a large amount of decoupling capacitance is used for the digital supplies. This signal-dependent disturbance results in a time-varying offset in the comparator, which cannot be compensated by the background offset self-calibration loop. As can be seen in the die photo (Fig. 11), the digital VDD decoupling capacitor (realized by an NMOS inversion mode gate capacitor) occupies almost twice as much area as the digital circuit block does.

#### E. Floor Plan and Layout Considerations

Fig. 11 shows die photographs of the chip. To achieve the 400 ps cycle time it is essential to minimize layout parasitics, which requires the layout to be as compact as possible. Signals with the highest frequency components, in this case the clock signals, should travel as short a distance as possible to reduce loading and power consumption. Therefore, the clock buffer is placed at the center of the two copies of the time-interleaved ADC, such that the switches of both ADCs can be placed immediately next to it. Following the switches are the capacitor networks, beyond which lie the 3 preamps and comparators. The SAR logic runs from the comparators back to the switches, such that the physical lengths of the digital signal paths are minimized.

The ADC is placed at the upper center of the die, such that the signal and VDD/VSS bond wires can be as short as possible. The ADC's output travels down the die to the SRAM and the two 8-PAM transmitters. Inverter buffers are inserted every  $\sim$



Fig. 12. Schematic diagram of the 8-PAM transmitter.



Fig. 13. Snippet of captured 8-PAM transmitter output by TDS694C (1.25 GS/s, 20 MHz input signal).

 $300 \,\mu\text{m}$  to improve signal integrity. For every long single-ended signal wiring, metal-7 (DVSS) has been laid above the wiring to provide return path and reduce electromagnetic emission.

The PLL is placed at the upper-right corner of the die. The 2.5 GHz clock is routed from the PLL to ADC using differential metal-6 wiring with metal-1 (DVSS) shield. The shield increases isolation from substrate while adding little parasitic capacitance. To keep high edge-slope, the spacing between the plus and minus wiring must be large to reduce parasitic capacitance. However, parasitic inductance increases due to larger area enclosed by the current loop. To prevent magnetic interference, the wirings are twisted every 100  $\mu$ m, such that the magnetically induced emf is reversed. This not only cancels mutual inductance hence injecting and receiving much less noise to/from the environment, but also reduces slightly the self inductance that leads to peaking.



Fig. 14. Simplified block diagram of the test setup.

# IV. TESTING ISSUES

#### A. Capturing ADC Output Data

In real applications these high speed ADCs are almost always integrated with a DSP, so there is no need to transmit the output data stream off-chip. In this research, however, the ADC needs to be tested by itself. Capturing 1.25 GS/s 6-bit data directly poses a problem since very expensive logic analyzers would be needed. Moreover, assuming a standard LVDS output for each bit, 12 pads would be needed. These pads could otherwise be used to reduce package parasitics for important pins such as VDD/VSS.

The proposed scheme uses two 8-level pulse amplitude modulation (PAM) transmitters which are simply two 3-bit DACs with 50  $\Omega$  output impedance. Fig. 12 shows simplified schematics of each transmitter. To save pads, the transmitter outputs automatically become high-impedance when an externally given bias current is shut down, such that it can share the same pad with other high impedance CMOS input pads for the SRAM, which is shut down when the 8-PAM transmitter is enabled and vice versa. A large PMOS switch is used to connect and disconnect the 50  $\Omega$  on-chip load resistor and VDD.

Each transmitter output is routed on the PCB with a 100  $\Omega$  differential controlled-impedance trace for about 3 inches before reaching SMA connectors. One side is terminated and the other side is connected to the oscilloscope. AC coupling capacitors are used to avoid extra DC current and to maximize signal swing. Fig. 13 shows a transmitter output waveform captured by a Tektronix TDS694C, a 10 GS/s, 3 GHz bandwidth digital oscilloscope. The input is a 20 MHz near full scale sine wave. The ADC is clocked at 1.25 GS/s so each symbol is oversampled by 8. The record depth of the TDS694C is 120 K, so approximately 15000 samples are captured each time at 1.25 GS/s.
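The capture arithmetic and the recovery of 6-bit codes from two 8-PAM streams can be sketched as follows. This is an illustration only: the text does not say how the six bits are split between the two transmitters, so the sketch assumes one carries the upper three bits and the other the lower three, and it assumes amplitudes already normalized to the range 0 to 1. All function names are hypothetical.

```python
def symbols_per_capture(record_depth, scope_rate_hz, adc_rate_hz):
    """Number of ADC samples contained in one oscilloscope record."""
    oversampling = scope_rate_hz / adc_rate_hz
    return int(record_depth // oversampling)


def slice_8pam(v_norm):
    """Map a normalized amplitude (0..1) to the nearest of 8 levels (0..7)."""
    level = round(v_norm * 7)
    return max(0, min(7, level))


def combine_symbols(upper_symbol, lower_symbol):
    """Rebuild a 6-bit code, assuming an upper/lower 3-bit split (an assumption)."""
    return (upper_symbol << 3) | lower_symbol


print(symbols_per_capture(120_000, 10e9, 1.25e9))
print(combine_symbols(slice_8pam(4 / 7), slice_8pam(1.0)))
```

With the oscilloscope settings quoted above, one record holds 15000 ADC samples.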

#### B. Test Setup

Fig. 14 shows a simplified block diagram of the test setup for the ADC.

The transformer (ADTL-18 from Mini-Circuits) has a bandwidth of 1.8 GHz. Above 2 GHz the attenuation becomes so large that the ADC no longer receives any clock. Therefore, although the on-chip PLL can be disabled to allow an externally supplied clock for < 1 GS/s operation, the multiply-by-8 PLL has been used for all measurements. The PC board includes three linear regulators to generate 1.2 V (used by the ADC and the digital part of the PLL), 1.5 V (used by the analog part of the PLL) and 3.3 V (used to bias ESD protection diodes in order to reduce capacitance, and by externally adjustable current sources). 0.22  $\mu$ H SMT inductors have been used to isolate analog and digital supplies that come from the same regulator. Multiple 0.1  $\mu$ F 0603 SMT capacitors are placed in parallel close to the DUT to reduce power supply noise.

# V. EXPERIMENTAL RESULTS

#### A. Summary of Results and Discussions

Fig. 15 shows the FFT of captured ADC output data at 60 MHz 0 dBFS input and 453 MHz -3 dBFS input.<sup>3</sup> The "full-scale" used in this section is based on the ADC digital output, i.e., 0 dBFS corresponds to a maximum amplitude of 63. At DC, 0 dBFS corresponds to about 1.2 Vpp differential (0.3 V to 0.9 V each side) at the ADC input. A larger input signal amplitude is required at higher signal frequencies due to attenuation at the T/H.

Thanks to the single front-end T/H that removes the sampling timing mismatch between the two ADCs and good matching

<sup>3</sup>These input amplitudes give the maximum SNDR.



Fig. 15. ADC output FFT (14800 pts) at 1.25 GS/s.

of the capacitor networks, tones at fs/2 and fs/2-fin are below -50 dBc and do not affect SNDR. Therefore, unlike many previously published experimental high-speed low-power ADCs [3], [7], [8], the proposed ADC does not require any external calibration procedures or digital correction, making it a practical, drop-in replacement for flash ADCs to reduce power by more than 4 times. The only requirement is a  $2\times$  clock, which is readily available in many applications where the flash ADC blocks are themselves time-interleaved.

Fig. 16 shows SNDR, SFDR, and SNR (calculated with the first 5 harmonics removed) versus input signal power (arbitrary reference) for 230 MHz and 1199 MHz input signals.

Fig. 17 summarizes the maximum SNDR and SFDR for each input signal frequency and for different sampling clock frequencies. As signal frequency increases, the signal amplitude that gives the maximum SNDR decreases due to the effect of nonlinear parasitic capacitances at both the input and output of the T/H circuit, which degrade linearity. Two or three measurements are taken for each point with different input signal amplitudes close to where the SNDR is expected to peak, and the SNDR/SFDR for the amplitude that gives the maximum SNDR are plotted.

Up to 48 dB SFDR and 36.4 dB SNDR for low frequency inputs demonstrate the level of capacitor matching and linearity



Fig. 16. Measured SNDR, SNR and SFDR versus input signal power.



Fig. 17. Measured SNDR and SFDR versus input signal frequency.

of the 2b quantizer with offset self-calibration.  $\geq$  32 dB SNDR is maintained up to 450 MHz input frequency.
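The SNDR values above can be related to the ENOB figures quoted in the abstract using the standard conversion ENOB = (SNDR - 1.76 dB) / 6.02 dB. The sketch below applies that formula; the function name `enob` is hypothetical.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR, using the standard sine-wave formula."""
    return (sndr_db - 1.76) / 6.02


for sndr in (36.4, 32.0):
    print(sndr, round(enob(sndr), 1))
```

A 36.4 dB SNDR corresponds to about 5.8 bits and 32 dB to about 5.0 bits.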

The small unit capacitance 5 fF of the capacitor network reduces power consumption and also improves linearity since smaller T/H can be used, presenting less nonlinear parasitic capacitance at the input. Nevertheless, due to the relatively large total capacitance (240 fF + bottom plate parasitics) the front-end T/H must drive, measured results (Fig. 17) show that linearity degrades when signal frequency  $f_{IN}$  is close to half the sampling frequency.

TABLE II. PERFORMANCE SUMMARY

| Parameter          | Value                                         |
|--------------------|-----------------------------------------------|
| Technology         | UMC 0.13 $\mu$m CMOS                          |
| Resolution         | 6 bits                                        |
| Input range        | 1.2 Vpp differential                          |
| Power supply       | 1.2 V analog, 1.2 V digital                   |
| Sampling frequency | 1.25 GHz                                      |
| Latency            | 2 clock cycles (1.6 ns)                       |
| Power consumption  | $\leq 32$ mW                                  |
| ENOB               | 5.8 @ 1 GS/s, $f_{IN} = 20$ MHz               |
|                    | 5.5 @ 1.25 GS/s, $f_{IN} = 20$ MHz            |
|                    | 5.0 @ 1.25 GS/s, $f_{IN} = 453$ MHz           |
| Input capacitance  | $\leq 100$ fF each side                       |
| Active area        | 0.09 mm<sup>2</sup> including decoupling caps |
| Die size           | 1525 $\mu$m $\times$ 1525 $\mu$m              |
| Package            | TQFP 64 pin, 10 mm $\times$ 10 mm             |

TABLE III. COMPARISON WITH OTHER PUBLISHED $\geq 1$ GS/s 6-BIT CMOS ADCs

| Year | Fs (MS/s) | ENOB                | ERBW (MHz) | Input capacitance | Power (mW) | Technology   | Ref.      |
|------|-----------|---------------------|------------|-------------------|------------|--------------|-----------|
| 2001 | 1100      | 5.3; 5.7 (900 MS/s) | 450        | N/A               | 300        | 0.35 $\mu$m  | [16]      |
| 2001 | 1300      | 5.0                 | 650        | > 1 pF            | 545        | 0.35 $\mu$m  | [14]      |
| 2002 | 1600      | 5.7                 | 450        | N/A               | 328        | 0.18 $\mu$m  | [17]      |
| 2003 | 2000      | 5.7                 | 600        | N/A               | 310        | 0.18 $\mu$m  | [18]      |
| 2005 | 1200      | 5.7                 | 700        | 0.4 pF            | 160        | 0.13 $\mu$m  | [2]       |
| 2006 | 1000      | 5.5                 | 500        | > 240 fF          | 55         | 90 nm        | [15]      |
| 2008 | 1250      | 5.5; 5.8 (1 GS/s)   | 450        | 0.1 pF            | 32         | 0.13 $\mu$m  | This work |

The measured SNDR and SFDR close to fs/2 are below 28 dB and 35 dB, respectively, which is not as good as some of the state-of-the-art 6-bit flash ADCs. Interestingly, as  $f_{IN}$  is increased further to 1199 MHz, the SNDR and SFDR worsen, but not as much as predicted from linear extrapolation of the SNDR and SFDR versus  $f_{IN}$  curves between 500 and 625 MHz (Figs. 16 and 17). This can be explained by noting that the difference between the current and previous sample is smaller at 1199 MHz than at 571 MHz, hence the charge the T/H must deliver is actually less. This also partially verifies that the large load on the T/H is causing the performance degradation near fs/2. Improvements can be made by using a larger T/H (which consumes more power and presents a larger ADC input capacitance), by using a CMOS process with higher switch transistor  $f_T$  such that smaller switches with less parasitics can be used, or by using a process with low-k dielectric such that an even smaller unit capacitance (Cu) can be used without reducing the physical size of the capacitor (hence maintaining matching).
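The sample-to-sample argument above can be checked numerically. For a sampled sine $x[n] = A\sin(2\pi f_{IN} n / f_s)$, the difference between consecutive samples is bounded by $2A\,|\sin(\pi f_{IN}/f_s)|$. The sketch below evaluates this bound for the two input frequencies mentioned in the text; it assumes a 1.25 GS/s sampling rate and unit amplitude, and the function name `max_sample_step` is hypothetical.

```python
import math


def max_sample_step(f_in_hz: float, f_s_hz: float, amplitude: float = 1.0) -> float:
    """Upper bound on |x[n] - x[n-1]| for x[n] = A*sin(2*pi*f_in*n/f_s)."""
    return 2.0 * amplitude * abs(math.sin(math.pi * f_in_hz / f_s_hz))


F_S_HZ = 1.25e9  # assumption: front-end sampling rate of 1.25 GS/s

for f_in_hz in (571e6, 1199e6):
    step = max_sample_step(f_in_hz, F_S_HZ)
    print(f"f_IN = {f_in_hz / 1e6:.0f} MHz: max step = {step:.3f} x amplitude")
```

Under these assumptions the bound is about 1.98 times the amplitude at 571 MHz but only about 0.26 times the amplitude at 1199 MHz, so the T/H has far less charge to deliver per sample in the second case.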

#### B. Performance Summary and Comparison

The two time-interleaved ADCs, reference buffers and T/H draw 18.5 mA from a 1.2 V AVDD. The clock buffers and the SAR logic draw 8 mA at 1.25 GS/s from a 1.2 V DVDD. The total power consumption is 32 mW. Table II provides a performance summary.
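The total follows directly from the supply currents quoted above. As a quick arithmetic check (a sketch only; the variable and function names are illustrative):

```python
def supply_power_mw(current_ma: float, voltage_v: float) -> float:
    """Power in mW drawn from one supply rail (mA times V gives mW)."""
    return current_ma * voltage_v


# Currents and voltages as quoted in the text.
analog_mw = supply_power_mw(18.5, 1.2)   # ADCs, reference buffers and T/H on AVDD
digital_mw = supply_power_mw(8.0, 1.2)   # clock buffers and SAR logic on DVDD
total_mw = analog_mw + digital_mw

print(f"analog {analog_mw:.1f} mW + digital {digital_mw:.1f} mW = {total_mw:.1f} mW")
```

The sum is about 31.8 mW, which rounds to the 32 mW quoted for the whole ADC.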

To give an objective evaluation of these results, Table III compares this work with other 6-bit high-speed A/D converters without off-line calibration, published in recent years, that are representative of the state of the art in CMOS. As technology advances, the sampling frequency increases and power tends to decrease. However, the lowest reported power for a traditional flash ADC is 160 mW. A two-step subranging ADC reports 55 mW power consumption, but this does not include the track-and-hold (ADC driver), and hence the ADC presents a larger input capacitance.
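The power comparison can be made concrete by dividing each power figure in Table III by the 32 mW of this work. The snippet below is only an illustrative calculation on the table's values; the dictionary layout is an assumption of this sketch, and the entry for [15] excludes the track-and-hold, as noted above.

```python
# Power figures (mW) from Table III, keyed by reference label.
published_power_mw = {
    "[16]": 300,
    "[14]": 545,
    "[17]": 328,
    "[18]": 310,
    "[2]": 160,
    "[15]": 55,   # does not include the track-and-hold
}
THIS_WORK_MW = 32

power_ratio = {ref: p / THIS_WORK_MW for ref, p in published_power_mw.items()}

for ref, ratio in sorted(power_ratio.items(), key=lambda item: item[1]):
    print(f"{ref}: {ratio:.1f}x the power of this work")
```

For the lowest-power traditional flash ADC [2], the ratio is 160/32 = 5, in line with the "more than 4 times" power reduction claimed for the proposed ADC as a flash replacement.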

#### VI. SUMMARY

A new type of high-speed 6-bit ADC has been proposed by combining successive approximation with the flash architecture. By taking advantage of the small-value, highly matched metal interconnect capacitors available in standard nanometer digital CMOS processes, a prototype ADC achieved 6-bit resolution without calibration at 1.25 GS/s with 32 mW total power consumption, the lowest reported to date. Dynamic performance comparable to that of other published flash ADCs has also been measured.

Low power consumption is the direct result of more than 10 times fewer comparators and much simpler thermometer-to-binary encoders than in conventional 6b flash architectures (3 full-adders versus 57 full-adders in a 6b Wallace-tree encoder).
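The flash-side numbers in this comparison can be reproduced with simple counting. A full-adder-only ones counter for $2^N - 1$ thermometer inputs needs $2^N - 1 - N$ full adders, because each full adder turns three bits into two; for $N = 6$ this gives the 57 quoted above. The sketch below also assumes, for illustration only, 3 comparators per 2b quantizer and two interleaved channels for the proposed ADC; that comparator count is an assumption of this example rather than a figure stated in this section, and the function names are hypothetical.

```python
def flash_comparators(bits: int) -> int:
    """A full flash ADC needs one comparator per decision level."""
    return 2**bits - 1


def ones_counter_full_adders(bits: int) -> int:
    """Full adders needed to count the ones among 2**bits - 1 inputs.

    Each full adder reduces the number of live bits by one (3 in, 2 out),
    and the final count has `bits` bits.
    """
    return (2**bits - 1) - bits


# Assumption for illustration: 3 comparators per 2b quantizer, 2 channels.
ASSUMED_SAR_COMPARATORS = 3 * 2

flash_6b = flash_comparators(6)
print(f"6b flash comparators: {flash_6b}")
print(f"6b Wallace-tree full adders: {ones_counter_full_adders(6)}")
print(f"comparator ratio: {flash_6b / ASSUMED_SAR_COMPARATORS:.1f}x")
```

Under these assumptions the comparator ratio is 63/6 = 10.5, consistent with "more than 10 times" fewer comparators. The figure of 3 full-adders for the proposed encoder is taken from the text and is not derived here.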

The SAR ADC uses a 2.5 GHz clock, which is the highest reported internal clock frequency for a CMOS SAR ADC to date. The high clock frequency is made possible by (i) the availability of small but highly matched metal-interconnect capacitors; (ii) low parasitic capacitances and on-resistance of the switches; (iii) reference buffers with very low DC and AC output impedances; (iv) a proposed digital offset calibration scheme that removes the need for multi-stage preamplifiers; (v) a proposed flip-flop bypass SAR logic; and (vi) the high-speed digital logic available in nanometer CMOS processes in general.
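To see why the 2.5 GHz internal clock matters, the per-channel timing budget can be worked out from the headline numbers: 6 bits converted in 3 cycles (2 bits per cycle) by each of two interleaved SAR ADCs at a 1.25 GS/s aggregate rate. The sketch below is simple arithmetic on those numbers; treating the leftover time as available for tracking and sampling is an assumption of this example, not a statement from the source.

```python
F_CLK_HZ = 2.5e9        # internal SAR clock
F_S_HZ = 1.25e9         # aggregate sampling rate
CHANNELS = 2            # time-interleaved SAR ADCs
CYCLES_PER_SAMPLE = 3   # 6 bits at 2 bits per cycle

t_clk_ns = 1e9 / F_CLK_HZ                    # one internal clock period
t_convert_ns = CYCLES_PER_SAMPLE * t_clk_ns  # time spent resolving bits
t_channel_ns = CHANNELS * 1e9 / F_S_HZ       # time each channel has per sample
# Assumption: whatever is left over is usable for tracking/sampling.
t_remaining_ns = t_channel_ns - t_convert_ns

print(f"clock period: {t_clk_ns:.2f} ns")
print(f"conversion:   {t_convert_ns:.2f} ns of {t_channel_ns:.2f} ns per channel")
print(f"remaining:    {t_remaining_ns:.2f} ns")
```

Each channel thus has 1.6 ns per sample, of which 1.2 ns is spent on the three conversion cycles, leaving one 0.4 ns clock period.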

The potential of passive switched-capacitor circuits for high frequency applications in advanced digital CMOS processes, where the quality of switches and capacitors only improves every time the process scales down, has been demonstrated in this paper.

#### ACKNOWLEDGMENT

The authors thank Europractice for their mini@sic program and Texas Instruments for help with measurements.

#### REFERENCES

- [1] D. Cabric *et al.*, "Future wireless systems: UWB, 60 GHz, and cognitive radios," in *Proc. IEEE Custom Integrated Circuits Conf.*, 2005.
- [2] C. Sandner, M. Clara, A. Santner, T. Hartig, and F. Kuttner, "A 6 bit, 1.2 GSps low-power flash-ADC in 0.13 μm digital CMOS," *IEEE J. Solid-State Circuits*, vol. 40, no. 7, pp. 1499–1505, Jul. 2005.
- [3] G. Van der Plas *et al.*, "A 0.16 pJ/conversion-step 2.5 mW 1.25 GS/s 4b ADC in a 90 nm digital CMOS process," in *IEEE ISSCC Dig. Tech. Papers*, 2006, p. 2310.
- [4] C.-Y. Chen, M. Le, and K. Y. Kim, "A low power 6-bit flash ADC with reference voltage and common-mode calibration," in *Symp. VLSI Circuits Dig.*, 2008, pp. 12–13.
- [5] A. Verma and B. Razavi, "Frequency-based measurement of mismatches between small capacitors," in *Proc. IEEE Custom Integrated Circuits Conf.*, 2006, pp. 481–484.
- [6] Z. Cao, S. Yan, and Y. Li, "A 32 mW 1.25 GS/s 6b 2b/step SAR ADC in 0.13 μm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, 2008.
- [7] D. Draxelmayr, "A 6b 600 MHz 10 mW ADC array in digital 90 nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, 2004.
- [8] S.-W. M. Chen and R. W. Brodersen, "A 6b 600 MS/s 5.3 mW asynchronous ADC in 0.13 μm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, 2006, pp. 574–575.
- [9] B. P. Ginsburg and A. P. Chandrakasan, "Dual time-interleaved successive approximation register ADCs for Ultra-Wideband receiver," *IEEE J. Solid-State Circuits*, vol. 42, no. 2, pp. 247–257, Feb. 2007.
- [10] M. Horowitz, E. Alon, D. Patil, S. Naffziger, R. Kumar, and K. Bernstein, "Scaling, power, and the future of CMOS," in *Int. Electron Devices Meeting (IEDM) Dig.*, 2005.
- [11] S. M. Jamal, D. Fu, P. J. Hurst, and S. H. Lewis, "A 10b 120 MSample/s time-interleaved analog-to-digital converter with digital background calibration," in *IEEE ISSCC Dig. Tech. Papers*, 2002.
- [12] Z. Cao, Y. Li, and S. Yan, "A 0.4 ps-rms-jitter 1-3 GHz ring-oscillator PLL using phase-noise preamplification," in *Symp. VLSI Circuits Dig.*, 2008, pp. 114–115.
- [13] B. P. Ginsburg and A. P. Chandrakasan, "Dual scalable 500 MS/s, 5b time-interleaved SAR ADCs for UWB applications," in *Proc. IEEE Custom Integrated Circuits Conf.*, 2005.
- [14] M. Choi and A. A. Abidi, "A 6b 1.3 Gsample/s A/D converter in 0.35 μm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, 2001, pp. 126–127.

- [16] G. Geelen, "A 6b 1.1 Gsample/s CMOS A/D converter," in *IEEE ISSCC Dig. Tech. Papers*, 2001, pp. 128–129.
- [17] P. Scholtens and M. Vertregt, "A 6b 1.6 GSample/s flash ADC in 0.18 μm CMOS using averaging termination," in *IEEE ISSCC Dig. Tech. Papers*, 2002, pp. 168–169.
- [18] X. Jiang, Z. Wang, and M. F. Chang, "A 2 GS/s 6b ADC in 0.18 μm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, 2003.



**Shouli Yan** (M'98) received the Ph.D. degree in electrical engineering from Texas A&M University, College Station, in 2002. He received the B.S. degree in electronic engineering and the M.S. degree in computer science and engineering, both from Shanghai Jiao Tong University, China, in 1992 and 1995, respectively.

He has been an Assistant Professor in the Department of Electrical and Computer Engineering, University of Texas at Austin, since 2003. During his graduate study from 1997 to 2002 at Texas A&M, he was a Graduate Research and Teaching Assistant at the Department of Electrical Engineering. In Fall 1999, he interned at the Analog and Mixed-Signal Division of Texas Instruments, Inc., Dallas, TX, working on low-power low-distortion audio power amplifier design. He was with Silicon Laboratories, Inc., Austin, in Summer 2003. His research interests include low-power and high-speed A/D and D/A converters, RF circuits, and high-performance analog and mixed-signal integrated circuits.



**Zhiheng Cao** was born in Chengdu, China, in 1981. He received the B.E. degree in electrical engineering from the University of Tokyo, Japan, in March 2004, and the Ph.D. degree in electrical and computer engineering from the University of Texas at Austin in December 2007.

In 2006 he interned with the High Speed Converter group at Analog Devices, Wilmington, MA, and in 2007 he worked as a co-op design engineer in the High Performance Analog group at Texas Instruments, Dallas, TX. His research interests include architecture and circuit design for RF and mixed-signal integrated circuits.



**Yunchu Li** (M'02) was born in Hunan, China, in 1972. He received the B.S. and M.S. degrees in electronic engineering from the University of Science and Technology of China in 1994 and 1997, respectively. He received the Ph.D. degree in electrical engineering from Texas A&M University in 2003.

In 1999, he was a co-op design engineer in the data converter group at Texas Instruments, Dallas, TX. Since July 2002, he has been with the High Speed Signal Processing group at Analog Devices, Wilmington, MA. His current research interests are in high-performance and high-speed converters and digital signal processing.