Copyright
by
Chaoming Zhang
2008

The Dissertation Committee for Chaoming Zhang certifies that this is the approved version of the following dissertation:

**Built-In Self Test of RF Subsystems**

Committee:

Jacob A. Abraham, Supervisor

Arjang Hassibi

Ranjit Gharpurey

David Z. Pan

John X.J. Zhang

Built-In Self Test of RF Subsystems

by

Chaoming Zhang, B.S., M.S.

DISSERTATION
Presented to the Faculty of the Graduate School of
The University of Texas at Austin
in Partial Fulfillment
of the Requirements
for the Degree of

DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT AUSTIN
December 2008

Dedicated to my parents.

Acknowledgments

It is with sincerest gratitude that I thank my advisor Prof. Jacob A. Abraham for the challenging and exciting research topics he suggests, and for countless insightful comments and guidance he gives me. He has always been a constant source of inspiration and support throughout my PhD study. I would like to thank Prof. Ranjit Gharpurey and Prof. Arjang Hassibi for their supervision of my projects with their outstanding knowledge and experience. I would also like to thank Prof. David Z. Pan and Prof. John X.J. Zhang for serving on my committee and providing insightful feedback on my research.

I would like to thank Dr. Oliver Werther for giving me the opportunity to do a summer internship at Alereon. I am also grateful to Mr. Frank Singor for guidance and support during my internship and work at Broadcom. I thank my colleagues Jingyu Hu and Jie Fang for valuable discussions on circuit design skills.

I would like to thank Dr. Kang Luo, Feng Zhou and Lijuan Zhang for helping me with wire bonding and test. Without your help, the chip results couldn’t come out.

I should thank the students who have worked together with me in the big lab room for the friendship and constant support: Romi, Sankar, Sriram, Rajeshwary, Joonsung, Qingqi, YuQing Yang, TungYeh, etc. They are wonderful people and provided me with many useful references and friendly encouragement. I should also thank Andrew Kieschnick for all the network and CAD support during my study in CERC. I would like to thank Debi and Melissa for all the administrative help.

I also benefited a lot from discussions with Zhiheng Cao and Tongyu Song, who are excellent designers and good friends to me. I also thank Junghwan Han and Brandon for their project cooperation.

I wish to thank my friends Haipeng Wei, Xi Chen, Wen Li, Yang Liu, Yang Zhang, Li Zhou, Xiaonan Chen, Hong Luo and Hong Xu for all the good and bad times we had together here in UT. I also thank Shujie Chen, Gregor Hamme, Ru Zhao, Jia Tian, Ting Li and Liang Han ... for your friendship over years.

Finally, I am grateful to my parents for their patience, love and support. Without them this work would never have come into existence.

Built-In Self Test of RF Subsystems

Publication No. ________________

Chaoming Zhang, Ph.D.
The University of Texas at Austin, 2008

Supervisor: Jacob A. Abraham

With the rapid development of wireless and wireline communications, a variety of new standards and applications are emerging in the marketplace. In order to achieve higher levels of integration, RF circuits are frequently embedded into System on Chip (SoC) or System in Package (SiP) products. These developments, however, lead to new challenges in manufacturing test time and cost. Use of traditional RF test techniques requires expensive high frequency test instruments and long test time, which makes test one of the bottlenecks for reducing IC costs.

This research is in the area of built-in self test techniques for RF subsystems. In the test approach followed in this research, on-chip detectors are used to calculate circuit specifications, and data converters are used to collect the data for analysis by an on-chip processor. A novel on-chip amplitude detector has been designed and optimized for RF circuit specification test. By using on-chip detectors, both the system performance and specifications of the individual components can be accurately measured. On-chip measurement results need to be collected by Analog to Digital Converters (ADCs). A novel time domain, low power ADC has been designed for this purpose. The ADC architecture is based on a linear voltage controlled delay line. Using this structure results in a linear transfer function for the input dependent delay. The time delay difference is then compared to a reference to generate a digital code.

Two prototype test chips were fabricated in commercial CMOS processes. One is for the RF transceiver front end with on-chip detectors; the other is for the test ADC. The 940MHz RF transceiver front-end was implemented with on-chip detectors in a 0.18μm CMOS technology. The chips were mounted onto RF Printed Circuit Boards (PCBs), with tunable power supply and biasing knobs. The detector was characterized with measurements which show that the detector keeps linear performance over a wide input amplitude range of 500mV. Preliminary simulation and measurements show accurate transceiver performance prediction under process variations. A 300MS/s 6 bit ADC was designed using the novel time domain architecture in a 0.13 μm standard digital CMOS process. The simulation results show 36.6dB Signal to Noise Ratio (SNR), 34.1dB Signal to Noise and Distortion Ratio (SNDR) for 99MHz input, Differential Non-Linearity (DNL)<0.2 Least Significant Bit (LSB), and Integral Non-Linearity (INL)<0.5LSB. Overall chip power is 2.7mW with a 1.2V power supply.

The built-in detector RF test was extended to a full transceiver RF front end test with a loop-back setup, so that measurements can be made to verify the benefits of the technique. The application of the approach to testing gain, linearity and noise figure was investigated. New detector types are also evaluated. In addition, the low-power delay-line based ADC was characterized and improved to facilitate gathering of data from the detector. Several improved ADC structures at the system level are also analyzed. The built-in detector based RF test technique enables cost-efficient testing of SoCs.
# Table of Contents

Acknowledgments

Abstract

List of Tables

List of Figures

Chapter 1. Introduction

Chapter 2. RF Built-in Test Method with Amplitude Detectors

2.1 Test Theory
  2.1.1 Optimized RF Detector
  2.1.2 Test Theory for the Amplitude Detector

2.2 Mixer Test
  2.2.1 Simulation Results
  2.2.2 Prediction Accuracy

2.3 RF Receiver Front-End Test
  2.3.1 Prototype Chip
  2.3.2 Test Setup and Measurement Procedure
  2.3.3 Measurement Results
  2.3.4 Results of Directly Solving Equations

2.4 RF Built-In Self Test with Detector Array Multiplexing Loopback
  2.4.1 Introduction
  2.4.2 Chip Design
  2.4.3 Test Setup and Measurement Result

2.5 Noise Test with Built-in Detector

2.6 Online Corrections with Detectors

2.7 Other Built-in RF Detectors
  2.7.1 Average Detector
  2.7.2 Integrated Envelope Detector

2.8 Conclusions

Chapter 3. ADC Based on Linear Voltage Controlled Delay Line

3.1 Introduction
  3.1.1 Overview of ADC structures
  3.1.2 Time Domain ADC

3.2 Prototype Chip Design
  3.2.1 Voltage to Delay Building Block
  3.2.2 Delay Comparator
  3.2.3 Wallace Tree Structure

3.3 Layout and Simulation Results

3.4 Chip Measurement

3.5 Solutions to Non-linearity Contributors
  3.5.1 Supply Voltage
  3.5.2 Comparator Offset

3.6 Jitter Reduction and Measurement
  3.6.1 Introduction
  3.6.2 Prototype Scheme
  3.6.3 Circuit Implementation
    3.6.3.1 Edge Detector
    3.6.3.2 Sign Bit Generation Circuit
    3.6.3.3 Oscillator
    3.6.3.4 Phase Detector and Feedback
  3.6.4 Calibration and Simulation Results
    3.6.4.1 Calibration
    3.6.4.2 Simulation Results
  3.6.5 Discussions
  3.6.6 Conclusions

3.7 Improved Designs
  3.7.1 Robust ADC
  3.7.2 Time Sub-ranging ADC
  3.7.3 Pipeline ADC

3.8 Conclusions

Chapter 4. Conclusions and Future Directions

4.1 Future Directions

Bibliography

Vita

# List of Tables

<table>
<thead>
<tr>
<th>Table</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>2.1</td>
<td>Test Accuracy for Down Conversion Mixer</td>
</tr>
<tr>
<td>2.2</td>
<td>Test Accuracy for Noise Robustness</td>
</tr>
<tr>
<td>2.3</td>
<td>Test Accuracy for Receiver and Components</td>
</tr>
<tr>
<td>2.4</td>
<td>Calculation Test Accuracy for Receiver</td>
</tr>
<tr>
<td>2.5</td>
<td>Loop-back TX Test Accuracy</td>
</tr>
<tr>
<td>2.6</td>
<td>Loop-back RX Test Accuracy</td>
</tr>
<tr>
<td>3.1</td>
<td>Linearity Variation</td>
</tr>
</tbody>
</table>
# List of Figures

2.1 Circuit of improved RF detector
2.2 Detector characterization curve with 940MHz sine signal input amplitude sweeping
2.3 Detector output frequency response for LNA test, input -3dBm two tones 939.9MHz and 940.1MHz
2.4 Circuit of Gilbert mixer with design improvements
2.5 Specification prediction of conversion gain
2.6 Specification prediction of third order intercept point
2.7 Receiver RF front end circuit blocks
2.8 LNA with inductive source degeneration
2.9 The chip die photo of the receiver RF frontend in UMC 0.18μm CMOS technology
2.10 Measurement setup
2.11 Results of instrument measurement and detector calculation for (a) Receiver gain (b) Receiver IIP3
2.12 Results of instrument measurement and detector calculation for (a) LNA gain (b) LNA IIP3
2.13 Results of instrument measurement and detector calculation for (a) Mixer gain (b) Mixer TOI
2.14 Results of instrument measurement and equation calculation for (a) RX gain (b) RX IIP3
2.15 Loopback test of RF transceiver front-end with detector array and on-chip ADC
2.16 Up conversion mixer
2.17 Pre amplifier
2.18 Die photo with components marked
2.19 Custom printed circuit board for loopback test
2.20 Simplified block diagram of loopback test setup
2.21 Loopback results of instrument measurement and detector calculation for (a) Up mixer gain (b) Up mixer TOI
2.22 Loopback results of instrument measurement and detector calculation for (a) Pre Amp gain (b) Pre Amp IIP3
2.23 Loopback results of instrument measurement and detector calculation for (a) Transmitter path gain (b) Transmitter IIP3
2.24 Loopback results of instrument measurement and detector calculation for (a) LNA gain (b) LNA IIP3
2.25 Loopback results of instrument measurement and detector calculation for (a) Down mixer gain (b) Down mixer TOI
2.26 Loopback results of instrument measurement and detector calculation for (a) RX gain (b) RX IIP3
2.27 Pre-amp noise figure simulation results
2.28 Test setup of noise figure measurement
2.29 Pre-amp noise figure measurement results
2.30 Feedback control flow chart
2.31 Average detector
2.32 Average detector characterization curve
2.33 Envelope detector
2.34 Envelope detector output with two tones 1GHz and 100MHz signals as inputs
3.1 Architecture of the proposed ADC
3.2 Circuit topology of the N-block and P-block voltage-dependent variable delay blocks
3.3 Interleaved N-block and P-block forming the voltage-controlled variable-delay transmission line
3.4 Input-delay transfer function of the voltage-controlled variable-delay transmission line
3.5 Delay comparator schematic
3.6 6 bit Wallace tree connections
3.7 The ADC layout in a 0.13μm digital CMOS process
3.8 Results of simulated ADC DC performance (a) DNL (b) INL
3.9 The dynamic performance of the implemented ADC with a 99MHz input source
3.10 Die photo of the ADC, with function blocks marked out
3.11 Custom printed circuit board for ADC test
3.12 ADC test setup
3.13 The measured voltage to delay transfer curve under different supply voltages
3.14 Delay pulse edge time distribution on Tektronix digital oscilloscope
3.15 Analog input sweeping
3.16 Bandgap voltage reference
3.17 Bandgap voltage reference simulated result
3.18 Vernier delay line sampler
3.19 Vernier oscillator measurement circuit
3.20 Bi-directional jitter measurement
3.21 Edge detector
3.22 (a) Symmetric NOR gate (b) Symmetric NAND gate
3.23 Leading and lagging edge selection and sign bit generation logic
3.24 Triggered oscillator
3.25 Phase detector
3.26 Measurement waveforms, from the top to the bottom: signal triggered edge, clock triggered edge, counter input and phase detector output
3.27 Jitter histograms
3.28 Robust ADC
3.29 Time sub-ranging ADC
3.30 A 6-bit ADC using time sub-ranging architecture
3.31 Digital control block for stage bit generation and pulse selection
3.32 Pipeline ADC

Chapter 1

Introduction

The rapid development of wireless and wireline communications introduces more and more new standards and new applications into the marketplace. Higher levels of integration are critical to reducing die area, power consumption and board complexity. Thus, RF circuits are more frequently embedded into System on Chip (SoC) or System in Package (SiP) products. These developments, however, lead to new challenges in manufacturing test time and cost [1].

High integration implies that more signal pins need to be tested, and RF signals in particular require special high speed pins. Unlike digital circuit test, which is mostly a logic pass/fail judgement, RF test covers gain, linearity and noise specifications, and commonly involves sweeping power levels and signal frequencies. Traditional RF test techniques require expensive high frequency test instruments and long test times, which makes test one of the bottlenecks for reducing integrated circuit (IC) costs.

Many solutions have been proposed to solve these problems. One solution is to use special stimulus waveforms as input signals and to process the captured responses to predict circuit performance parameters [2][3][4]. However, complicated algorithms are required to search for the proper input signals, and expensive instruments are needed to obtain the test responses across the spectrum of operation. In addition, post-processing of the data introduces computation overhead.

Another direction is built-in test, which puts additional test circuitry into the Device Under Test (DUT) at selected measurement points, so that the output of these simple circuits can be captured by low cost test instruments or on-chip circuitry. Different types of circuits, including RMS detectors [5], power detectors [6] and amplitude detectors [7], have been proposed. However, the detectors in all the aforementioned systems were only used to monitor simple outputs at specific circuit points. Their ability to predict more complicated RF circuit specifications, such as gain and linearity, has not been fully investigated. In [8], a high bandwidth on-chip detector was designed to test the gain and linearity of RF building blocks, using detailed input power sweeping to extract linearity parameters. However, only simulation results were presented for the detector and no RF circuits were implemented or measured.

Recent approaches involve using the output of a detector with specific stimulus at the input to predict the RF circuit specifications, including gain, Third Order Input Intercept Point (IIP3) and noise figure. However, most of the related test papers only used the low noise amplifier (LNA) circuit as the DUT because of its simplicity [9][10][11][12]. Other research has used behavior-level system models with printed circuit board level test to study the overall transmitter and receiver performance [13][14]. In one recent publication [15], built-in self test of a Digital RF Processor (DRP)-based Global System for Mobile communications (GSM) transmitter using fully digital hardware was also demonstrated. However, this technique is highly dependent on the specific digital RF design, and focused on detecting defects. No method for calculating specifications was described.

In part because transceiver circuits involve frequency shifting in the mixer, which introduces complexity in design, simulation and test, no published work in the past has discussed full circuit level built-in specification tests for RF transceivers. Most of the previous test methods can only obtain overall system performance, without extracting the specifications of the discrete components in the chain. In [16], an improved RF amplitude detector was designed to support RF mixer test; however, only the mixer was considered, and only simulation results were provided.

As part of this PhD work, in [17], the detectors were implemented on-chip in a receiver to facilitate specification test for both the system and the discrete components, with hardware measurement results validating this method. In this dissertation, a more detailed analysis is applied to this test method, and a loopback scheme for transceiver test is proposed and validated with experimental measurements.

If a traditional two-tone input is applied to an RF component, the detector output response is a very low frequency signal. The detector output DC level is proportional to the receiver output amplitude, and the detector output low frequency components contain intermodulation information, from which the circuit gain and IIP3 can be deduced. The detector output can be easily sampled by an on-chip analog-to-digital converter (ADC). By adding additional amplitude detectors at selected nodes along the transceiver chain, the specifications of the different components can be measured.

By connecting the output of the transmitter (TX) to the input of the receiver (RX), we can perform loopback testing, which removes the requirement of conventional sequential block test, and saves significant testing time. However, loopback testing suffers from fault masking. Over-designed blocks can mask faulty blocks, because the individual performance parameters of the blocks cannot be discriminated from the loopback measurement alone. This can result in serious yield loss and low test accuracy. In this research, with detectors distributed on different component nodes, the masking problem no longer exists. The performance measures of all the individual modules in the system can be evaluated at the same time.

An RF frontend test chip in 0.18μm digital CMOS technology was designed, fabricated and tested. In this work, we show that both the system performance and the individual module parameters can be deduced accurately from the fast Fourier transform (FFT) of the sampled detector outputs.

To make an on-chip RF test system, an ADC is a necessary component. In order to simplify the control and save power and area, an on-chip ADC dedicated to collecting RF detector data is a promising option. In contrast with traditional ADCs, which use voltage or current to perform the intermediate transition to a digital signal, this work proposes a way of designing in the time domain. The analog input is first transferred into time pulses, then the quantization work is done using the time value. In this way, the technique takes full advantage of digital processes, and achieves high speed with small area and low power consumption. A prototype ADC was fabricated in a digital 0.13μm CMOS process and the performance parameters were characterized. Further improvements are investigated, and several systematic innovations are designed and evaluated.
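The time-domain conversion idea described above can be illustrated with a small behavioral sketch. It is not the circuit implemented in this work: the delay constants `d0` and `k`, the 0 V to 1 V input range, and the way the reference delays are generated are all assumptions made for illustration, and Python's `sum` stands in for the ones-counting logic that follows the delay comparators.

```python
def voltage_to_delay(vin, d0=100e-12, k=200e-12):
    """Linear voltage-to-delay model: delay = d0 + k * vin.

    d0 (seconds) and k (seconds per volt) are hypothetical values.
    """
    return d0 + k * vin


def time_domain_adc(vin, bits=6, vmin=0.0, vmax=1.0):
    """Quantize vin by comparing its delay against reference delays."""
    levels = 2 ** bits
    lsb = (vmax - vmin) / levels
    signal_delay = voltage_to_delay(vin)
    # One delay comparator per code threshold produces a thermometer code.
    thermometer = [
        1 if signal_delay >= voltage_to_delay(vmin + (i + 1) * lsb) else 0
        for i in range(levels - 1)
    ]
    # Counting the ones converts the thermometer code to a binary code.
    return sum(thermometer)


codes = [time_domain_adc(step / 100.0) for step in range(101)]
```

Because the assumed voltage-to-delay function is linear, equally spaced reference delays correspond to equally spaced input thresholds, which is why a linear delay line is attractive for this kind of converter.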

The key contributions of this work are the theory and implementation of the technique of using an improved on-chip detector to transfer the high frequency output signals into DC and low frequency components, without losing specification information. Additional features of the approach which are attractive include low area overhead, low resolution ADC requirements, and simple stimulus signals for the test. The theory and simulation results match the measurements on a complete design of a 940 MHz receiver implemented in a commercial foundry technology. A novel low-power delay line based ADC is also designed to facilitate the data gathering from the detector. Several circuit level and system level improvements are also implemented based on the prototype test chip.
Chapter 2

RF Built-in Test Method with Amplitude Detectors

2.1 Test Theory

Traditional RF measurements need high speed and high resolution instruments to handle the RF signals. An on-chip detector is a promising way to avoid expensive test equipment and to reduce test time and cost.

The basic task of an on-chip detector is to convert high frequency data, which is difficult and expensive to measure, into low frequency data, which is easy to record and analyze. To achieve this, novel detectors have to be designed, and corresponding test theories have to be developed.

2.1.1 Optimized RF Detector

For a detector to be qualified for RF testing, the following requirements have to be met.

1. Feature detection accuracy, i.e., a strong correlation between the detector output and the RF circuit specifications.
2. Low interference with the RF circuit operation, for which usually a high impedance is preferred.

3. Low area overhead.

In our main test approach, a novel on-chip amplitude detector[16] has been designed based on [7], and has been optimized for RF circuit specification test.

As shown in Figure 2.1, transistors M1 and M2 constitute the pseudo-differential pair to sense the mixer output. Resistors R1 and R2 are used to bias the gate voltage of the main transistors. Transistors M3 and M4 form the bias current mirror. Capacitors C1 and C2 block the DC current from influencing the detector operation. Cout is the output capacitor to sustain the output voltage, and works as part of the low pass filter with the output resistance.

Compared with the single-ended design in [7], two main improvements were adopted to facilitate RF test. First, the detector input was implemented as a differential structure, which is robust to common mode noise and can be matched to industry standard differential RF designs. The detector output, however, was kept single-ended, which saves one output port compared to a differential output and also helps the output voltage reach equilibrium faster. Second, the gate bias voltage was made higher than in the previous design, which helps to improve the output dynamic range.

For small signal analysis, the properly chosen current mirror bias keeps transistors M1 and M2 in the saturation region, following the long-channel drain current model. The output DC voltage level complies with the following formula [7].

\[ V_{out} = V_{th} + \sqrt{I_{bias} \cdot \frac{2}{\mu_n C_{ox}} \cdot \frac{L}{W} - A^2} \]  

(2.1)

where \( V_{th} \) is the threshold voltage, \( \mu_n \) and \( C_{ox} \) are process dependent parameters, \( W \) and \( L \) are transistor size parameters, and \( A \) is the input peak amplitude.

For large signal analysis, the output signal is the squared version of the input signal, and is filtered through the low pass filter formed by the detector output resistance and the capacitor $C_{out}$. At the output port, only low frequency components are preserved.

Through analytical calculations and simulation, the transistor sizes of M1 and M2 were set to $5\mu m/0.18\mu m$, the output capacitor to $0.63pF$, and the bias current to $30\mu A$. The bias resistors $R_1$ and $R_2$ are $4k\Omega$ each. The input impedance of this RF detector is $7.6k\Omega$ at $1GHz$. Compared with the impedances of hundreds of ohms typical of RF circuit environments, this is high enough that the adverse influence of the detector on the mixer operation can be ignored. Detailed simulations show that the worst-case port matching ($S_{11}$ and $S_{22}$) degradation with the detector is less than $0.2$ dB at $1GHz$. The detector area is only $0.06 \times 0.072\,mm^2$. The total power dissipation of the detector is $0.6mW$ in test mode. At other times, the detector test circuit can be switched off, so power consumption is not a problem for this scheme.

The detector was characterized with the chip measurements in Figure 2.2, which show that the detector remains linear over a wide input amplitude range of $500mV$.

2.1.2 Test Theory for the Amplitude Detector

For the gain measurement, as shown in the previous section, the DC output of the amplitude detector is proportional to the signal amplitudes, which gives accurate gain information.

For the linearity measurement, suppose the input consists of two tones at \( \omega_1 = 2\pi(f_0 + \Delta) \) and \( \omega_2 = 2\pi(f_0 + 2\Delta) \). Considering the third order intermodulation of the circuit under test, the output signal is

\[
v_o = a_1(v_i \cos(\omega_1 t) + v'_i \cos(\omega_2 t)) + a_3(v_i \cos(\omega_1 t) + v'_i \cos(\omega_2 t))^3 + \ldots
\]

(2.2)

where \( a_1 \) and \( a_3 \) are circuit gains for the first and third order respectively; \( v_i \) and \( v'_i \) are amplitudes of the two tones, which are usually equal (we set \( v_i = v'_i \)).

Since the third order intermodulation is of interest here, we pick out the fundamental and third order frequency components for IIP3 calculation [18].

\[ v_{o,\text{fund} + \text{IM3}} = a_1 v_i \cos(\omega_1 t) + 3a_3 v_i^3 \cos^2(\omega_1 t) \cos(\omega_2 t) \]

(2.3)

Simplify further,

\[ v_{o,\text{fund} + \text{IM3}} = a_1 v_i \cos(\omega_1 t) + \frac{3}{4} a_3 v_i^3 \cos((2\omega_1 - \omega_2) t) \]  

(2.4)

Then, the circuit gain is \( a_1 \), and \( IIP_3 = \sqrt{\frac{4}{3}\left|\frac{a_1}{a_3}\right|} \).
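
As a brief check of the step from (2.3) to (2.4) and of the IIP3 expression, the product-to-sum expansion can be written out. This is a sketch assuming equal tone amplitudes \( v_i \); the symbol \( A_{IIP3} \) is introduced here only to denote the input amplitude at the intercept point.

\[
\cos^2(\omega_1 t)\cos(\omega_2 t) = \frac{1}{2}\cos(\omega_2 t) + \frac{1}{4}\cos((2\omega_1 - \omega_2)t) + \frac{1}{4}\cos((2\omega_1 + \omega_2)t)
\]

Multiplying by \( 3a_3 v_i^3 \) gives the \( \frac{3}{4} a_3 v_i^3 \cos((2\omega_1 - \omega_2)t) \) term of (2.4). The intercept point is the input amplitude at which the extrapolated fundamental and third order intermodulation amplitudes are equal:

\[
|a_1| A_{IIP3} = \frac{3}{4} |a_3| A_{IIP3}^3 \quad\Rightarrow\quad A_{IIP3} = \sqrt{\frac{4}{3}\left|\frac{a_1}{a_3}\right|}
\]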

Thus, if we can get \( a_1 \) and \( a_3 \) information from the low frequency detector output, we will find an easy and low cost method for on-chip IIP3 measurement, as we do in this research.

The deduction, which is also confirmed by chip measurement, is as follows. Following the square law characteristic of CMOS transistors, and using the circuit output as the detector input, the detector output can be written as

\[ v_{o,\text{detector}} = a_d v_o^2 \]  

(2.5)

where \( a_d \) is the detector gain.

In our detector implementation, the detector is designed in such a way that it also works as a low pass filter at the output. After expanding the expression, only the low frequency components at frequencies \((\omega_1 - \omega_2)\) and \(2(\omega_1 - \omega_2)\) are picked out.

\[
v_{o,\text{LowFreq}} = a_d \left[ \left( a_1^2 v_i^2 + 6a_1 a_3 v_i^4 + \frac{75}{8} a_3^2 v_i^6 \right) \cos((\omega_1 - \omega_2)t) + \left( \frac{3}{2} a_1 a_3 v_i^4 + \frac{15}{4} a_3^2 v_i^6 \right) \cos(2(\omega_1 - \omega_2)t) \right]
\]

(2.6)

By measuring the two low frequency tones, and with a known input level \(v_i\) and a characterized detector gain \(a_d\), \(a_1\) and \(a_3\) can be calculated.
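
The calculation can be illustrated numerically. The following is a minimal sketch, not the procedure used on the test chip. It simulates a cubic nonlinearity followed by an ideal square-law detector, reads the signed amplitudes of the two low frequency tones with a single-bin DFT, and inverts them for \(a_1\) and \(a_3\) while neglecting the \(a_3^2\) terms. All numeric values, the function names, the assumption that \(a_1\) is positive, and the assumption that both tones are phase-aligned cosines (so the sign of each tone is recoverable) are hypothetical.

```python
import math


def detector_output(a1, a3, a_d, v, k1, k2, n):
    """Samples of a_d * v_o^2 for a two-tone input through a cubic nonlinearity."""
    samples = []
    for i in range(n):
        phase = 2.0 * math.pi * i / n
        x = v * (math.cos(k1 * phase) + math.cos(k2 * phase))
        v_o = a1 * x + a3 * x ** 3      # circuit output, truncated at third order
        samples.append(a_d * v_o ** 2)  # ideal square-law detector
    return samples


def cosine_amplitude(samples, k):
    """Signed amplitude of the cos(2*pi*k*i/n) component (single-bin DFT)."""
    n = len(samples)
    return 2.0 / n * sum(
        s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples)
    )


def solve_a1_a3(t1, t2, a_d, v):
    """Invert the two tone amplitudes for a1 and a3, neglecting a3^2 terms."""
    a1_a3 = t2 / (1.5 * a_d * v ** 4)
    a1 = math.sqrt(t1 / (a_d * v ** 2) - 6.0 * a1_a3 * v ** 2)
    return a1, a1_a3 / a1


# Hypothetical device and test conditions (tones placed on DFT bins 100 and 104).
A1, A3, A_D, V = 4.0, -2.0, 1.0, 0.05
K1, K2, N = 100, 104, 1024

samples = detector_output(A1, A3, A_D, V, K1, K2, N)
t1 = cosine_amplitude(samples, K2 - K1)        # tone at (w1 - w2)
t2 = cosine_amplitude(samples, 2 * (K2 - K1))  # tone at 2(w1 - w2)
a1_est, a3_est = solve_a1_a3(t1, t2, A_D, V)
iip3_amplitude = math.sqrt(4.0 / 3.0 * abs(a1_est / a3_est))
```

For small input amplitudes the neglected \(a_3^2 v_i^6\) terms are tiny, so the estimates land close to the values used to generate the data. At larger amplitudes those terms would have to be kept, which is one reason a fitted mapping function can be more practical than direct inversion.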

An instance of the measured frequency domain detector output is shown in Figure 2.3. In this measurement, two frequency tones were fed into the low noise amplifier (LNA) input, and the detector output preserved the DC and two low frequency tones, which were used to calculate the gain and IIP3.

The simulation and measurement results show a strong correlation between the RF detector output and the gain and IIP3 specifications.

Solving the above equations analytically is not trivial, and process variations and non-ideal corners are not covered accurately by these simplified equations. However, built-in test is different from chip characterization measurement. For production and multiple wafer-level test purposes, resolution within a targeted range is good enough. Under this guideline, a non-linear regression method, multivariate adaptive regression splines (MARS) [19], is adopted to find a robust mapping function between the low frequency detector output and the circuit specifications. Thus, the calculation reduces to evaluating a simple non-linear function.
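
The kind of mapping function that such a regression produces can be illustrated with a deliberately simplified sketch. This is not the MARS algorithm, which selects knots and basis functions adaptively. It only fits one pair of hinge functions at a fixed, hypothetical knot by least squares, using synthetic data in which a normalized detector reading maps to a specification value.

```python
def hinge_features(x, knot):
    """Constant term plus the mirrored pair of hinge basis functions at the knot."""
    return [1.0, max(0.0, x - knot), max(0.0, knot - x)]


def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_hinge_model(xs, ys, knot):
    """Least-squares fit of the hinge model via the normal equations."""
    feats = [hinge_features(x, knot) for x in xs]
    p = len(feats[0])
    ata = [[sum(f[i] * f[j] for f in feats) for j in range(p)] for i in range(p)]
    aty = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(p)]
    return solve_linear(ata, aty)


def predict(coeffs, x, knot):
    """Evaluate the fitted mapping function at a new detector reading."""
    return sum(c * f for c, f in zip(coeffs, hinge_features(x, knot)))


# Synthetic training data: normalized detector reading -> specification value.
KNOT = 0.5
readings = [i / 10.0 for i in range(11)]
specs = [1.0 + 2.0 * max(0.0, x - KNOT) - 3.0 * max(0.0, KNOT - x) for x in readings]
coeffs = fit_hinge_model(readings, specs, KNOT)
```

Once the coefficients are fitted on characterized devices, evaluating `predict` for a new reading needs only a few comparisons and multiply-adds, which is what makes this style of mapping inexpensive at test time.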

It should be pointed out that there is one major difference between this method and previous approaches using “alternate test” [9], [14], etc. There is an explicit relation between the detector outputs and circuit specifications in our work (here, it is between $a_1$, $a_3$ and gain, IIP3). This is in contrast with
previous alternate test approaches, which had to perform exhaustive simulations to obtain the good stimulus and output features to form the mapping function (which is indirect), and which introduces additional complexity.

2.2 Mixer Test

Partly because mixer circuits involve frequency shifting, which introduces complexity in design, simulation and test, few published studies have discussed built-in tests for RF mixers. Since the mixer is an important building block in most RF systems, mixer specification test is investigated here with the on-chip amplitude detector.

A 940MHz to 40MHz down conversion mixer was implemented in 0.18μm CMOS technology using UMC foundry models. The design used the differential Gilbert cell [21] with several improvements, as shown in Figure 2.4.

Transistors M1 to M6 constitute the basic Gilbert cell. Resistors R1 and R2 are resistive loads. Current sources I1 and I2 inject current to improve the gain and linearity. Inductors L1 and L2 provide inductive source degeneration, which improves the linearity ([22], [18]). Under nominal working conditions, the conversion gain was 4dB, and the third order intercept point (TOI) was -0.8dBm.

The amplitude detector is attached to the output of the mixer to be able to obtain the mixer performance data.
Figure 2.4: Circuit of Gilbert mixer with design improvements
2.2.1 Simulation Results

In order to guarantee accurate simulation results, the simulation setup was configured to match real cases to the maximum extent possible. The final circuit under simulation was extracted from the layout with all foundry data. The chip pads and bond wires with bond wire inter-coupling effect were modeled and added to the test bench. The bond wire inductor inter-coupling coefficient was set to 0.3 from previous tape out experience. The influence of the I/O pad Electrostatic Discharge (ESD) circuitry was also modeled and included in the simulation. Process variations were simulated by varying the supply voltage and the bias voltages at the RF and LO ports. This approach is reasonable, because all the process variations express their effects by changing the working conditions of the circuit, which correspond to different biasing conditions. This method was also used and proved through measurements in [23].

In the Monte-Carlo simulations, the parameter variations were set to be uniformly distributed within +/-15% around the nominal values. The RF detector output and the mixer specifications (conversion gain and TOI) were recorded at the same time. 150 instances were generated, of which 120 were used to build the nonlinear mapping function between the detector outputs and the mixer specifications using MARS [19], which models the response as a weighted sum of basis functions that span all values of each of the independent variables. The other 30 instances were then used to verify the accuracy of the predictions.
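
To illustrate the train-and-verify flow, the following sketch fits a weighted sum of hinge basis functions to synthetic data with a 120/30 split. It is a simplified stand-in, not the MARS algorithm of [19]: the knots are fixed by hand rather than selected adaptively, there is a single input variable, and the data, `true_spec` and all function names are hypothetical.

```python
import math
import random


def features(x, knots):
    """Basis functions: constant, linear term, one hinge max(0, x - k) per knot."""
    return [1.0, x] + [max(0.0, x - k) for k in knots]


def solve(matrix, rhs):
    """Solve matrix * w = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - tail) / aug[r][r]
    return w


def fit(xs, ys, knots):
    """Least-squares weights from the normal equations."""
    phi = [features(x, knots) for x in xs]
    p = len(phi[0])
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(p)] for i in range(p)]
    proj = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(p)]
    return solve(gram, proj)


def predict(weights, x, knots):
    """Evaluate the fitted weighted sum of basis functions at x."""
    return sum(w * f for w, f in zip(weights, features(x, knots)))


def true_spec(x):
    """Hypothetical detector-output-to-specification relation (piecewise linear)."""
    return 3.0 + 2.0 * x - 4.0 * max(0.0, x - 0.5)


random.seed(0)
xs = [random.uniform(0.0, 1.0) for _ in range(150)]   # 150 synthetic instances
ys = [true_spec(x) for x in xs]
KNOTS = [0.25, 0.5, 0.75]                              # fixed by hand (assumption)

weights = fit(xs[:120], ys[:120], KNOTS)               # 120 training instances
errors = [predict(weights, x, KNOTS) - y               # 30 verification instances
          for x, y in zip(xs[120:], ys[120:])]
rms = math.sqrt(sum(e * e for e in errors) / len(errors))
print(f"verification RMS error: {rms:.2e}")
```

Because the synthetic relation lies exactly in the span of the chosen basis and no noise is added, the verification error here is only numerical round-off; real detector data would leave a finite residual.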
As in traditional mixer test, a two-tone input signal with frequency components 940.1MHz and 939.9MHz at power levels of -10dBm was used. The LO signal was 900MHz at the -5dBm power level. The detector output was sampled at a 10MHz sampling rate (the 10MHz rate was chosen from the simulations as a compromise between the prediction accuracy and the ADC requirement), and 5 microseconds of data were logged. The digitized waveform was then transformed with an FFT. The total test time was only 5 microseconds for measurement plus the FFT computation time, which is much shorter than today’s industry requirement of hundreds of milliseconds. The low frequency components were used as the input parameters for the non-linear mapping function.
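
The capture settings quoted above fix the FFT grid. As a quick check (a sketch that uses only the numbers given in the text; the variable names are illustrative), 5 microseconds at 10MHz gives 50 samples and a 200kHz bin spacing, so the 200kHz difference tone and its 400kHz second harmonic fall on bins 1 and 2:

```python
SAMPLE_RATE_HZ = 10e6                      # detector output sampling rate
CAPTURE_TIME_S = 5e-6                      # logged data length
TONE_1_HZ, TONE_2_HZ = 940.1e6, 939.9e6    # two-tone RF input

n_samples = round(SAMPLE_RATE_HZ * CAPTURE_TIME_S)
bin_spacing_hz = SAMPLE_RATE_HZ / n_samples
difference_hz = TONE_1_HZ - TONE_2_HZ

# FFT bin indices of the two low frequency detector tones.
bin_fundamental = round(difference_hz / bin_spacing_hz)
bin_second = round(2 * difference_hz / bin_spacing_hz)
nyquist_hz = SAMPLE_RATE_HZ / 2

print(n_samples, bin_spacing_hz, bin_fundamental, bin_second)
```

Both tones sit well below the 5MHz Nyquist limit, which is why such a low sampling rate is sufficient for the detector output.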

It is worth mentioning that the sampling frequency (10MHz) of this approach is very low, compared with what has been reported in other publications on similar topics (100MHz in [9], 71.7MHz and 89.1MHz in [25]). This feature lowers the requirements of the on-chip ADC, and makes the scheme easy to implement as a built-in self-test for systems with a digital signal processor (DSP) on chip (such as in SoC environments).

2.2.2 Prediction Accuracy

Using the 30 instances, the measured specifications and the ones predicted by the mapping function were combined in scatter plots in Figure 2.5 and Figure 2.6. A 45 degree reference line is inserted in each plot. The plots show close matching of the actual and predicted performance parameters. The two primary specifications for mixer performance, conversion gain (gain performance) and TOI (linearity performance), are plotted to show the accuracy of the predictions. Traditional test schemes for the TOI specification in particular require a two-tone input power sweep, which needs a complicated measurement setup and long measurement times. With our scheme, the complexity of measurement and the test time are reduced significantly.

![RF Mixer Conversion Gain](image)

**Figure 2.5:** Specification prediction of conversion gain

In order to quantify the accuracy of the technique, the RMS error is employed, which is defined as

\[ RMS_{error} = \sqrt{\frac{1}{N} \sum (P_{true} - P_{estimated})^2} \]

(2.7)

The relative error, which is the RMS error divided by specification distribution range (difference between maximum and minimum specification values) was also calculated. The results are shown in Table 2.1.
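
The two accuracy measures can be sketched in a few lines of Python. The function names and the sample values are illustrative assumptions; only the definitions (Equation 2.7 and the RMS error divided by the specification range) come from the text.

```python
import math


def rms_error(true_vals, est_vals):
    """RMS error between true and estimated specification values (Equation 2.7)."""
    n = len(true_vals)
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(true_vals, est_vals)) / n)


def relative_error(true_vals, est_vals):
    """RMS error divided by the spread (max - min) of the true specification."""
    spread = max(true_vals) - min(true_vals)
    return rms_error(true_vals, est_vals) / spread


# Hypothetical conversion gain values in dB, for illustration only.
true_gain_db = [3.0, 4.0, 5.0]
predicted_gain_db = [3.1, 3.9, 5.1]

print(rms_error(true_gain_db, predicted_gain_db))       # about 0.1 dB
print(relative_error(true_gain_db, predicted_gain_db))  # about 0.05, i.e. 5%
```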

Table 2.1: Test Accuracy for Down Conversion Mixer

<table>
<thead>
<tr>
<th>Conversion Gain</th>
<th>TOI</th>
</tr>
</thead>
<tbody>
<tr>
<td>RMS Error</td>
<td>0.120dB</td>
</tr>
<tr>
<td>Relative Error</td>
<td>0.9%</td>
</tr>
</tbody>
</table>

Table 2.2: Test Accuracy for Noise Robustness

<table>
<thead>
<tr>
<th>Injected Noise</th>
<th>Conversion Gain</th>
<th>TOI</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>RMS Error</td>
<td>Relative Error</td>
</tr>
<tr>
<td>1mV</td>
<td>0.178dB</td>
<td>2.4%</td>
</tr>
<tr>
<td>5mV</td>
<td>0.408dB</td>
<td>5.3%</td>
</tr>
<tr>
<td>10mV</td>
<td>0.936dB</td>
<td>11.4%</td>
</tr>
</tbody>
</table>

It should be noted that the foundry RF models came with noise sources, and the extracted schematic also included noise resistors, so the above result already shows good noise robustness. To further confirm the test scheme’s validity under noisy environments, Gaussian noise of 1mV, 5mV and 10mV, respectively, was injected into the detector output. With a Cout of 0.63pF, the kT/C noise for this detector is around 80μV, which is much smaller than the injected noise. The injected noise levels are therefore high enough to demonstrate the noise robustness of the circuit.
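
The kT/C figure quoted above can be reproduced with a short calculation. This is a sketch that assumes a temperature of 300K, which the text does not state; the function name is illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def ktc_noise_vrms(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage sqrt(kT/C) sampled on a capacitor."""
    return math.sqrt(BOLTZMANN_J_PER_K * temperature_k / capacitance_f)


C_OUT_F = 0.63e-12                       # detector output capacitance from the text
noise_vrms = ktc_noise_vrms(C_OUT_F)     # roughly 81 microvolts at 300K (assumed)

# Ratio of each injected noise level to the intrinsic kT/C noise.
ratios = {level: level / noise_vrms for level in (1e-3, 5e-3, 10e-3)}
print(f"kT/C noise: {noise_vrms * 1e6:.1f} uV")
print({f"{level * 1e3:.0f} mV": round(r, 1) for level, r in ratios.items()})
```

Even the smallest injected level (1mV) is more than ten times the intrinsic kT/C noise under this assumption.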

The corresponding test accuracy results are shown in Table 2.2. We can see that, even with a high noise level of 10mV, the error in the values predicted from the detector measurements for the mixer specification is limited to around 10%.
2.3 RF Receiver Front-End Test

On the prototype test chip, we implemented the receiver RF front end circuit with an LNA and mixer, as illustrated in Figure 2.7. A separate detector was also implemented for characterization and calibration purposes.

![Figure 2.7: Receiver RF front end circuit blocks](image)

The LNA was designed as a fully differential structure with inductive source degeneration, and is shown in Figure 2.8. The nominal gain is 7dB, and the IIP3 is -1.5dBm.

2.3.1 Prototype Chip

The micrograph of the RF receiver test chip is shown in Figure 2.9. All the components were implemented with foundry provided layout models to ensure the accuracy of RF performance. The component layout followed symmetrical principles to reduce the influences of mismatch and substrate noise. In order to lower the detector area overhead, the DC block capacitors were implemented with mixed mode layout models. This process has a high resistance resistor option, which also helps to reduce the area of the detector.

The total effective receiver circuit die area was about 0.5 × 1.2 \(mm^2\), excluding bond pads and ESD circuitry. One detector took up an area of 0.06 × 0.072 \(mm^2\). Therefore, the area overhead was only 1.4%, which is negligible (this figure is consistent with two detectors in the receiver path: 2 × 0.00432 / 0.6 ≈ 1.4%). Compared with other publications, the area of this detector is one of the smallest ([5], [6], [9]).

2.3.2 Test Setup and Measurement Procedure

The chips were mounted directly onto immersion gold RF printed circuit boards (PCBs) through bonding wires, with a tunable power supply and biasing knobs. To mimic process variation data, the chip supplies and node biasing conditions were swept with 10% variations. This approach is reasonable, because all the process variations express their effects by changing the working conditions of the circuit, which correspond to different biasing conditions. Four dies across two different packages were used during the measurements. The test setup is shown in Figure 2.10. Two signal generators were used to generate input signal tones, which in SoC environments may come from on-chip PLLs. The Tektronix digital oscilloscope was used to record detector outputs, which were processed with an FFT to pick out the low frequency components. The spectrum analyzer was used to measure the chip specifications (gain, IIP3) as the instrument measurement results, for comparison with the values predicted from detector outputs.

Figure 2.10: Measurement setup
In total, 150 instances were measured, among which 120 instances were randomly picked as training cases to generalize the mapping function between detector outputs (DC and low frequency tones) and the circuit specifications (gain and IIP3) using the mathematical tool, MARS. The other 30 instances were used to verify the accuracy of the method.

As in traditional receiver test, when measuring the detector outputs, a two-tone input signal with frequency components 940.1MHz and 939.9MHz at power levels of -13dBm was used. (In the GSM standard, the two-tone separation is defined as 800kHz, which gives the same IIP3 value as 200kHz separation for this design, while 200kHz separation gives a higher over-sampling ratio and thus more accurate results.) The local oscillator (LO) signal was 900MHz at a -2dBm power level. The detector between the LNA and the mixer was used to obtain the LNA specifications, and the detector after the mixer was used to obtain the receiver system specifications. Detector outputs were sampled at a 10MHz sampling rate with the Tektronix DPO 7104 Digital Oscilloscope. The 10MHz rate was chosen from the simulation results as a compromise between the prediction accuracy and ADC requirements. The sampling resolution can be selected depending on the measurement accuracy requirement; a 10 bit resolution was used in this test to achieve results within 5% error. 5 microseconds of data were logged, and the digitized waveform was then transformed with an FFT. The total test time was only 5 microseconds for measurement plus the FFT computation time. The low frequency components were used as the input parameters for the non-linear mapping function.
2.3.3 Measurement Results

Using the 30 instances, the measured specifications using high frequency RF instruments and the calculated ones from the detector outputs with the mapping function were combined in scatter plots in Figure 2.11.

With two on-chip detectors, the specifications of the individual components (the LNA and the mixer) are also calculated and the results are shown in Figure 2.12 and Figure 2.13.

The plots show close matching of the measured and calculated performance for the receiver. The two main specifications for receiver performance, gain and linearity, are plotted to show the accuracy of the prediction. With our approach, the complexity of measurement and the test time are greatly reduced, while maintaining high accuracy.

In order to quantify the accuracy of this method, the RMS error was employed, as defined in Equation 2.7.

The relative error was also calculated. The results are shown in Table 2.3. They demonstrate that the built-in RF test method can achieve satisfactory measurement results.

Figure 2.11: Results of instrument measurement and detector calculation for (a) Receiver gain (b) Receiver IIP3

Figure 2.12: Results of instrument measurement and detector calculation for (a) LNA gain (b) LNA IIP3

Figure 2.13: Results of instrument measurement and detector calculation for (a) Mixer gain (b) Mixer TOI

Table 2.3: Test Accuracy for Receiver and Components

<table>
<thead>
<tr>
<th>Component</th>
<th>RMS Error</th>
<th>Relative Error</th>
</tr>
</thead>
<tbody>
<tr>
<td>RX Conversion Gain</td>
<td>0.043dB</td>
<td>1.3%</td>
</tr>
<tr>
<td>RX IIP3</td>
<td>0.169dBm</td>
<td>2.3%</td>
</tr>
<tr>
<td>LNA Gain</td>
<td>0.09dB</td>
<td>4.8%</td>
</tr>
<tr>
<td>LNA IIP3</td>
<td>0.15dBm</td>
<td>4.4%</td>
</tr>
<tr>
<td>Mixer Gain</td>
<td>0.12dB</td>
<td>6.5%</td>
</tr>
<tr>
<td>Mixer TOI</td>
<td>0.27dBm</td>
<td>4.3%</td>
</tr>
</tbody>
</table>

2.3.4 Results of Directly Solving Equations

The machine learning methods rely on nonlinear modeling, which involves a lengthy data training process. Here, the specification is directly calculated by solving equations to save the data training time.

Based on Equation (2.6), we take the two measured tone amplitudes, $A_1$ at frequency $(\omega_1 - \omega_2)$ and $A_2$ at frequency $2(\omega_1 - \omega_2)$, together with the known detector gain $a_d$ and input tone amplitude $v_i$:

\[
A_1 = (a_1^2v_i^2 + 6a_1a_3v_i^4 + \frac{75}{8}a_3^2v_i^6)a_d
\]  
(2.8)

\[
A_2 = (\frac{3}{2}a_1a_3v_i^4 + \frac{15}{4}a_3^2v_i^6)a_d
\]  
(2.9)

Solving the equations, we can pick reasonable roots for $a_1$ and $a_3$.

Then, the circuit gain is $a_1$, and $IIP_3 = \sqrt{\frac{4a_1}{3a_3}}$.
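
One possible way to carry out this solution step is sketched below in Python. Dividing Equation (2.9) by Equation (2.8) and substituting $r = a_3 v_i^2 / a_1$ gives a quadratic in $r$. The sketch assumes weak nonlinearity (the root with the smaller magnitude is taken as the "reasonable" one), a positive $a_1$, and signed tone amplitudes (the sign of $A_2$ would have to come from the FFT phase). The function names and numerical values are illustrative, not from the measurements.

```python
import math


def forward_tones(a1, a3, vi, ad):
    """Tone amplitudes A1 and A2 from Equations (2.8) and (2.9)."""
    x, y = a1 * vi, a3 * vi ** 3
    amp1 = ad * (x * x + 6 * x * y + 75 / 8 * y * y)
    amp2 = ad * (1.5 * x * y + 15 / 4 * y * y)
    return amp1, amp2


def solve_a1_a3(amp1, amp2, vi, ad):
    """Recover a1 and a3 from the two tone amplitudes.

    With r = a3 * vi**2 / a1, the ratio k = A2 / A1 gives
    (3.75 - 9.375 k) r^2 + (1.5 - 6 k) r - k = 0.
    """
    k = amp2 / amp1
    qa, qb, qc = 3.75 - 9.375 * k, 1.5 - 6.0 * k, -k
    if abs(qa) < 1e-12:
        roots = [-qc / qb]
    else:
        disc = math.sqrt(qb * qb - 4 * qa * qc)  # ValueError if no real root
        roots = [(-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)]
    r = min(roots, key=abs)                      # weak-nonlinearity assumption
    x = math.sqrt(amp1 / (ad * (1 + 6 * r + 9.375 * r * r)))  # assumes a1 > 0
    return x / vi, r * x / vi ** 3


def iip3_amplitude(a1, a3):
    """IIP3 as an input amplitude; magnitudes are used so a3 < 0 is allowed."""
    return math.sqrt(4 * abs(a1) / (3 * abs(a3)))


# Round trip with illustrative values: a1 = 2.0, a3 = -0.5, vi = 0.1, ad = 1.5.
tone1, tone2 = forward_tones(2.0, -0.5, 0.1, 1.5)
a1_est, a3_est = solve_a1_a3(tone1, tone2, 0.1, 1.5)
print(a1_est, a3_est, iip3_amplitude(a1_est, a3_est))
```

In practice the measured amplitudes carry noise and the non-idealities discussed below, so the recovered coefficients would only approximate the true ones.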

Comparing the calculated values with the real measurements, we get the gain and IIP3 plots in Figure 2.14.

Figure 2.14: Results of instrument measurement and equation calculation for (a) RX gain (b) RX IIP3

To evaluate the accuracy, the error mean and standard deviation are listed in Table 2.4. As can be seen, the calculated and measured specifications follow the 45 degree lines in both plots. Equation (2.6) is derived under ideal circuit operating conditions. In real measurements, however, there are non-idealities such as mismatch, power droop and even order harmonics, so the result is not as good as that obtained with nonlinear mapping functions (for example, MARS). The advantage of this approach is that no training step for modeling is needed. The specifications can be calculated directly, thus saving computation complexity and measurement time.

Table 2.4: Calculation Test Accuracy for Receiver

<table>
<thead>
<tr>
<th></th>
<th>Error Mean[dB]</th>
<th>Error Stdv[dB]</th>
</tr>
</thead>
<tbody>
<tr>
<td>RX Conversion Gain</td>
<td>0.10</td>
<td>0.34</td>
</tr>
<tr>
<td>RX IIP3</td>
<td>-0.36</td>
<td>1.04</td>
</tr>
</tbody>
</table>

2.4 RF Built-In Self Test with Detector Array Multiplexing Loopback

2.4.1 Introduction

Loopback is a test strategy to connect the TX with the RX path. By comparing the input of TX and the output of RX, a pass/no pass decision can be made. Usually, the comparison is performed in the baseband. However, in these schemes, RF blocks are invisible, thus no RF component specifications are measured. Moreover, there are chances that the over-designed parts compensate for the faulty parts in the path loop, which is called fault masking. This results in yield loss and low test accuracy.

In this research, we incorporate the on-chip RF detectors into loopback tests. With the help of the detector outputs, the RF components are also tested during a loop run. The fault masking problem is solved, and the full advantage of loopback test is achieved with low overhead and low test costs.

The overall test scheme is shown in Figure 2.15. The system under test is in an SoC environment, which includes TX and RX channels with a DSP core. In this setup, the TX output is connected to the RX input through a lossy link, which emulates the real free-space loss. Usually, there are an up conversion mixer and power amplifiers in the TX path, and an LNA and a down conversion mixer in the RX path, together with some pre-amplifiers, buffers and filters depending on the application. By inserting on-chip RF detectors along the transceiver RF path, all the components are monitored. Since the on-chip detector outputs are low frequency, multiple monitoring points can be time-multiplexed to the on-chip ADC. In the next chapter, novel area and power efficient ADC designs for this application are discussed in detail.
Figure 2.15: Loopback test of RF transceiver front-end with detector array and on-chip ADC
2.4.2 Chip Design

The receiver path on the RF chip was introduced in the previous section. A transmitter path with an up-conversion mixer and a pre-amplifier was also designed and implemented. Together they provide a device under test (DUT) with a full transceiver RF front-end chain.

As shown in Figure 2.16, the up conversion mixer uses a fully differential Gilbert cell, with resistive source degeneration for linearity and source matching. Current injection is also adopted to improve the noise and gain performance. The output stage is a single ended design, since most antennas are single-ended. The differential to single-ended circuit is similar to that in [24]. The mixer converts a 40MHz signal up to 940MHz. The conversion gain is 1.5dB, and the TOI is -3dBm.

![Up conversion mixer diagram](image-url)

![Pre amplifier diagram](image)

**Figure 2.17: Pre amplifier**

After the up conversion mixer, a simple pre-amplifier was designed to amplify the signal further. Shown in Figure 2.17, it is a single stage current source loaded amplifier, and the source degeneration was implemented with bonding wires to save chip area. The amplifier gain is 9.1dB, and the IIP3 is -5dBm.
Since the TX path is an all single-ended design, the amplitude detectors are also designed as single-ended versions.

The full die photo is shown in Figure 2.18. The upper half portion is the TX path. Since no on-chip inductor is used, the area is much smaller than that of the RX path, where on-chip inductors are adopted for linearity and noise performances.

![Die photo with components marked](image)

Figure 2.18: Die photo with components marked
2.4.3 Test Setup and Measurement Result

To increase the testability and repeatability, the dies were wire-bonded into Ceramic Quad Flatpack (CQFP) packages, then the packages were soldered onto RF PCBs.

The soldered board is shown in Figure 2.19. Special care was taken during the design of the PCB. Large VDD and ground planes were used. All RF pins were connected through gold SMA connectors. In order to get better RF signal matching, a 50Ω transmission line was calculated and implemented. Plane vias were designed to surround the critical signal lines to ensure good shielding. A discrete band-pass filter was inserted on the PCB to form the TX and RX loop, which helps to filter out the TX harmonics. Tuning knobs were provided for the biasing points. Decoupling capacitors were placed along VDDs and ground to filter out supply noise.

Following the same test method as the RX test, with tunable power supply and biasing knobs, the chip supplies and node biasing conditions were swept with 10% variations to mimic process variation data. Four dies across two different packages were used during the measurements. The test setup is shown in Figure 2.20.
Two signal generators were used to generate input signal tones, and the Tektronix digital oscilloscope was used to capture detector outputs. As a measurement comparison with the on-chip detector result, the spectrum analyzer was used to measure chip specifications (gain and IIP3). The spectrum analyzer was controlled by a PC through local area network (LAN) connection. The digital oscilloscope was controlled by a laptop through LAN connections. Agilent VEE and Labview programs were developed to control the instruments and obtain the data.

150 instances were measured, among which 120 instances were randomly picked as training cases to generalize the mapping function between detector outputs (DC and low frequency tones) and the circuit specifications (gain and IIP3) using the mathematical tool, MARS. Here, we did not attempt direct computation because of the more complicated non-ideality of the loopback setup. The other 30 instances were used to verify the accuracy of the method.

When measuring the detector outputs, a two-tone input signal with frequency components 40.1MHz and 39.9MHz at power levels of -13dBm was used. The LO signal was 900MHz at a -2dBm power level. The detectors were inserted after each of the components along the transceiver chain. Detector outputs were sampled at a 10MHz rate with the Tektronix DPO 7104 Digital Oscilloscope. 5 microseconds of data for each of the detectors were logged. The digitized waveform was then transformed with an FFT. The total test time was only 20 microseconds for measurement (consistent with 5 microseconds for each of four detectors) plus the FFT computation time. The low frequency components were used as the input parameters for the non-linear mapping function.

Using the 30 instances, the specifications measured with high frequency RF instruments and the ones calculated from the detector outputs with the mapping function were combined in scatter plots. A 45 degree reference line was inserted in each plot. With distributed on-chip detectors, the specifications of the individual components on both the transmitter and receiver sides are also calculated, and the results are shown in Figure 2.21 through Figure 2.26.

The plots show close matching of the measured and calculated performances for the receiver. The two main specifications for receiver performance, gain and linearity, are plotted to show the accuracy of the prediction. With our approach, the complexity of measurement and the test time are greatly reduced, while keeping a high accuracy.

In order to quantify the accuracy of this method, the error mean and standard deviation were employed. The results are shown in Table 2.5 and Table 2.6. They demonstrate that the built-in RF test method can achieve satisfactory measurement results.
Figure 2.21: Loopback results of instrument measurement and detector calculation for (a) Up mixer gain (b) Up mixer TOI
Figure 2.22: Loopback results of instrument measurement and detector calculation for (a) Pre Amp gain (b) Pre Amp IIP3
Figure 2.23: Loopback results of instrument measurement and detector calculation for (a) Transmitter path gain (b) Transmitter IIP3
Figure 2.24: Loopback results of instrument measurement and detector calculation for (a) LNA gain (b) LNA IIP3
Figure 2.25: Loopback results of instrument measurement and detector calculation for (a) Down mixer gain (b) Down mixer TOI
Figure 2.26: Loopback results of instrument measurement and detector calculation for (a) RX gain (b) RX IIP3
Table 2.5: Loop-back TX Test Accuracy

<table>
<thead>
<tr>
<th></th>
<th>Up Mixer</th>
<th>Pre Amp</th>
<th>TX</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gain Error[dB]</td>
<td>Mean: 0.02</td>
<td>Stdv: 0.11</td>
<td>Mean: 0.01</td>
</tr>
<tr>
<td>IIP3 Error[dB]</td>
<td>Mean: 0.02</td>
<td>Stdv: 0.49</td>
<td>Mean: 0.14</td>
</tr>
</tbody>
</table>

Table 2.6: Loop-back RX Test Accuracy

<table>
<thead>
<tr>
<th></th>
<th>LNA</th>
<th>Down Mixer</th>
<th>RX</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gain Error[dB]</td>
<td>Mean: -0.27</td>
<td>Stdv: 0.23</td>
<td>Mean: -0.01</td>
</tr>
<tr>
<td>IIP3 Error[dB]</td>
<td>Mean: -0.32</td>
<td>Stdv: 0.45</td>
<td>Mean: -0.13</td>
</tr>
</tbody>
</table>

2.5 Noise Test with Built-in Detector

As an important circuit specification, the noise figure measures the noise performance of the circuit. It is defined as the ratio of the signal-to-noise power ratio at the input to the signal-to-noise power ratio at the output.

\[ NF = \frac{S_i/N_i}{S_o/N_o} \]

(2.10)

Since the detector is directly connected to the RF circuit output, the noise spectrum also shows up in the detector output. There is a correlation between NF and the low frequency tone of the detector output.

A test case was studied based on the pre-amp (Figure 2.17) designed for the transmitter path. 150 Monte Carlo simulations over process corners were performed for noise figure at 1GHz. 120 instances were used to generate the mapping model, and the other 30 instances were used to examine the method accuracy.
The results are shown in Figure 2.27. The error mean is 0.03dB, and standard deviation is 0.17dB. This shows that the on-chip detector is capable of accurately testing noise figure.

Figure 2.27: Pre-amp noise figure simulation results

A measurement setup was built to evaluate the noise figure measurement accuracy with real measurement data. With tunable power supply and biasing knobs, the chip supplies and node biasing conditions were swept with 10% variations to mimic process variation data. Two dies across two different packages were used during the measurements.
The test setup is shown in Figure 2.28. The noise figure measurements are made by measuring the output power of the DUT for two different input noise power levels. The high and low power inputs come from a calibrated noise source (Agilent 346C). The noise source is switched on and off in rapid succession. High power input to the analyzer uses the noise power generated when the noise source is switched on, and low power input uses the noise power generated at ambient temperature with the noise source switched off. The noise figures were calculated and recorded through the Agilent E4448A spectrum analyzer ([26] [27]).
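
The two-level measurement described above is commonly known as the Y-factor method. As a minimal sketch, assuming the noise source's off-state temperature equals the 290K reference so that $F = ENR/(Y-1)$, and ignoring the analyzer's own noise contribution, the noise figure could be computed as follows. The ENR and power values are illustrative, not measurements from this work, and the function name is hypothetical.

```python
import math


def noise_figure_db(power_on_w, power_off_w, enr_db):
    """Noise figure in dB from Y-factor measurements.

    power_on_w / power_off_w: output noise power with the source on / off.
    enr_db: excess noise ratio of the calibrated noise source in dB.
    Assumes the off-state source temperature equals the 290K reference.
    """
    y_factor = power_on_w / power_off_w
    if y_factor <= 1.0:
        raise ValueError("Y factor must exceed 1 for a valid measurement")
    return enr_db - 10.0 * math.log10(y_factor - 1.0)


# Illustrative numbers: a 15 dB ENR source and an on/off power ratio of 11.
nf_db = noise_figure_db(power_on_w=11e-12, power_off_w=1e-12, enr_db=15.0)
print(f"noise figure: {nf_db:.2f} dB")
```

A result computed this way includes the noise of the measurement chain after the DUT; removing it would need a separate calibration step that is not shown here.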

Figure 2.28: Test setup of noise figure measurement
When measuring the detector outputs, a two-tone input signal with frequency components 940.1MHz and 939.9MHz at power levels of -15dBm was used. Detector outputs were sampled at a 10MHz sampling rate with the Tektronix DPO 7104 Digital Oscilloscope. 5 microseconds of data for each of the detectors were logged. Then the digitized waveform was transformed with an FFT.

150 instances were measured, among which 120 instances were randomly picked as training cases to generalize the mapping function between detector outputs (DC and low frequency tones) and the circuit noise figure using MARS. The other 30 instances were used to verify the accuracy of the method.

Using the 30 instances, the measured noise figure and the calculated ones from the detector outputs with the mapping function were combined in scatter plots in Figure 2.29. The error mean is 0.04dB, and the standard deviation is 0.09dB. Compared with Figure 2.27, the measured noise figure is a little larger than the simulated one, because of the systematic errors from jitter noise, PCB traces and connectors.
2.6 Online Corrections with Detectors

One of the promising applications of this on-chip detector is on-line adaptive calibration of RF performance.

With the on-chip detector running in the background, the circuit specifications are calculated on-line. Based on the performance data, a feedback loop can control the tuning knobs to optimize the overall performance. The control flow chart is shown in Figure 2.30.

If the performance is below specification, the tuning knobs can be tuned to increase the power or adjust the biasing values to improve the performance.
If the performance is above the specification, the tuning knobs can be tuned to reduce the current or voltage, thus saving power.

Figure 2.30: Feedback control flow chart

The tuning algorithm depends on the relation between bias setup and corresponding circuit performance. If there is a linear or monotonic relation
between bias control and targeted performance, the algorithm can be a simple linear step sweep. If not, either a comprehensive model has to be formulated using a lookup table implementation, or some random sweeping algorithm has to be implemented.

When the system sends in a control signal to turn off calibration, or there are too many iteration runs, the adaptive correction process will be stopped.
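
For the monotonic case, the feedback loop could look like the following sketch. It is one possible reading of the flow described above, not the author's implementation: `measure_spec` stands in for the detector-based specification estimate, and the gain model, target, margin and step values are hypothetical.

```python
def tune_bias(measure_spec, bias, target, margin, step,
              max_iters=50, calibration_enabled=lambda: True):
    """Linear step sweep for a monotonically increasing bias-to-spec relation.

    Raises the bias while the specification is below `target`, lowers it
    while the specification exceeds `target + margin` (to save power), and
    stops inside that window, on a disable signal, or after `max_iters` runs.
    """
    for _ in range(max_iters):
        if not calibration_enabled():
            break                      # system turned calibration off
        spec = measure_spec(bias)
        if spec < target:
            bias += step               # below specification: increase bias
        elif spec > target + margin:
            bias -= step               # above specification: reduce bias
        else:
            break                      # within the acceptance window
    return bias


def gain_model(bias):
    """Hypothetical monotonic relation: gain in dB versus bias voltage."""
    return 10.0 * bias


tuned_from_low = tune_bias(gain_model, bias=0.30, target=4.0, margin=0.5, step=0.01)
tuned_from_high = tune_bias(gain_model, bias=0.60, target=4.0, margin=0.5, step=0.01)
print(gain_model(tuned_from_low), gain_model(tuned_from_high))
```

The step must change the specification by less than `margin`, otherwise the loop can oscillate around the window until the iteration limit is reached.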

## 2.7 Other Built-in RF Detectors

In this research, other possible efficient on-chip detectors are also investigated, aiming at simple structure and small area and power overhead.

The following two kinds of detectors are analyzed.

### 2.7.1 Average Detector

One simple design targeting easy measurement of RF amplitude is shown in Figure 2.31. M1 provides the biasing tail current; M2 and M3 form the differential pair to sense the RF input. The output sensing node is at the drain of M1, and it is also filtered by a capacitive load.
Figure 2.31: Average detector

The detector average DC output is proportional to RF amplitude. Suppose the RF input is \( V_{in} = A \cos(\omega_{RF}t) \), for \( V_{in} \geq \sqrt{I_{in}/K} \):

\[
V_{out} = K_1 A
\]  

(2.11)

\( K \) is \( \frac{1}{2} \mu_n C_{ox} \frac{W}{L} \), where \( \mu_n \) is the mobility of charge carriers, \( C_{ox} \) is the gate oxide capacitance per unit area, and \( W, L \) are transistor sizes. \( K_1 \) is the coefficient for this linear relation.

The simulated transfer curve for this implemented circuit in 0.13\( \mu m \) CMOS process is shown in Figure 2.32. The non-linearity in the smaller input amplitude region is from the non-saturation operation of the transistors.
The advantage of this circuit is that it is simple and small. However, it can only provide DC amplitude information, and the linear region is limited.

### 2.7.2 Integrated Envelope Detector

Previous research used envelope detectors [9] for RF test. However, accurate diode implementations in CMOS processes require a large area, and are thus not cost efficient.

The envelope detector in this research is designed without using diodes, thus saving a large layout area. As shown in Figure 2.33, M1 and M2 form the differential pair to sense RF signals; M3 and capacitor C work as a low pass filter to obtain the signal envelope.

Figure 2.32: Average detector characterization curve
Figure 2.33: Envelope detector

Figure 2.34 shows the envelope detector sensing a 1GHz signal with 100MHz amplitude modulation (AM) at the input (in blue); the output (in red) gives the 100MHz envelope of the input signal. However, this detector is only capable of envelope filtering, and cannot directly transfer high frequency data down to low frequency.
2.8 Conclusions

A built-in test method has been developed for RF transceiver specification tests using simple amplitude detectors. A 940MHz RF transceiver test circuit was fabricated in 0.18μm CMOS technology. The measurement results showed accurate performance prediction under process variations. The detector output is very low frequency, and only needs a 10MHz sampling frequency, which imposes very low performance requirements on an on-chip ADC. The test stimulus was a simple two-tone signal that is very easy to generate.

Using the on-chip detectors, an RF loop-back test scheme was proposed and evaluated. Measurement results showed accurate system specification predictions, and the individual component performances were evaluated at the same time. Thus a more complete test is performed with less time and complexity. Analysis of noise and of different detector structures has also been performed. The short test time, low area overhead and high accuracy make this technique very promising for industry production test of wireless transceivers.
Chapter 3

ADC Based on Linear Voltage Controlled Delay Line

3.1 Introduction

The previous chapter described how on-chip detectors show promise in RF system test applications. In order to implement complete system-level test, Analog to Digital Converters (ADCs) are also necessary to generate digital values for processing. In an RF system, there will be ADCs to digitize the down-converted signals in the receiver path. It might be possible to share these ADCs with on-chip detectors. However, complicated time sharing schemes have to be implemented in such cases. Additionally, if a background test is needed (in order to save test time, or to implement on-line adaptive parameter optimization), ADCs could not be shared.

In many applications, the ADCs in the signal path are high resolution and high speed designs, while on-chip detectors only need medium or low performance ADCs. Shared ADCs are also not power efficient.

Thus, on-chip ADCs dedicated to the detectors are necessary, provided the following conditions can be met.

1. The ADC consumes low power and occupies a small area.
2. The ADC can work at an appropriate speed and resolution for detector data collection, as well as facilitating background test and on-line adaptive parameter optimization.

The remainder of this chapter will describe the design of a novel ADC structure which can be implemented in fine-line digital technologies, allowing it to be integrated with the RF system and detector. This class of ADCs can also be used for other applications such as high performance wireless and wireline data communications.

3.1.1 Overview of ADC structures

The current explosive growth in wireless and wireline communications is the dominant driver for higher performance ADCs. New applications in wireless communications support multi-mode operation, utilize large portions of bandwidth, such as in the case of ultra wideband (UWB) and 60-GHz-band systems, or attempt to re-use the already licensed spectrum, thus requiring a high dynamic range for operation. Similarly, future wireline communication systems commonly extend the signal constellations to increase the data throughput, such as in the case of 10-Gb/s Ethernet or next-generation cable modems. These applications are driving the demand for high-resolution, high-speed, low power, and low cost integrated ADCs.

Technology scaling has traditionally been geared toward improving the performance and speed of DSP blocks (i.e., digital circuitry as opposed to analog) and significantly lowering the cost of digital logic and memory. Concurrently, there is an increased interest in using transistors with minimum possible dimensions to implement analog functions, because the improved device transition frequency allows for faster operation. However, scaling adversely affects most other parameters relevant to analog designs, and ADCs are no exception. Achieving high linearity, high sampling speed and high dynamic range with low supply voltages and low power dissipation in ultra-deep submicron silicon technology is a major challenge. The most prominent challenge for implementing precision analog circuitry in deeply scaled, ‘digital’ processes is the reduction of supply voltages. This lowers the available voltage swings in analog circuits, fundamentally limiting the achievable signal to noise ratio (SNR). The intrinsic voltage gain of devices is one important gauge of device performance for precision analog designs. As scaling continues, the intrinsic gain also keeps decreasing due to lower output resistance as a result of drain-induced barrier lowering (DIBL) and hot carrier impact ionization.

The most straightforward analog to digital converter is the flash ADC [28]. In order to obtain N bit digital codes, $2^N - 1$ comparators work in parallel comparing the input signal with references. Many fast ADCs fall into this category. However, the main drawbacks of flash ADCs are exponentially increasing area and power with higher resolution. Nowadays, most 5 to 6 bit high speed ADCs with conversion rates above 500MHz are using this structure.
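As a rough behavioral illustration of the flash principle (not a circuit from this work), the sketch below compares an input against $2^N - 1$ evenly spaced references and counts the comparators that trip. The uniform reference ladder, the 1V reference and the function name `flash_adc` are assumptions made only for this example.

```python
def flash_adc(vin, n_bits, v_ref=1.0):
    """Behavioral flash ADC: 2**n_bits - 1 comparators on a uniform ladder."""
    n_comparators = 2 ** n_bits - 1
    lsb = v_ref / 2 ** n_bits
    # Comparator k trips when the input exceeds the k-th ladder tap.
    thermometer = [1 if vin > (k + 1) * lsb else 0 for k in range(n_comparators)]
    # The output code is the number of tripped comparators.
    return n_comparators, sum(thermometer)


comparators, code = flash_adc(0.51, 6)  # 63 comparators for a 6 bit converter
```

Going from 6 to 8 bits in this model raises the comparator count from 63 to 255, which is the exponential growth in area and power mentioned above.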

By encoding the bits in several steps of similar structures, pipeline ADCs achieve high speed and resolution, and are the most popular high performance Nyquist-rate ADCs [28] [29]. The most significant bits are first digitized with a small flash ADC, then a DAC is used to subtract the digitized part from the analog signal. The residue voltage then is amplified and digitized by similar subsequent structures. At the same time, the first stage can handle the next sample. The main drawbacks of the pipeline ADC are complicated controls and large power because of the strict requirements on amplifiers and digital controls.

Subranging ADCs can be viewed as improved versions of flash ADCs [30] [31]. With coarse and fine ADC arrangements, the number of comparators is reduced. Subranging ADCs also reduce the requirements on amplifiers. However, this approach sacrifices speed, and introduces extra complexity.

The successive approximation (SAR) ADC is the most hardware efficient ADC [32]. This type of ADC works by sampling the signal onto a weighted capacitor array, and usually produces one bit per clock period. The drawbacks of SAR ADCs are the large capacitor array area and the lower speed.
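The one-bit-per-clock behavior can be sketched as a binary search. The following is a minimal behavioral model, assuming an ideal internal DAC and a 1V reference; the name `sar_adc` is hypothetical and no capacitor array effects are modeled.

```python
def sar_adc(vin, n_bits, v_ref=1.0):
    """Behavioral SAR conversion: one bit is resolved per loop pass, MSB first."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        # Keep the trial bit if the ideal DAC level does not exceed the input.
        if vin >= trial * v_ref / 2 ** n_bits:
            code = trial
    return code


code = sar_adc(0.3, 6)  # six loop passes, one per output bit
```

A single comparator is reused for every bit, which is why the hardware cost is low while the conversion takes N clock periods instead of one.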

ΔΣ ADCs are widely used in high resolution and low speed applications [33]. Through oversampling and noise shaping, the quantization noise is pushed out of the signal band, and high resolution can be achieved; however, ΔΣ ADCs need complicated digital portions to filter out the shaped quantization noise.

3.1.2 Time Domain ADC

In contrast to the traditional ADCs where voltage levels are compared, the new technique discussed in this chapter compares the time difference between different traveling waves. The advantages of the time domain ADC approach over traditional ADC architectures are particularly important in deep sub-micron integrated circuit processes. These processes are optimized for digital signals, where delay and timing are the focus rather than voltage levels. Accordingly, the systems built using such technologies can process, compare and analyze time much more robustly than voltage levels. In addition, certain digital circuit families consume extremely small static power compared to the power-hungry voltage amplifiers used in voltage comparison architectures. The other advantage of this time domain ADC is its scalability. It is well known that the performance of voltage comparison circuitry in integrated circuits is adversely affected by fabrication process scaling; however, timing comparison is not affected and moreover becomes even more power efficient as technology scales. Previous publications [34] [35] [36] used delay comparison to obtain monotonic digital codes; [37] counted voltage related frequency changes to generate digital codes. However, none of them achieved sufficient linearity to be used as an ADC. In this work, we demonstrate the first time domain ADC with 6 bit resolution.

3.2 Prototype Chip Design

The architecture of the implemented system is shown in Figure 3.1. It includes a sample and hold circuit, followed by the voltage to delay transfer block. The delay differences are compared along 63 fixed delay stages. Then the resulting thermometer code is encoded into binary by a Wallace tree encoder [39].
The input voltage signal is first sampled differentially. The sampled differential signal is used to set the delays of two voltage controlled variable delay paths. The fast path facilitates the propagation of the reference pulse, while the slow path delays the pulse edges. The time difference between these two paths is proportional to the sampled input voltage.
After the generation of an input-dependent differential delay, the two pulses go through two delay lines with different fixed delay blocks. After each block, the delay difference is increased by a fixed amount of time. For this 6 bit ADC, the delay difference is used to trigger comparators at 63 locations along these two lines. We set the delay threshold to 130ps. If the delay difference is greater than 130ps, the output of the delay comparator is assigned the digital value '1'; otherwise it remains '0'. The collective output of all the comparators creates a thermometer code representing the delay difference and essentially the input signal, which is further encoded into a binary code by a Wallace tree.
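The comparison along the delay lines can be pictured with a short behavioral sketch. The 130ps threshold and the 63 taps come from the text; the per-stage increment `step_ps` is not given here, so the 2ps value below is purely hypothetical, as is the function name.

```python
THRESHOLD_PS = 130.0  # comparator flip threshold quoted in the text


def delay_line_thermometer(input_delay_ps, step_ps=2.0, n_stages=63):
    """Thermometer code from the delay difference seen at each comparator tap."""
    code = []
    for stage in range(1, n_stages + 1):
        # Each fixed block adds step_ps to the delay difference
        # (step_ps is a hypothetical value chosen for illustration).
        difference = input_delay_ps + stage * step_ps
        code.append(1 if difference > THRESHOLD_PS else 0)
    return code


ones = sum(delay_line_thermometer(66.0))  # number of comparators that flipped
```

Because the delay difference grows monotonically along the chain, an ideal run produces a block of zeros followed by a block of ones, and a larger input-dependent delay moves the transition point toward the start of the chain.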

3.2.1 Voltage to Delay Building Block

After a simple pass-gate switch sample and hold circuit, the linearity of the voltage to delay block dominates the ADC linearity.

![Circuit topology of the N-block and P-block voltage-dependent variable delay blocks](image.png)

Figure 3.2: Circuit topology of the N-block and P-block voltage-dependent variable delay blocks

Unlike all previous delay line designs, which change the delay by varying the power supply [34], this design uses an interleaved nMOS and pMOS current starved method, which produces enough linearity for this 6-bit ADC. The schematics are shown in Figure 3.2. The N block accelerates the pulse propagation under a high control voltage, while the P block does the opposite. The sampled signal is applied to the ‘CONTROL’ pin. A pulse passes through the blocks, and the delay time is determined by the sampled signal levels. The overall voltage to delay block is shown in Figure 3.3. In this way, a linear relationship is formed between the analog voltage and the pulse delay difference over a wide input voltage range.

![Figure 3.3: Interleaved N-block and P-block forming the voltage-controlled variable-delay transmission line](image)

The sampled differential signals are applied to the P and N blocks. The delay difference between these two paths is proportional to the input voltage level. The voltage to delay characterization curve is shown in Figure 3.4. It shows a better than 40dB signal to noise and distortion ratio (SNDR) over a 400mV input range. This linearity is sufficient for a 6 bit ADC, which typically needs about 38dB SNDR [28].

![Figure 3.4: Input-delay transfer function of the voltage-controlled variable-delay transmission line](image)

**3.2.2 Delay Comparator**

Another important component in our scheme is the delay comparator. The schematic for this block is shown in Figure 3.5. During the preset phase, the capacitor is precharged to VDD; at the same time the Bit Output is set to GND. In the evaluation phase, the time difference between Pulse \( p \) and Pulse \( m \) determines the discharge time for capacitor \( C \). The transistors \( M2 \) and \( M3 \) form the discharge path. Transistors \( M4 \) to \( M7 \) comprise the latch which sets the comparator flip threshold time, which is 130ps for this design. If the time difference is larger than the preset threshold of 130ps, the capacitor will be fully discharged and the Bit Output will change to VDD.

\( M8 \) to \( M11 \) constitute the output buffer to drive the following encoder stage.

![Delay comparator schematic](image)

Figure 3.5: Delay comparator schematic
3.2.3 Wallace Tree Structure

There are 63 comparators for this 6 bit ADC. The direct output is a thermometer code, which needs to be converted to a 6 bit binary code. The encoder is a Wallace tree structure, which is fast and can efficiently suppress bubble errors in the thermometer code. A bubble error usually results from the timing difference between clock and signal lines and from comparator offsets. It is a situation where a ‘one’ may be found among ‘zeros’ or vice versa in the thermometer code [38].

The overall tree connections are shown in Figure 3.6. The basic building block is a full adder. As illustrated in the tree structure, A, B and C pins are the adder inputs, S is the sum output, Ca is the carry out. 5 levels of basic full adders are connected in the tree structure. 63 bits of thermometer code pass through the 5 levels of full adder network, and the output is a 6 bit binary code. From simulations, the critical path delay along this encoder is about 450ps in this 0.13 μm CMOS process.
Figure 3.6: 6 bit Wallace tree connections
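The ones-counting behavior of such an encoder can be sketched in software. This is a minimal model built only from full adders, as in the tree described above, but it processes the columns sequentially rather than in five parallel levels, so it says nothing about timing. The function names are assumptions for this illustration.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def ones_counter(bits):
    """Count the ones in `bits` using only full adders, column by column."""
    columns = {0: list(bits)}  # binary weight -> bits still to be added
    weight = 0
    while weight in columns:
        col = columns[weight]
        while len(col) > 1:
            a, b = col.pop(), col.pop()
            c = col.pop() if col else 0  # pad with 0 when only two bits remain
            s, carry = full_adder(a, b, c)
            col.append(s)  # the sum stays in this column
            columns.setdefault(weight + 1, []).append(carry)  # carry moves up
        weight += 1
    return sum(col[0] << w for w, col in columns.items() if col)


# A thermometer code with one bubble still encodes to the number of ones.
bubbled = [1] * 30 + [0, 1] + [0] * 31
value = ones_counter(bubbled)
```

Since the encoder simply counts ones, a single misplaced bit shifts the result by at most one code instead of producing a large error.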
3.3 Layout and Simulation Results

![Diagram of ADC layout in a 0.13μm digital CMOS process]

Figure 3.7: The ADC layout in a 0.13μm digital CMOS process

In Figure 3.7, we show the layout of the system in a standard 0.13μm digital CMOS process. This system has an area of approximately 0.3 mm$^2$. The high speed signals are placed close to the chip edge in order to reduce bonding wire length and hence wire inductance. Calibration pins are provided for comparator offset calibration. Decoupling capacitors are inserted in the spare area to guarantee clean VDDs and ground.

The simulation results of this ADC system in terms of differential and integral nonlinearity (DNL and INL respectively) are shown in Figure 3.8, and the dynamic performance is illustrated in Figure 3.9.
Figure 3.8: Results of simulated ADC DC performance (a) DNL (b) INL
Our analysis showed that this 6-bit ADC can work at speeds up to 300MS/s. It demonstrates 36.6dB SNR and 34.1dB SNDR (an Effective Number of Bits (ENOB) of 5.4) for a 99MHz input, with DNL<0.2LSB and INL<0.5LSB. The overall power consumption (including digital circuitry) is 2.7mW with a 1.2V power supply.
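The quoted ENOB follows from the standard relation between SNDR and resolution for an ideal quantizer driven by a full-scale sine wave, SNDR = 6.02N + 1.76 dB. The short check below applies that relation; the function names are chosen for this example only.

```python
def ideal_sndr_db(n_bits):
    """SNDR of an ideal N bit quantizer with a full-scale sine input."""
    return 6.02 * n_bits + 1.76


def enob(sndr_db):
    """Effective number of bits implied by a given SNDR."""
    return (sndr_db - 1.76) / 6.02


required_db = ideal_sndr_db(6)  # about 38 dB for 6 bits
effective_bits = enob(34.1)     # about 5.4 bits for the simulated SNDR
```

The same relation gives the roughly 38dB requirement quoted earlier for the voltage to delay block of a 6 bit converter.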

3.4 Chip Measurement

The die photo of the fabricated chip is shown in Figure 3.10. The die was wire-bonded into a CQFP package. The package was then soldered onto a custom printed circuit board for the ADC characterization.

![Image of ADC with function blocks marked out]

Figure 3.10: Die photo of the ADC, with function blocks marked out

The finished board is shown in Figure 3.11. All the high speed lines are matched to 50 Ohm by calculating transmission line impedances. Large amounts of decoupling capacitance are inserted between the VDDs and ground. Potentiometers are used for offset calibration.
Figure 3.11: Custom printed circuit board for ADC test

Figure 3.12: ADC test setup
The test setup is shown in Figure 3.12. Two signal sources were used as the clock input and the signal input. A digital oscilloscope was used to capture the output bits, which were saved to a laptop through a LAN connection. The codes were then recombined in a Matlab program.

During the design phase, a test structure for each of the function blocks was implemented separately, which enables separate characterization of different blocks. The analog to delay transfer block is the main contributor to linearity performance. The measured voltage to delay transfer curve is shown in Figure 3.13.

![Figure 3.13: The measured voltage to delay transfer curve under different supply voltages](image-url)

Figure 3.13: The measured voltage to delay transfer curve under different supply voltages
Table 3.1: Linearity Variation

<table>
<thead>
<tr>
<th>Supply [V]</th>
<th>SNDR [dB]</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1</td>
<td>28.57</td>
</tr>
<tr>
<td>1.2</td>
<td>41.08</td>
</tr>
<tr>
<td>1.3</td>
<td>34.49</td>
</tr>
</tbody>
</table>

Since the resolution of the oscilloscope is limited to 2.5ps, the test structure was designed to amplify the delay by 48 times to reduce the measurement error. A supply dependent linearity change is observed. The corresponding linearity specification is shown in Table 3.1.

Figure 3.14: Delay pulse edge time distribution on Tektronix digital oscilloscope

As the pulses travel along the delay line paths, noise gets added to the delay information. At the end of the test structure, the pulse edge crossing point distribution is observed and shown in Figure 3.14. There is about 10ps variation for the clock crossing point. In the test ADC, all the noise adds along the pulse travel path, and is accumulated as Gaussian noise.

Clocking the ADC at 100MS/s and sweeping the input DC voltages, we get the digital code outputs shown in Figure 3.15.

![Figure 3.15: Analog input sweeping](image)

Here we see nonlinear segments along this transfer curve. The main non-linearity comes from the following.
1. Supply voltage variation and drop from wire resistance,

2. Comparator offset,

3. Added jitter along the 63 stages.

In the next section, detailed solutions to these non-linearity contributors are studied.

### 3.5 Solutions to Non-linearity Contributors

#### 3.5.1 Supply Voltage

To address the sensitivity of the analog to delay block to supply variations shown in Figure 3.13, an on-chip bandgap is designed to guarantee an accurate power supply [40] [41].

The basic idea of a bandgap voltage source is to sum two voltages with opposite temperature coefficients, one positive and one negative, so that the sum is a temperature independent voltage. The forward voltage of a pn-junction diode exhibits a negative temperature coefficient, while the difference between the base-emitter voltages of two bipolar transistors is proportional to absolute temperature.

In a CMOS process with N-well options, a pnp bipolar transistor can be implemented. A bandgap voltage reference is illustrated in Figure 3.16.
The corresponding output reference voltage is

\[ V_{\text{out}} = V_{BE2} + (V_T \ln n)\left(1 + \frac{R_2}{R_3}\right) \tag{3.1} \]

where \( V_T = kT/q \), and \( n \) is the diode size ratio.
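To give a feel for the magnitudes in (3.1), the sketch below evaluates it numerically. The component values ($V_{BE2}$ = 0.65V, $n$ = 8, $R_2/R_3$ = 9.5, 300K) are hypothetical and are not taken from this design; only the formula and the physical constants are fixed.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_k):
    """V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def bandgap_vout(v_be2, n, r2_over_r3, temp_k):
    """Output voltage according to (3.1)."""
    return v_be2 + thermal_voltage(temp_k) * math.log(n) * (1 + r2_over_r3)


# Hypothetical values chosen only for illustration.
v_out = bandgap_vout(0.65, 8, 9.5, 300.0)
```

With these assumed values the proportional-to-temperature term contributes a little under 0.6V, so the output lands near 1.2V. In a real design $V_{BE2}$ falls as temperature rises, and the resistor ratio is chosen so that the two slopes cancel.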

At the same time, wide supply planes on chip and board are used to reduce the supply drop along the wiring lines.

Digitally controllable resistors \( R_2 \) and \( R_3 \) can also be implemented to fine tune the supply voltage. With these measures, the supply voltage is well controlled.

Following the principle discussed above, a bandgap was designed in a 0.13μm CMOS process. The simulated output voltage over temperature is shown in Figure 3.17. It shows a nearly constant output over more than 100 degrees of temperature variation.

![Bandgap voltage reference simulated result](image)

Figure 3.17: Bandgap voltage reference simulated result

### 3.5.2 Comparator Offset

Transistors have threshold voltage mismatch [42] [43],

\[
V_{offset} = \frac{a_{vth}}{\sqrt{WL}} \tag{3.2}
\]

where \(a_{vth}\) is the unit offset voltage, which is process dependent, and \(W\) and \(L\) are the transistor width and length.

In Figure 3.5, the threshold of M4–M7 is critical to the comparison. The offset can be reduced by increasing these transistor sizes, but at the cost of larger parasitic capacitances and therefore slower speed, so there is a tradeoff between sizing and speed. On the test chip, we incorporated tuning pins to compensate for the possible offset. With better characterization, it is possible to perform digital background calibration.
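The sizing tradeoff in (3.2) can be made concrete with a few lines of arithmetic. The coefficient of 4 mV·μm used below is a hypothetical placeholder, not a value characterized for this process, and the device sizes are likewise only examples.

```python
import math


def offset_mv(a_vth_mv_um, w_um, l_um):
    """Offset voltage according to (3.2), in millivolts."""
    return a_vth_mv_um / math.sqrt(w_um * l_um)


A_VTH = 4.0  # mV*um, hypothetical matching coefficient

small = offset_mv(A_VTH, 1.0, 0.13)  # minimum-length device, 1 um wide
large = offset_mv(A_VTH, 4.0, 0.13)  # same length, four times the width
```

Quadrupling the gate area halves the offset, while the gate capacitance grows roughly in proportion to the area, which is the speed penalty mentioned above.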

Another work-around is to use a pure digital flip-flop to compare the delay difference. Digital circuits can be viewed as the most stable version of analog circuits, since only ‘0’s and ‘1’s are present. The detailed design will be discussed in Section 3.7 which deals with the improved system.

3.6 Jitter Reduction and Measurement

Figure 3.14 shows that the accumulated jitter disturbs the pulse edge information, thus introducing errors.

In order to improve the clock edge performance, stronger clock buffers are needed to make the clock edge sufficiently sharp. System level innovations to reduce stage numbers will be discussed in the improved system (Section 3.7).

A novel on-chip jitter measurement circuit has also been designed to obtain the real jitter information, which will facilitate future jitter related research and design improvements.
### 3.6.1 Introduction

As clock and data rates of computer and communications systems increase, jitter becomes a crucial performance criterion. In a synchronous digital system, 1 ns of jitter can result in 1% data uncertainty with a 10 MHz system clock, while it is 10% data uncertainty for a 100 MHz clock [44]. Now that GHz systems are common, it is extremely important that jitter be taken care of during design and that appropriate test procedures are available to test manufactured chips for jitter. Many communications standards have added jitter requirements [45] [46], and the booming ultra-wide band (UWB) communications systems are also influenced by jitter performance [47] [48].

According to [49], 70 minutes are needed to measure a $10^{-12}$ bit error rate (BER); for a lower BER of $10^{-15}$, even with a very fast transmission rate of 10 Gb/s, it takes 2.5 hours to detect one bit error due to jitter [50]. By using jitter information, the system BER can be predicted ([51][52]) as

$$\text{BER}(X_s) = CDF(X_s) = \frac{1}{2}[1 - A + B]$$ (3.3)

where

$$A = \int_{-\infty}^{X_s} PDF_{\text{Left}}(\Delta X) d(\Delta X)$$ (3.4)

$$B = \int_{-\infty}^{X_s} PDF_{\text{Right}}(\Delta X) d(\Delta X)$$ (3.5)

where $X_s$ is the sampling instant, CDF is the cumulative distribution function of the total jitter, and PDF is the probability density function.
As data rates scale beyond several gigabits per second, the use of external testing equipment becomes more and more difficult due to the requirements for accuracy, complexity, cost and visibility of I/O characterization [53]. Therefore, built-in self test (BIST) and design for testability (DFT) for jitter testing are becoming the industry practice [54].

Time to digital converters (TDC) based on a Vernier delay line (VDL) have been widely adopted in BIST jitter test because of their simple structure and high resolution ([55] - [60]). A Vernier delay line for measurements is shown in Figure 3.18.

![Vernier delay line sampler](image)

Figure 3.18: Vernier delay line sampler

The delay times \( t_1 \) and \( t_2 \) are designed so that \( t_1 > t_2 \). The measurement resolution is determined by \( (t_1-t_2) \). If the input data clock is leading the reference clock, the stage at which the reference clock catches up with the data clock is indicated by the corresponding flip-flop Q output. In this way, the data jitter can be measured. However, VDL based jitter measurement circuits suffer from problems including delay element mismatch and the requirement for large silicon area.
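The catch-up principle of the Vernier delay line can be sketched behaviorally as below. The delays of 110ps and 100ps, the 10-stage length and the function name are hypothetical; the model assumes ideal, perfectly matched delay elements.

```python
def vdl_measure(lead_ps, t1_ps, t2_ps, n_stages):
    """Return the first stage where the reference edge has caught up, or None."""
    data_arrival = 0.0
    ref_arrival = lead_ps  # the reference edge starts lead_ps behind the data edge
    for stage in range(1, n_stages + 1):
        data_arrival += t1_ps  # data travels through the slower elements
        ref_arrival += t2_ps   # reference travels through the faster elements
        if ref_arrival <= data_arrival:
            return stage
    return None  # lead exceeds the range n_stages * (t1 - t2)


stage = vdl_measure(35.0, 110.0, 100.0, 10)  # resolution t1 - t2 = 10 ps
```

In this model a 35ps lead is resolved at stage 4, meaning the lead lies between 30ps and 40ps. Covering a larger range at the same resolution requires proportionally more stages, which is the area cost noted above.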

In order to address the issues in VDL schemes, the authors of [61] proposed an oscillator based implementation, which achieved component invariant performance. This design can be viewed as a folded VDL structure and the resolution is set by the period difference between two oscillators (Figure 3.19).

![Figure 3.19: Vernier oscillator measurement circuit](image)

Jitter measurement has also been implemented using analog approaches. In [44] [62] [63] the authors present the designs of a time-to-voltage converter (TVC), which operates in a continuous analog mode without the use of a sampling clock.

However, all these solutions consider jitter in only one direction, and do not discriminate between leading (earlier than desired clock) and lagging (later than desired clock) jitter. In order to obtain the complete information on the jitter in a circuit, it is necessary to invert the positions of the reference clock and signal under test, and repeat the entire test. However, for some time dependent signals or cases where the leading and lagging jitter are correlated, two separate tests may still not produce correct jitter information.

In this research, a complete design has been implemented to measure the two directions of jitter. The technique produces accurate information on the jitter distribution, and saves measurement time by doubling the chances of capturing jitter compared with previous designs.

3.6.2 Prototype Scheme

For a proof-of-concept of the ideas developed in this research, we based our design on a Vernier oscillator jitter measurement approach. It should be mentioned that this bi-directional idea can also be extended to VDL or TVC based jitter measurements. The whole system can be divided into four parts according to functionality as illustrated in Figure 3.20.

The edge detectors detect the rising edges of the signal and reference clocks, then generate pulses to indicate the approaching clock event.

Figure 3.20: Bi-directional jitter measurement

After the edge detectors, there are circuits to discriminate between leading and lagging events and to generate a sign bit. The edge discrimination circuits guarantee that the earlier signal (either the signal under test or the reference clock) is sent to the path with the slower oscillator, and the later signal is sent to the path with the faster oscillator. The sign bit is generated to indicate a leading or lagging state. In this design, a high pulse output corresponds to a lagging signal, which arrives later than the reference clock.

There are two oscillators with different oscillation frequencies. Each one is triggered to oscillate at an input pulse edge. At the same time, the counter counts the periods of the faster oscillator.

The phase detector sends out a pulse when the faster oscillator catches up with the slower oscillator. Triggered by the phase detected pulse, the readout circuit records the counter output and sign bit. The phase detector output signal is also used as a feedback signal to reset the system. The detailed explanations of the circuits are given in the next section.
3.6.3 Circuit Implementation

3.6.3.1 Edge Detector

In this prototype design, a very simple D flip-flop is used as the edge detector with the reset and D tied together (Figure 3.21).

The detector is initialized by a low logic level to the RST&D input, then waits for the first rising clock event. When a rising clock edge arrives, the output will change from low to high.

![Figure 3.21: Edge detector](image)

3.6.3.2 Sign Bit Generation Circuit

We need to design logic circuitry to send the earlier edge detector output edge to the slower branch in the oscillator stage, and the later edge to the faster branch. In order to eliminate the asymmetrical behavior of normal NAND and NOR gates, symmetric NAND and NOR gates are carefully designed to treat the two edges equally (Figure 3.22).
Figure 3.22: (a) Symmetric NOR gate (b) Symmetric NAND gate

Leading and lagging edge selection and sign bit generation logic using the symmetric gates are shown in Figure 3.23.
3.6.3.3 Oscillator

Two triggered oscillators with different oscillation frequencies are designed as shown in Figure 3.24.
When the trigger input is high, the oscillator begins to oscillate. The difference between the periods of these two oscillators is tuned by the control voltage of the delay element. In our simulation setup, the difference is set to 14.5ps, which is the resolution of the measurement system.

### 3.6.3.4 Phase Detector and Feedback

The schematic of the phase detector is shown in Figure 3.25. When the slower oscillator output changes from high to low, and if the rising edge of the faster oscillator signal arrives, the detector output will be high. This detector output is used to control the counter readout, and is used as the feedback signal to reset the edge detectors, oscillators and counter.

![Phase detector schematic](image)

Figure 3.25: Phase detector

### 3.6.4 Calibration and Simulation Results

The entire design was implemented with UMC 0.18μm CMOS foundry models, and simulated using the Agilent Advanced Design System (ADS) software environment.

3.6.4.1 Calibration

Although the oscillation frequencies are known during simulation, calibration is necessary after the chip is fabricated because of unpredictable layout parasitics and process variations.

First, the signal input and reference clock input are connected together to receive one pulse. Because of the design with a symmetric branch, the two oscillators will start to oscillate at the same time. When the rising edge of the faster oscillator moves across one cycle of the slower oscillator, the phase detector will generate a high pulse, and the counter readout is recorded as a value $N_0$. Then the faster oscillation period $T_f$ and the slower oscillation period $T_s$ have the following relation:

$$N_0 \cdot T_f = (N_0 - 1) \cdot T_s$$

(3.6)

We want to know the system resolution $\Delta T = T_s - T_f$. Substituting $T_s = T_f + \Delta T$ into (3.6) gives $(N_0 - 1)\Delta T = T_f$.

So,

$$\Delta T = T_f/(N_0 - 1)$$

(3.7)

$T_f$ can be calculated by measuring the cycle time of one of the counter bits.

$$T_f = \frac{T_c}{2^n}$$

(3.8)
where \( n \) is the bit position referred to the least significant bit of the counter (starting from 0), and \( T_c \) is the cycle time of the n-th counter bit. In this design, \( T_f = 0.9545 \text{ns} \) and \( N_0 = 67 \), so the resolution \( \Delta T \) is 0.0145ns.
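The calibration arithmetic can be checked directly from (3.6) and (3.7) with the values quoted for this design. The function name below is an assumption for this sketch.

```python
def vernier_resolution(t_fast_ns, n0):
    """Resolution from (3.7): delta_T = T_f / (N0 - 1)."""
    return t_fast_ns / (n0 - 1)


T_FAST_NS = 0.9545
N0 = 67

delta_t_ns = vernier_resolution(T_FAST_NS, N0)  # about 0.0145 ns
t_slow_ns = T_FAST_NS + delta_t_ns              # slower oscillator period

# Both sides of (3.6) should agree: N0 * T_f = (N0 - 1) * T_s.
lhs = N0 * T_FAST_NS
rhs = (N0 - 1) * t_slow_ns
```

The computed resolution rounds to the 0.0145ns used in the measurements that follow.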

### 3.6.4.2 Simulation Results

With the calibration data, the measurement can be performed. Two measurements are shown here as examples.

First, a pulse, which is 0.1ns later than reference clock, is sent into the input. The counter readout is 6, so the measured jitter value is \( 6 \times 0.0145 \text{ns} = 0.087 \text{ns} \), and the error is the difference between the measurement and the real delay values, which is 0.1 - 0.087 = 0.013ns. The waveforms are shown in Figure 3.26.

For a higher negative jitter of -0.5ns (jitter edge comes 0.5ns later than the clock), the counter readout is 35, and the sign bit is 1. The measured jitter is \(-35 \times 0.0145 = -0.5075 \text{ns}\), and the error is 7.5ps.
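The conversion from counter readout to jitter in these two examples is a single multiplication plus a sign. The sketch below assumes, following the second example, that a sign bit of 1 marks a lagging edge and is reported as a negative value. The sign bit of the first example is not stated, so only its magnitude is checked.

```python
DELTA_T_NS = 0.0145  # resolution obtained from the calibration step


def jitter_from_readout(count, sign_bit, delta_t_ns=DELTA_T_NS):
    """Convert a counter readout and sign bit into a signed jitter value in ns."""
    magnitude = count * delta_t_ns
    # Assumed convention: sign bit 1 means lagging, reported as negative.
    return -magnitude if sign_bit else magnitude


first = abs(jitter_from_readout(6, 0))  # 0.087 ns for an injected 0.1 ns
second = jitter_from_readout(35, 1)     # -0.5075 ns for an injected -0.5 ns

first_error_ns = 0.1 - first            # 0.013 ns
second_error_ns = abs(-0.5 - second)    # 0.0075 ns, i.e. 7.5 ps
```

Both errors are smaller than one resolution step of 14.5ps, which is what quantization alone would allow.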

In order to demonstrate the capability of the circuit to handle continuous random jitter measurement, a clock with jitter having a Gaussian distribution was generated in Matlab, and passed to the Advanced Design System (ADS) simulation setup. Because of the prohibitively long simulation time, a moderate number (110) of measurement instances were recorded.

Figure 3.26: Measurement waveforms, from the top to the bottom: signal triggered edge, clock triggered edge, counter input and phase detector output

Figure 3.27 shows good tracking of the measured jitter distribution with the input jitter. The variance and mean of the input jitter are \( 0.091 \text{ns}^2 \) and -0.02617ns, and the measured values are $0.087ns^2$ and -0.01674ns, which shows that this bi-directional system can measure the jitter distribution accurately.
3.6.5 Discussions

- Error sources

  - Quantization noise

    When the counter counts the periods, there will be quantization error. This can be improved by setting a smaller $\Delta T$. However, because of the following two limiting factors, $\Delta T$ cannot be made too small.

  - Resistor thermal noise

    After each measurement circuit stage, the thermal noise accumulates and introduces errors. So keeping a simpler signal measurement path is a design strategy for noise reduction.

  - Capacitance charging and discharging

    The charges stored in the capacitances are determined by the previous and present signal polarities and settling time, which adds uncertainties to the measurement circuitry.

- Dynamic range

  The dynamic range is limited by the minimum delay difference of the two oscillators and the uncertainty of the transient circuit operation. From the simulation, the absolute value of the minimum accurately measured jitter is around 50ps, which explains the small discrepancy around 0ns between the injected and the measured jitter histograms. The maximum accurately captured jitter is half of the period of the slower oscillator, from the simulation. It is about 500ps in this design, so the dynamic range is around 20dB for this system.

- Test time

  For a single jitter event of time $T_j$, the test time $T$ is $T = \frac{|T_j|}{\Delta T} \cdot T_f$, since the counter needs about $|T_j|/\Delta T$ periods of the faster oscillator before the phase detector fires.

3.6.6 Conclusions

A bi-directional jitter measurement scheme has been developed, and a design was implemented with foundry models. A calibration scheme was presented along with an explanation of simulation results. The results show good accuracy in the measured jitter information. This scheme achieves more detailed jitter information and shorter test time. This bi-directional scheme can also be used in VDL and TVC jitter measurement circuits by measuring both leading and lagging jitter edges. For the specific time domain ADC, the jitter measurement helps to investigate the influence of jitter and improve future designs.

3.7 Improved Designs

Based on the test chip results and the analysis of possible improvements, several system level innovations on the time domain ADC are proposed and evaluated using simulation in this section.

3.7.1 Robust ADC

In the previous discussion on comparator offset in section 3.5.2, a flip-flop design was proposed to work more robustly as a delay comparator. The following system structure is designed to incorporate flip-flops into the comparison scheme to produce a robust system with smaller offset.

The system structure is shown in Figure 3.28.

The sample and hold, and voltage to delay blocks are the same as in the prototype design. However, after the fast pulse and the slow pulse are generated, the slow pulse is passed directly to 63 flip-flop data inputs. The fast pulse is delayed by a fixed delay 'd' per stage along 63 delay blocks, similar to [36]. At each of the flip-flops, if the slow pulse edge is slower than the delayed version of the fast pulse, the bit output is high (the inverted 'Q' of the flip-flop is the output bit); otherwise, a low bit is the output. The thermometer code is then encoded into binary code with the Wallace tree adder.
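The comparison scheme described above can be sketched behaviorally as follows. This is a minimal illustration under stated assumptions, not the circuit itself: the function names are hypothetical, edge times are idealized integers in picoseconds, flip-flop setup/hold time and metastability are ignored, and the Wallace tree adder is modeled simply as a count of ones.

```python
from typing import List

STAGES = 63  # number of flip-flop comparators, as in the text


def thermometer_code(t_fast_ps: int, t_slow_ps: int, d_ps: int,
                     stages: int = STAGES) -> List[int]:
    """Behavioral model of the flip-flop comparator bank.

    Stage k (k = 1..stages) sees the fast edge delayed by k * d_ps.
    Its output bit is 1 when the slow edge arrives later than that
    delayed fast edge, otherwise 0. Setup/hold time and metastability
    are ignored (assumption of this sketch).
    """
    return [1 if t_slow_ps > t_fast_ps + k * d_ps else 0
            for k in range(1, stages + 1)]


def ones_count(bits: List[int]) -> int:
    """Stand-in for the Wallace tree adder: count the 1s in the code."""
    return sum(bits)


# Example: 30 ps per stage, slow edge 315 ps after the fast edge.
bits = thermometer_code(t_fast_ps=0, t_slow_ps=315, d_ps=30)
code = ones_count(bits)  # ten stages see the slow edge arrive later
```

In this idealized model the output code is the number of whole stage delays that fit into the fast-to-slow edge difference, so the per-stage delay 'd' directly sets the size of one LSB.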

Although the robustness is improved with a purely digital flip-flop as the comparator, the speed is lower, since the fixed delay of a buffer in this 0.13μm CMOS technology is about 30ps. In order to cover the full 63-stage delay range, the voltage to delay block has to generate a sufficient delay, which can be as much as $30 \times 63$ps, plus the transmission delay through all other blocks. The simulated speed is now around 50MS/s.
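As a rough check of the delay range quoted above, only the product is evaluated here. The additional transmission delay through the other blocks is not quantified in the text and is left out:

$$63 \times 30\,\text{ps} = 1890\,\text{ps} \approx 1.9\,\text{ns}$$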

A compromise between speed and robustness is to replace the 63 delay stages with a Vernier delay line structure as shown in Figure 3.18. Then, each stage delay could be adjusted according to the speed and offset requirements.

3.7.2 Time Sub-ranging ADC

The series chain structure of the prototype chip has 63 delay stages, which introduces extra jitter noise and could result in a mismatch problem along the long chain. The improvement is to reduce the number of delay comparison stages, thus reducing the extra noise interference and increasing the speed.

The general architecture of the N bit analog-to-digital converter is illustrated in Figure 3.29. The analog input (the Sample & Hold circuitry is not shown here) is first converted into two pulses denoted as 'Fast' and 'Slow' edges, and the pulse edge time difference is proportional to the amplitude of the voltage.

Suppose the full range analog input corresponds to a time delay difference of R. The two pulses pass through all the following stages. In each of the stages S (S = 1, 2, ..., N-F, where F is the number of finer code bits), the pulse width is compared with a reference time $R/2^S$; if it is longer than the reference, the digital code $\text{Bit}<N+1-S>=1$, otherwise $\text{Bit}<N+1-S>=0$.

With the sequential property of time delay, it is easy to implement the delay comparison by inserting the reference delay in the fast pulse path.

Figure 3.29: Time sub-ranging ADC

The digital logic that follows decides the output bit value and correspondingly passes on either the delayed version of the two pulses (in the case of $\text{Bit}<N+1-S>=1$) or the original two pulses (in the case of $\text{Bit}<N+1-S>=0$). When the pulses become so small that the reference delay is difficult to generate, a finer F bit code generator is applied. Within the F bit code generator, $(2^F - 1)$ finer delay stages are connected sequentially as in the original prototype design. At each of the stages, the pulse delay difference is compared with a finer time threshold, thus generating a $(2^F - 1)$ bit thermometer code, which is converted into a binary code with a decoder.
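The conversion procedure described above can be summarized in a short behavioral sketch. It is an idealized model under stated assumptions, not the implemented circuit: the function name and its parameters are hypothetical, the delay difference is treated as a single number, the comparison is noise-free, and "longer than the reference" is modeled as a strict inequality.

```python
from typing import List, Tuple


def time_subranging_convert(dt: float, full_range: float,
                            n_bits: int, fine_bits: int
                            ) -> Tuple[List[int], int]:
    """Convert a delay difference dt in [0, full_range) to an N bit code.

    Returns (bits, code) with bits listed MSB first.
    """
    residual = dt
    coarse: List[int] = []
    # Coarse stages S = 1 .. N-F: compare against R / 2**S.
    for s in range(1, n_bits - fine_bits + 1):
        ref = full_range / 2 ** s
        bit = 1 if residual > ref else 0
        if bit:
            # The delayed pulses are passed on, so the remaining
            # delay difference shrinks by the reference delay.
            residual -= ref
        coarse.append(bit)

    # Fine F bit generator: (2**F - 1) thresholds spaced one LSB apart.
    lsb = full_range / 2 ** n_bits
    thermometer = [1 if residual > k * lsb else 0
                   for k in range(1, 2 ** fine_bits)]
    fine_value = sum(thermometer)  # thermometer-to-binary decoder
    fine = [(fine_value >> i) & 1 for i in reversed(range(fine_bits))]

    bits = coarse + fine
    code = 0
    for b in bits:
        code = (code << 1) | b
    return bits, code


# Example: N = 6, F = 3, full range R = 64 time units (1 LSB = 1 unit).
bits, code = time_subranging_convert(37.5, 64.0, n_bits=6, fine_bits=3)
```

With these example values the three coarse stages resolve the bits 1, 0, 0 and leave a residual of 5.5 units, which the seven fine thresholds turn into five ones in the thermometer code.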

To illustrate the principle of this method, a 6-bit example is presented in the following. The system described below in Figure 3.30 shows the advantages of our system both in terms of power performance and compatibility with sub-micron semiconductor fabrication processes.

![Figure 3.30: A 6-bit ADC using time sub-ranging architecture](image)

The input voltage signal is initially sampled differentially. In the voltage to delay transfer block, the sampled differential signal sets the delays of two variable delay active transmission lines. In one path it increases the speed and reduces the delay, which leads to the ‘Fast’ edge; while in the other it decreases the speed and increases the delay, which leads to the ‘Slow’ edge. At the same time, the time difference between the edges is proportional to the difference between the signal levels.

At the following 3 stages, the two pulses are compared with a half full range delay, a quarter full range delay, and a one eighth full range delay, respectively. This generates the three most significant bits (MSBs). Within the 3rd stage, a 7 stage finer delay comparison circuit is designed, which generates a 7 bit thermometer code. Then, through a Wallace tree adder, the thermometer code is converted into the corresponding binary code, which gives the 3 least significant bits (LSBs).

The digital logic within each of the stages is similar. As shown in Figure 3.31, the ‘Fast’ pulse is first delayed by the reference delay block, and then compared with the ‘Slow’ pulse. In this design, if after the reference delay, the ‘Fast’ pulse is still earlier than the ‘Slow’ pulse, the D Flip-flop will generate a ‘1’, otherwise a ‘0’ as the MSB. Then the MSB will also be used as a multiplexer select signal to decide whether the delayed ‘Fast’ pulse or the original ‘Fast’ pulse will be passed to the next stage.
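The per-stage logic described above (reference delay, D flip-flop decision, multiplexer) can be sketched at the level of edge arrival times. This is a hedged illustration only: the function names are hypothetical, and the propagation delays of the flip-flop and multiplexer are assumed to be zero, which the real circuit would have to match between the two paths.

```python
from typing import List, Tuple


def subranging_stage(t_fast: float, t_slow: float,
                     t_ref: float) -> Tuple[int, float, float]:
    """One coarse stage operating on edge arrival times.

    The 'Fast' edge is delayed by t_ref and compared with the 'Slow'
    edge. The resulting bit selects which 'Fast' edge goes on to the
    next stage. Gate delays are ignored (assumption of this sketch).
    """
    delayed_fast = t_fast + t_ref
    # D flip-flop decision: 1 if the delayed 'Fast' edge is still earlier.
    bit = 1 if delayed_fast < t_slow else 0
    # Multiplexer: pass the delayed edge on a 1, the original edge on a 0.
    fast_out = delayed_fast if bit else t_fast
    return bit, fast_out, t_slow


def coarse_msbs(t_fast: float, t_slow: float, full_range: float,
                stages: int = 3) -> Tuple[List[int], float]:
    """Chain the stages with references R/2, R/4, ... and return the
    MSBs together with the residual delay difference for the fine stage."""
    bits: List[int] = []
    for s in range(1, stages + 1):
        bit, t_fast, t_slow = subranging_stage(
            t_fast, t_slow, full_range / 2 ** s)
        bits.append(bit)
    return bits, t_slow - t_fast


# Example: full range of 64 time units, 'Slow' edge 37.5 units late.
msbs, residual = coarse_msbs(0.0, 37.5, 64.0)
```

Modeling the stage on edge times rather than on a delay difference makes the role of the multiplexer visible: the 'Slow' edge is never modified, and only the 'Fast' path accumulates the reference delays whose bits resolved to 1.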

Because of the dramatic reduction in the number of delay comparison stages (from 63 to 7), the analysis with a standard digital 0.13μm CMOS process shows that this 6-bit ADC can work up to speeds of 1GS/s, with better than 5 bit ENOB, and less than 10mW power consumption.

3.7.3 Pipeline ADC

The series property of the single ADC also makes it possible to pipeline the design to achieve faster speed. Even with extra control logic introduced, the overall speed will be accelerated. The drawback of this scheme is that it requires a very fast sample and hold circuit, which is usually power hungry. An accurate timing scheme is also required.

The system structure is illustrated in Figure 3.32. With pipeline buffers added at each clock, the result from each stage will be recorded. With proper timing and logic control, the appropriate stage results are combined.

3.8 Conclusions

In this chapter, a novel time domain ADC for on-chip RF detector output sampling is prototyped and analyzed.

Based on a linear voltage to delay transfer, the main digitizing function is performed in time with digital logic, thus achieving a more efficient design in terms of power and speed.

From the characterization results of the test chip, several improvements are designed and analyzed. System level innovations have also been investigated. It is worth mentioning that the time domain ADC investigated in this research follows the trend of process scaling, and is very promising in high (or medium) speed and medium (or low) resolution applications.

Chapter 4

Conclusions and Future Directions

A built-in scheme for test of RF subsystems based on built-in detectors was proposed, and test chips were designed, simulated, fabricated in a standard CMOS process, and tested with packages and boards in the lab, including a 940MHz RF transceiver frontend chip in 0.18 μm CMOS process, and a time domain ADC prototype chip in 0.13 μm CMOS process.

For the RF subsystem test, an improved on-chip amplitude detector was designed, and the corresponding test theory was developed. With the on-chip detector, a down conversion mixer was tested under process variations and influence of noise. A receiver path was simulated and measured with the embedded detectors. Both system specifications and the LNA and mixer performance were evaluated through the detector outputs. Simulation and measurement results showed accurate specification prediction.

Then a loopback test for a full transceiver RF frontend was performed. The detectors were inserted into the transmitter and receiver paths. Transmitter and receiver specifications were predicted accurately from the detector outputs, and discrete RF components were characterized at the same time through the detector array. Two new detectors were also designed and analyzed.

In order to facilitate the on-chip RF test, a novel low power, high speed ADC was designed. The simulation results show 36.6dB SNR, 34.1dB SNDR for 99MHz input, DNL<0.2LSB, and INL<0.5LSB. Overall chip power is 2.7mW with a 1.2V power supply. The measured voltage to delay transfer block shows a linearity of 41dB SNDR under a 1.2V power supply. Based on the chip characterization, several design improvements were proposed for a better optimized design. System level innovations were also studied to achieve designs which are more robust and efficient with respect to power and speed.
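For reference, the simulated SNDR can be translated into an effective number of bits with the standard relation for a full-scale sinusoidal input. This conversion is not given in the text and is added here only as a rough check, assuming the 34.1dB figure was obtained with a full-scale sine input:

$$\mathrm{ENOB} = \frac{\mathrm{SNDR} - 1.76\,\text{dB}}{6.02\,\text{dB/bit}} = \frac{34.1 - 1.76}{6.02} \approx 5.4\ \text{bits}$$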

4.1 Future Directions

Future market-driven integrated circuit development will boom in the consumer electronics area. High volume, medium performance and a high level of integration are some of the features of this class of electronics. The SoCs of the future, including microprocessor blocks, graphics processing units and cell phone chips, and chips in personal computers (PCs) and other consumer electronics products, such as Bluetooth, global positioning system (GPS), universal serial bus (USB), and digital versatile disc (DVD), are expected to be implemented in the latest standard CMOS processes to take full advantage of process migration and reduce die area and power consumption.

The main competition among the chip design houses and manufacturers will be focused on chip functionality with minimum area and power consumption, and high yield. The area and yield requirements are all, in fact, cost critical. With more RF and analog integration, the die area and test cost are increasing. Now RF test issues are becoming a very important part of the design from the beginning for large systems, and need to be planned systematically.

While most of the following aspects are covered in the research described in this dissertation, related research is suggested for future exploration.

1. System level co-design and plan for on-chip RF test

   With the complexity of a SoC increasing, it is worthwhile to plan RF test from the design phase. The critical nodes which need to be probed, the signal flow, test load etc., can all be organized efficiently, if a systematic plan is followed.

2. Base band test with on-chip detector

   With more powerful baseband cores embedded today in a SoC, it is more efficient to map all the test parameters to the baseband, or to lower frequencies. RF specifications can be coupled to lower frequency data through on-chip detectors, thus facilitating base band processing.

3. Feature characterization for detectors and circuits

   For different applications, there are optimum detectors for specific circuits. More research needs to be done to identify the relationship between detector outputs and circuit specifications.

4. On-chip adaptive test and digital calibration

   With detectors for on-chip RF test, the measured specifications can be used to calibrate the circuit with digital feedback tuning. In this work, we demonstrated one board-level tuning algorithm. In the future, this method can be combined with programmable analog design and digital-aided analog calibration.

Bibliography


[26] Fundamentals of RF and Microwave Noise Figure Measurement. *Agilent Application Note AN 57-1*

[27] Agilent ESA-E Series and PSA Series Spectrum Analyzer Noise Figure Measurement Personality Guide Option 219.
[28] Rudy van de Plassche. CMOS Integrated Analog-to-Digital and Digital-


Vita

Chaoming Zhang was born in Wenshang County, Shandong Province, China on 28 January 1980, the son of Jindong Zhang and Guiqing Wang. He received the Bachelor of Science degree in Electronic Information and Technology from Shandong University, China in 2003. He then enrolled at the University of Ulm, Germany, and received the Master of Science degree in Communications Technology in 2005, majoring in communications integrated circuit design. In 2004, he interned at DaimlerChrysler AG Research and Technology, Ulm, working on Electromagnetic Compatibility (EMC) test system design. In 2005, he interned at Infineon Technologies AG, Ulm, where he worked on communication IC test. He started his Ph.D. work on mixed signal and RF IC design and test in September 2005 at the University of Texas at Austin. In Summer 2007, he did an internship at Alereon Inc., Austin, working on high speed low power ADCs for UWB ICs. Since Fall 2007, he has been working at Broadcom as an analog & RF designer.

Permanent address: 1630 West Sixth Street, Apt. M
Austin, Texas 78703

This dissertation was typeset with $\LaTeX$† by the author.

†$\LaTeX$ is a document preparation system developed by Leslie Lamport as a special version of Donald Knuth’s $\TeX$ Program.