Copyright by Wenjuan Guo 2016

The Dissertation Committee for Wenjuan Guo certifies that this is the approved version of the following dissertation:

# Fully-Passive Switched-Capacitor Techniques for High Performance SAR ADC Design

Committee:

Nan Sun, Supervisor

Michael Orshansky

Ahmed H. Tewfik

T. R. Viswanathan

Rachel A. Ward

# Fully-Passive Switched-Capacitor Techniques for High Performance SAR ADC Design

by

Wenjuan Guo, B.E.

#### DISSERTATION

Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of

### **DOCTOR OF PHILOSOPHY**

THE UNIVERSITY OF TEXAS AT AUSTIN

May 2016

Dedicated to my parents, sister, and husband.

## Acknowledgments

Five years ago, when I first set foot on this land, which was completely strange to me, I knew the future would be a long and tough journey. But I never guessed how far I would travel and how many people would guide me along the way. It is because of these people that all the ups and downs I went through have become beautiful memories that I will keep forever. For me, this thesis is more a memoir of the days and nights when we worked together than a pure summary of my research.

First of all, like every student, I owe an irredeemable debt to my supervisor, Dr. Nan Sun. He is not only a teacher who gives me knowledge and advice, but also a friend who always stands by my side and fully supports me. Without his guidance and support, this thesis and all my research would not have been possible. I would also like to express my sincere gratitude to my committee members, Dr. Michael Orshansky, Dr. Ahmed H. Tewfik, Dr. T. R. Viswanathan, and Dr. Rachel A. Ward, for their valuable time and critical comments. I would like to thank my manager, Ms. Tsedeniya Abraham, and my mentor, Dr. Chintan Trehan, from my co-op at Texas Instruments. Working with them not only improved my professional skills but also built up my confidence in expressing myself and communicating with others. I am also greatly indebted to Ms. Laura Herriott, who awarded me the 2014-2016 Texas Instruments Fellowship. Free from financial worries, I could devote myself completely to study and research. Another important person I have to thank is my partner, Youngchun Kim, who led me into the field of compressive sensing. I would also like to thank my labmates, Kareem Ragab, Dr. Arindam Sanyal, Yeonam Yoon, Long Chen, Shaolan Li, Xiyuan Tang, Sungjin Hong, and Jeonggoo Song, for making my PhD life full of happiness and joy.

Family is always my strongest support. However, to realize my dreams, I have had to be away from my parents, sister, and husband. With apologies, I greatly appreciate their sacrifices and support, which motivate me to work harder. I especially thank my dear husband, Dr. Yuankai Yue, who has encouraged and accompanied me through all the difficulties. At this moment, I am full of gratitude to all the people who have given me assistance along the way. You have brought me one step closer to reuniting with my family. Thank you.

## Fully-Passive Switched-Capacitor Techniques for High Performance SAR ADC Design

Wenjuan Guo, Ph.D. The University of Texas at Austin, 2016

Supervisor: Nan Sun

In recent years, the SAR ADC has become increasingly popular in low-power applications such as wireless sensors and low-energy radios due to its circuit simplicity, high power efficiency, and scaling compatibility. However, its speed is limited by its successive approximation procedure, and its power efficiency drops sharply once the ADC resolution goes beyond 10 bits. To address these issues, this thesis proposes to embed two techniques into a conventional SAR ADC: 1) compressive sensing (CS) and 2) noise shaping (NS). Both are realized with fully-passive switched-capacitor techniques.

CS is a recently emerging sampling paradigm, stating that the sparsity of a signal can be exploited to reduce the ADC sampling rate below the Nyquist rate. Unlike conventional CS frameworks, which require dedicated analog CS encoders, this thesis proposes a fully-passive CS-SAR ADC architecture that requires only minor modifications to a conventional SAR ADC. Two chips are fabricated in a 0.13 $\mu$m process to prove the concept. One chip is a single-channel CS-SAR ADC that reduces the ADC conversion rate by 4 times, thus reducing the ADC power by 4 times. In many wireless sensing applications, multiple ADCs are commonly required to sense multi-channel signals, as in multi-lead ECG sensing and parallel neural recording. Therefore, the other chip is a multi-channel CS-SAR ADC that simultaneously converts 4-channel signals at a sampling rate equal to one channel's Nyquist rate. At 0.8 V and 1 MS/s, both chips achieve an effective Walden FoM of around 5 fJ/conversion-step.

This thesis also proposes a novel NS SAR ADC architecture that is simple, robust, and low power for high-resolution applications. Compared to conventional $\Delta\Sigma$ ADCs, it replaces the power-hungry active integrator with a passive integrator that requires only one switch and two capacitors. Compared to previous 1<sup>st</sup>-order NS SAR ADC works, it achieves the best NS performance and can be easily extended to 2<sup>nd</sup>-order. A 1<sup>st</sup>-order 10-bit NS SAR ADC is fabricated in a 0.13 $\mu$m process. Through NS, the SNDR increases by 6 dB for each doubling of the OSR, achieving a 12-bit ENOB at OSR = 8. An improved 2<sup>nd</sup>-order 9-bit NS SAR ADC is designed and simulated in a 40 nm process. Its SNDR increases by 10 dB for each doubling of the OSR, achieving a 14-bit ENOB at OSR = 16. At a bandwidth of 312.5 kHz, the Schreier FoM is 181 dB and the Walden FoM is 12.5 fJ/conversion-step, showing that the proposed NS SAR ADC architecture can achieve high resolution and high power efficiency simultaneously.

# **Table of Contents**

- Acknowledgments
- Abstract
- List of Tables
- List of Figures
- Chapter 1. Introduction
- Chapter 2. Single-Channel Compressive Sensing SAR ADC
  - 2.1 Background
  - 2.2 Compressive Sensing Theory
    - 2.2.1 Sparsity
    - 2.2.2 Incoherence
    - 2.2.3 Reconstruction
  - 2.3 Compressive Sensing Frameworks
    - 2.3.1 State-of-The-Art
    - 2.3.2 Proposed CS SAR ADC Architecture
  - 2.4 Circuit Implementation and Analysis
    - 2.4.1 Clock Generator
    - 2.4.2 Mixer
    - 2.4.3 DAC Array
    - 2.4.4 Comparator
    - 2.4.5 Asynchronous SAR Logic
  - 2.5 Measurement Results
    - 2.5.1 Nyquist-Mode Measurement Results
    - 2.5.2 CS-Mode Measurement Results
    - 2.5.3 Chip Performance Comparison
- Chapter 3. Multi-Channel Compressive Sensing SAR ADC
  - 3.1 Background
  - 3.2 Proposed Architecture
  - 3.3 Introduction to OMP and SOMP
  - 3.4 Circuit Implementation
    - 3.4.1 Low Power Switching Without Common-Mode Voltage Variation
    - 3.4.2 DAC Arrangement
    - 3.4.3 4-Channel Sampling
    - 3.4.4 Synchronous Clock Generation and SAR Logic
  - 3.5 Measurement Results
    - 3.5.1 Nyquist-Mode Measurement Results
    - 3.5.2 CS-Mode Measurement Results
      - 3.5.2.1 Discrete-Tone Signals
      - 3.5.2.2 Real-World Sparse Signals
- Chapter 4. Noise Shaping SAR ADC
  - 4.1 Background
  - 4.2 Proposed 1<sup>st</sup>-Order NS SAR ADC Architecture
  - 4.3 Chip Measurement Results for The Proposed 1<sup>st</sup>-Order NS SAR ADC
  - 4.4 Proposed 2<sup>nd</sup>-Order NS SAR ADC Architecture
  - 4.5 SPICE Simulation Results for The Proposed 2<sup>nd</sup>-Order NS SAR ADC
- Chapter 5. Conclusion
- Appendix
- Appendix 1. List of Publications
  - 1.1 Patents
  - 1.2 Published Papers
  - 1.3 Submitted Papers
  - 1.4 Papers in Preparation
  - 1.5 Miscellaneous
- Bibliography
- Vita

# List of Tables

| Table | Title                                     |
|-------|-------------------------------------------|
| 2.1   | Comparison with state-of-the-art CS works |
| 3.1   | Comparison with state-of-the-art CS works |
| 4.1   | Comparison with state-of-the-art CS works |
| 4.2   | Performance summary and comparison        |

# **List of Figures**

| Figure | Caption |
|--------|---------|
| 1.1 | ADC performance comparisons in terms of different architectures. |
| 1.2 | SAR ADC architecture. |
| 1.3 | 4-bit SAR conversion. |
| 2.1 | Sparse signal acquisition systems. (a) Nyquist-rate ADC with subsequent digital compression. (b) Feature extraction with subsequent low-rate ADC. (c) CS encoder with subsequent low-rate ADC. (d) Proposed CS SAR ADC. |
| 2.2 | State-of-the-art CS frameworks and proposed CS framework. (a) Random demodulator. (b) Random-modulation preintegrator/modulated wideband converter. (c) Non-uniform sampler. (d) Proposed CS framework. |
| 2.3 | Circuit and timing diagram for the proposed 12-bit CS SAR ADC. |
| 2.4 | DAC array configuration of the proposed 12-bit CS SAR ADC in: (a) the sampling cycles $\phi_1$-$\phi_4$ and (b) the quantization cycle $\phi_5$. |
| 2.5 | Clock generator circuit diagram. |
| 2.6 | Passive mixer. (a) Direct implementation. (b) Improved implementation. |
| 2.7 | Power spectrum of a PRBS in: (a) the continuous-time domain and (b) the discrete-time domain. |
| 2.8 | 1-bit DAC array for 3-bit resolution. |
| 2.9 | Comparator architecture. |
| 2.10 | Asynchronous SAR logic architecture. |
| 2.11 | Die photo of the fabricated CS SAR ADC. |
| 2.12 | Power breakdown of the CS SAR ADC chip at 0.8 V and 1 MS/s in: (a) the Nyquist mode and (b) the CS mode. |
| 2.13 | Measured static performance: (a) DNL and (b) INL. |
| 2.14 | Measured output spectra with a -3 dBFS Nyquist-rate input. |
| 2.15 | Measured SNDR/SFDR trends with: (a) different input frequencies and (b) different input amplitudes. |
| 2.16 | … with different reconstruction algorithms. |
| 2.17 | Time and frequency domain comparisons of the discrete-tone signals (in black) and the corresponding SL0-reconstructed signals (in red) with: (a) $K/2 = 1$ (b) $K/2 = 12$. |
| 2.18 | Comparisons of the 1-second long speech signal $\vec{s}$ (in blue) and the corresponding SL0-reconstructed signal $\vec{s}^*$ (in red) in: (a) the time domain and (b) the frequency domain. The error signal (in blue) is $\vec{s} - \vec{s}^*$ in the time domain. |
| 3.1 | Conventional multi-channel A/D conversion with fewer ADCs: (a) time multiplexing, (b) frequency multiplexing, and (c) Walsh-Hadamard coding. |
| 3.2 | Proposed system architecture with $M$ channels. |
| 3.3 | Under-determined equation in the proposed CS framework. |
| 3.4 | Proposed circuit diagram of a CS-based 12-bit 4-channel SAR ADC. The switches controlled by the sampling clock, $\phi_1$, the calibration signal, $cal$, and the digital outputs, $d<11:0>$, are labeled in green, blue and red, respectively. |
| 3.5 | Proposed switching technique for a 3-bit SAR ADC. |
| 3.6 | Calibration steps for the largest capacitor $2^9C$. |
| 3.7 | DAC configuration for 4-channel sampling. |
| 3.8 | Proposed 12-bit 4-channel CS SAR ADC timing diagram. |
| 3.9 | Synchronous SAR logic architecture. |
| 3.10 | Chip die photo and layout. |
| 3.11 | Measured ADC performance without PRBS. (a) SFDR & SNDR vs. input frequency, and (b) SFDR & SNDR vs. input amplitude. |
| 3.12 | Measured output spectra when inputting a 100.016 kHz, 200.016 kHz, 300.016 kHz, and 400.016 kHz -3 dBFS sinusoidal wave to each channel, respectively. |
| 3.13 | Power breakdown. |
| 3.14 | Test bench diagram. |
| 3.15 | Measured post-reconstruction SNDR with different total channel occupancies and reconstruction algorithms when all channel signals are independent. |
| 3.16 | Measured time-domain (left) and frequency-domain (right) results of the input signals $s_m$ (blue) and reconstructed signals $\hat{s}_m$ (red) via the SL0 method in the single-tone case (upper) and 26-tone case (lower). |
| 3.17 | Measured post-reconstruction SNDR with OMP and SOMP when all channel signals are highly correlated. |
| 3.18 | Measured time-domain results of 3 Frank lead ECG signals via the SOMP method. $s_m$ (blue) are the input signals, $\hat{s}_m$ (red) are the reconstructed signals, and $\hat{s}_m - s_m$ (black) are their differences. |
| 3.19 | Measured frequency-domain results of 3 Frank lead ECG signals via the SOMP method. $s_m$ (blue) are the input signals, and $\hat{s}_m$ (red) are the reconstructed signals. |
| 3.20 | Measured time-domain (left) and frequency-domain (right) results of the 4-channel 1s-long speech signals via the SL0 method. $s_m$ (blue) are the input signals, and $\hat{s}_m$ (red) are the reconstructed signals. |
| 4.1 | Noise shaping SAR ADC architecture proposed in [Fredenburg and Flynn [2012]]. |
| 4.2 | Noise shaping SAR ADC architecture proposed in [Chen et al. [2015]]. |
| 4.3 | NTF magnitude comparisons with zeros at 0.5, 0.64 and 0.75. |
| 4.4 | Proposed $1^{st}$-order NS SAR ADC architecture. |
| 4.5 | General signal flow diagram of the proposed $1^{st}$-order NS SAR ADC assuming $C_1 = C_3 = C$, $C_2 = a/(1-a)C$, and the integration path gain of $a$. |
| 4.6 | Non-ideal effects in the proposed $1^{st}$-order NS SAR ADC. |
| 4.7 | Chip die photo and layout. |
| 4.8 | Power breakdown. |
| 4.9 | Measured output spectra. |
| 4.10 | Measured SNR/SNDR with different input amplitudes. |
| 4.11 | With different OSRs: (a) Measured SNDR and (b) Schreier FoM. |
| 4.12 | Proposed $2^{nd}$-order NS SAR ADC architecture. |
| 4.13 | General signal flow diagram of the proposed $1^{st}$-order NS SAR ADC assuming $C_1 = C_3 = C_4 = C$, $C_2 = a/(1-a)C$, the $1^{st}$ integration path gain of $q_1$ and the $2^{nd}$ integration path gain of $q_2$. |
| 4.14 | Non-ideal effects in the proposed $2^{nd}$-order NS SAR ADC. |
| 4.15 | Power breakdown. |
| 4.16 | Simulated output spectra. |
| 4.17 | With different OSRs: (a) Simulated SNDR and (b) Schreier FoM and Walden FoM. |

# **Chapter 1**

## Introduction

An analog-to-digital converter (ADC) is an electronic integrated circuit that transforms a signal from analog to digital form. Since most real-world signals are analog, the ADC is an essential component of modern electronic devices, linking the analog world of transducers to the digital world of signal processing and data handling.

To define the performance of an ADC, there are two widely used figures of merit (FoM). One is the Walden FoM calculated as,

$$FoM_W = \frac{P}{2^{ENOB} \times 2 \times BW},\tag{1.1}$$

where $P$ is the ADC power consumption, $ENOB$ is the effective number of bits of the ADC, and $BW$ is the signal bandwidth. The lower the $FoM_W$, the better the ADC performance. The other is the Schreier FoM, calculated as

$$FoM_S = SNDR + 10\log_{10}\frac{BW}{P},\tag{1.2}$$

where SNDR is the signal-to-noise-and-distortion ratio in dB, which can be translated into ENOB as

$$ENOB = \frac{SNDR - 1.76}{6.02}.\tag{1.3}$$

Figure 1.1: ADC performance comparisons in terms of different architectures.

In contrast to $FoM_W$, a higher $FoM_S$ means better ADC performance.
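As a numerical illustration of (1.1)-(1.3), the following minimal sketch evaluates both figures of merit for one operating point. The function names and the operating point (100 µW, 74 dB SNDR, 500 kHz bandwidth) are hypothetical and are not taken from the measured chips. The sketch also checks a standard scaling case: gaining one bit (6.02 dB) at four times the power leaves $FoM_S$ essentially unchanged but doubles $FoM_W$.

```python
import math


def enob_from_sndr(sndr_db):
    """Eq. (1.3): effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, bw_hz):
    """Eq. (1.1): Walden FoM in joules per conversion-step (lower is better)."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * 2 * bw_hz)


def schreier_fom(power_w, sndr_db, bw_hz):
    """Eq. (1.2): Schreier FoM in dB (higher is better)."""
    return sndr_db + 10 * math.log10(bw_hz / power_w)


# Hypothetical operating point, chosen only for illustration.
P_BASE, SNDR_BASE, BW = 100e-6, 74.0, 500e3
fom_w_base = walden_fom(P_BASE, SNDR_BASE, BW)
fom_s_base = schreier_fom(P_BASE, SNDR_BASE, BW)

# One extra bit (6.02 dB) bought with four times the power.
fom_w_bit = walden_fom(4 * P_BASE, SNDR_BASE + 6.02, BW)
fom_s_bit = schreier_fom(4 * P_BASE, SNDR_BASE + 6.02, BW)

print(f"base:   FoM_W = {fom_w_base * 1e15:.1f} fJ/step, FoM_S = {fom_s_base:.2f} dB")
print(f"+1 bit: FoM_W = {fom_w_bit * 1e15:.1f} fJ/step, FoM_S = {fom_s_bit:.2f} dB")
```

The two metrics therefore rank the same pair of designs differently, depending on how much power a bit of resolution is assumed to cost.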

Based on (1.1) and (1.2), when the ADC power is doubled, the SNDR must improve by 6 dB (one bit) to keep $FoM_W$ constant, but only by 3 dB to keep $FoM_S$ constant. Therefore, $FoM_W$ is more sensitive to power than $FoM_S$. For thermal-noise-limited designs, whose power must quadruple for every additional bit of SNR, $FoM_S$ is the fairer metric. Although $FoM_W$ and $FoM_S$ weigh them differently, both are determined by three factors: ADC power, ENOB, and the signal bandwidth. In the literature, various types of ADC architectures give different performance, and the optimum choice of an ADC depends on the target application. [de la Rosa et al. [2015]] summarizes the performance of state-of-the-art ADC works by architecture, as shown in Figure 1.1. As can be seen, successive approximation register (SAR) ADCs achieve the best power efficiency at low/medium resolution. The main reason is that most circuits of a SAR ADC are digital, making it very amenable to technology scaling. Therefore, as the feature sizes of CMOS devices scale down, SAR ADCs are becoming more and more popular in low-power applications such as wireless sensors and low-energy radios [Verma and Chandrakasan [2007]; Harpe et al. [2011]].

Figure 1.2: SAR ADC architecture.

As shown in Figure 1.2, a SAR ADC mainly consists of four blocks:

- 1. A sample-and-hold (S/H) circuit that acquires the input voltage $V_{in}$.
- 2. A switched-capacitor (SC) digital-to-analog converter (DAC) array that outputs internal DAC voltages $V_{DAC}$ to successively approximate $V_{in}$ in a binary search fashion.
- 3. A comparator that compares $V_{in}$ with $V_{DAC}$.
- 4. A SAR logic circuit that stores the comparator results and feeds them back to the DAC array.
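The interaction of the four blocks above can be sketched behaviorally as a binary search. This is an idealized single-ended model with a perfect DAC and a noiseless comparator; the function name `sar_convert` and its parameters are hypothetical and do not describe the switched-capacitor circuits used in the fabricated chips.

```python
def sar_convert(v_in, v_ref=1.0, n_bits=4):
    """Idealized N-bit SAR conversion of a held input v_in in [0, v_ref).

    Returns the output code and the list of DAC voltages tried, one per
    comparison cycle (MSB first).
    """
    code = 0
    dac_trials = []
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                # SAR logic: tentatively set this bit
        v_dac = v_ref * trial / (1 << n_bits)    # ideal DAC output for the trial code
        dac_trials.append(v_dac)
        if v_in >= v_dac:                        # comparator decision
            code = trial                         # keep the bit, otherwise it stays 0
    return code, dac_trials


code, dac_trials = sar_convert(0.6, v_ref=1.0, n_bits=4)
print(f"code = {code:04b}, DAC voltages tried = {dac_trials}")
```

Each loop iteration corresponds to one comparison cycle, so an $N$-bit conversion takes $N$ sequential decisions in this model.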

In a practical implementation, the S/H circuit can be embedded into the SC DAC array. Although the SAR ADC architecture is simple and power-efficient, it still faces the following bottlenecks:

- 1. An *N*-bit SAR ADC generally requires *N* comparison cycles, and the next cycle cannot start until the current cycle finishes. Figure 1.3 shows an example of a 4-bit SAR conversion. Therefore, it is very difficult for a SAR ADC to achieve both high speed and high resolution.
- 2. A SAR ADC requires all the sub-blocks to be as accurate as the target resolution. As the target resolution goes beyond 10 bits, its power efficiency quickly diminishes due to the tight requirement on comparator noise. Moreover, the exponentially growing capacitor DAC array not only costs large chip area and power, but is also difficult to drive.
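The scaling behind both bottlenecks can be sketched with a few lines of arithmetic. The sketch assumes a conventional binary-weighted capacitor array with one dummy unit capacitor, which is a textbook arrangement and not necessarily the DAC arrangement used in later chapters; `binary_weighted_caps` is a hypothetical helper.

```python
def binary_weighted_caps(n_bits):
    """Unit-capacitor weights of a conventional binary-weighted DAC,
    MSB first, followed by one dummy unit capacitor (assumed arrangement)."""
    return [1 << k for k in range(n_bits - 1, -1, -1)] + [1]


for n_bits in (8, 10, 12):
    total_units = sum(binary_weighted_caps(n_bits))
    print(f"{n_bits}-bit: {n_bits} comparison cycles, {total_units} unit capacitors")
```

Under this assumption the number of comparison cycles grows linearly with the resolution, while the total capacitance doubles with every additional bit.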

Figure 1.3: 4-bit SAR conversion.

The goal of this dissertation is to propose novel techniques to address the issues above. Two main techniques are proposed here (Figure 1.1). One is to embed compressive sensing (CS) into a SAR ADC to reduce the ADC sampling rate below the Nyquist rate, thus relaxing the speed and power requirements of a SAR ADC for a fixed signal bandwidth. CS is a recently emerging sampling paradigm for sparse signals [Candes2006, Donoho2006]. Unlike the conventional Shannon-Nyquist theorem, which requires the ADC sampling rate to be at least twice the signal bandwidth (the Nyquist rate), CS states that the ADC sampling rate should be determined by the information rate of a signal rather than by its maximum frequency. Therefore, for a sparse signal whose spectrum does not occupy the whole bandwidth, the ADC sampling rate can be greatly reduced. As a matter of fact, most natural signals, including audio, image, and biological signals, are sparse in some domain [Plumbley et al. [2010]; Elad and Aharon [2006]; Allstot et al. [2010]], implying that the proposed CS-SAR ADC can be used extensively in low-power sensing applications. Although CS has been exploited in prior works to reduce the ADC sampling rate, most of them require a dedicated CS encoder consisting of analog multipliers and active integrators, which limits the total power and area savings brought by CS [Kirolos et al. [2006]; Laska et al. [2007]; Ragheb et al. [2008]; Tropp et al. [2010]; Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]; Gangopadhyay et al. [2014]; Mishali and Eldar [2010]; Mishali et al. [2011]]. By contrast, the proposed technique does not require a separate CS encoder, but directly embeds CS into a conventional SAR ADC with minor additional hardware cost.

The other technique is to embed noise shaping (NS) into a SAR ADC so that a low-resolution ADC can achieve a high ENOB. Although NS lays the foundation for conventional $\Delta\Sigma$ ADCs, its realization in a conventional $\Delta\Sigma$ ADC requires OTA-based integrators, which are power hungry and scaling unfriendly [Kim et al. [2008]; Pun et al. [2007]]. This has motivated the development of voltage-controlled oscillator (VCO)-based $\Delta\Sigma$ ADCs that use VCOs as integrators [Rao et al. [2014]; Park and Perrott [2009]]. However, a VCO performs a voltage-to-phase/frequency conversion, which is nonlinear and sensitive to process-voltage-temperature (PVT) variations. By contrast, the proposed NS-SAR ADC uses passive integrators that require only one switch and two capacitors. The NS performance relies on the matching between these two capacitors, and thus is immune to any PVT variation.

The common merits of the two proposed techniques are that they are simple and fully passive. Only minor modifications are made to a conventional SAR ADC so that the power efficiency of a SAR ADC is kept while its performance is highly improved. To validate the effectiveness of the proposed techniques, three chips are fabricated in a $0.13\mu$m CMOS process. Chip 1 is a single-channel CS-SAR ADC for low-power wireless sensors. Chip 2 is an extension of Chip 1 that uses a single CS-SAR ADC for simultaneous four-channel signal acquisition. Chip 3 is a 1<sup>st</sup>-order NS-SAR ADC which can reach both high resolution and low power. It can also be easily extended to 2<sup>nd</sup>-order noise shaping, further improving the ADC performance. The next three chapters focus on the details of each chip, respectively. Chapter 5 concludes the dissertation.

## **Chapter 2**

## **Single-Channel Compressive Sensing SAR ADC**

### 2.1 Background

Wireless sensor network (WSN) technologies are becoming ubiquitous in modern society. Since 2011, the number of interconnected devices on the planet has overtaken the actual number of people [Gubbi et al. [2013]]. The proliferation of various sensing devices results in the generation of enormous amounts of data which need to be stored, processed and communicated in a highly efficient way.

In the past, nearly all signal acquisition protocols have been dictated by the Shannon-Nyquist sampling theorem: the sampling rate must be at least twice the signal bandwidth (the Nyquist rate). Nevertheless, Nyquist-rate sampling is not an efficient way to capture a sparse signal, for the information rate of a sparse signal can be much smaller than suggested by its bandwidth. As a matter of fact, most natural signals are sparse or compressible in a certain domain, including audio [Plumbley et al. [2010]], image [Elad and Aharon [2006]], and biological [Allstot et al. [2010]] signals. Specifically, audio signals generated by resonant systems mainly consist of a small number of frequency components, allowing a sparse representation in the frequency domain. Biological signals are typically concentrated in time, allowing a sparse representation either directly in the time domain or in the wavelet domain.



Figure 2.1: Sparse signal acquisition systems. (a) Nyquist-rate ADC with subsequent digital compression. (b) Feature extraction with subsequent low-rate ADC. (c) CS encoder with subsequent low-rate ADC. (d) Proposed CS SAR ADC.

To provide a more efficient sampling paradigm for these signals, a groundbreaking theory called Compressive Sensing (CS) was proposed by [Candes et al. [2006]] and [Donoho [2006]], stating that sparse signals can be precisely recovered from far fewer samples or measurements than the Nyquist-rate. This implies a potential of dramatically relaxing the requirements of speed, power, and memory in a sparse signal acquisition system.

CS operates very differently from conventional sparse signal acquisition techniques. As Fig. 2.1(a) shows, for general sparse-signal applications, data compression is usually conducted in the digital domain, which still requires a front-end analog-to-digital converter (ADC) to run at least at the Nyquist rate. For specific applications which are not interested in the entire signal information, such as neural spike detection [Karkare et al. [2011]; Verma et al. [2010]], feature extraction techniques can be used to reduce the ADC sampling rate so that only enhanced features are sampled from the signal (Fig. 2.1(b)). However, this approach is application-specific, and the information loss highly depends on prior knowledge of the signal. In contrast, by directly correlating the signal with a small set of random waveforms through a CS encoder as shown in Fig. 2.1(c), CS is able to compress the signal into a small number of random linear measurements without information loss. What is more remarkable is that this compression process is non-adaptive and may not need any prior knowledge of the signal at all. After a low-rate ADC digitizes the measurements, all that is needed is to use numerical optimization methods to recover the full-length signal from the small number of measurements. Since the required number of measurements is proven to be proportional to the information rate of the signal, data conversion in this scenario is usually called analog-to-information conversion (AIC) [Verhelst and Bahai [2015]]. Considering the tight power constraint at wireless sensor nodes, the compressed data can be first stored locally and then recovered after connecting the sensor to a back-end digital signal processor (DSP) where the power constraint is relaxed. The data can also be wirelessly transmitted to the cloud for recovery.

With the maturity of the CS theory, more and more circuit designers are attracted to bringing it into practical use and implementing it on actual hardware [Kirolos et al. [2006]; Laska et al. [2007]; Ragheb et al. [2008]; Tropp et al. [2010]; Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]; Gangopadhyay et al. [2014]; Mishali and Eldar [2010]; Mishali et al. [2011]; Wakin et al. [2012]; Trakimas et al. [2013]; Guo et al. [2013, 2015]]. Although prior prototype works successfully reduce the sampling rate of the ADC, the implementation of the CS encoder usually involves active amplifiers for continuous-time integration (or low-pass filtering) [Kirolos et al. [2006]; Laska et al. [2007]; Ragheb et al. [2008]; Tropp et al. [2010]; Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]; Gangopadhyay et al. [2014]; Mishali and Eldar [2010]; Mishali et al. [2011]]. Since the noise and headroom requirements for amplifiers are still constrained by the signal bandwidth, the total power savings at the front end are limited. Furthermore, most prior works also require the signal to go through multiple parallel paths, each of which consists of a CS encoder and an ADC, and thus occupy a large area [Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]; Gangopadhyay et al. [2014]; Mishali and Eldar [2010]; Mishali et al. [2011]]. [Chen et al. [2012b]] even argues that a Nyquist-rate ADC plus digital CS encoders is a more energy-efficient CS framework than multiple analog CS encoders plus low-rate ADCs, for the reason that analog integrators can be replaced by digital accumulators. However, this deviates from the true purpose of CS, which is to only sense compressed data.

As a consequence, an efficient analog CS framework without analog integrators is indispensable to make the fullest use of CS. As shown in Fig. 2.1(d), different from previous architectures which separate CS encoders from ADCs, we propose to embed CS into a conventional successive approximation register (SAR) ADC architecture to form a fully-passive CS SAR ADC. The CS SAR ADC realizes an analog CS framework in discrete time instead of continuous time. Therefore, continuous-time integration changes to discrete-time summation, which can be easily accomplished on-the-fly by charge sharing in the switched-capacitor (SC) sampling network of a SAR ADC. The theoretical background behind the CS SAR ADC is the random demodulator architecture proposed by [Kirolos et al. [2006]]. Its advantage is that the signal only needs to pass through a single-path CS encoder, which means a single CS SAR ADC is sufficient here, greatly saving area and power compared to prior multi-channel CS works. Although it is difficult for a single CS SAR ADC to run beyond GHz, it is a much more area- and power-efficient architecture for natural signal acquisition, which is mainly at low or medium speed.

This chapter is organized as follows. First, the CS theory is briefly introduced. Second, the mainstream CS frameworks are reviewed. Then we propose the CS SAR ADC architecture. To validate its effectiveness, a 12-bit 1MS/s CS SAR ADC is designed and fabricated in a  $0.13\mu$ m CMOS process. Its detailed circuit implementation is presented next. The chapter concludes with the chip measurement results and performance comparisons with state-of-the-art works.

### 2.2 Compressive Sensing Theory

Although CS involves various subdisciplines within the applied mathematical sciences [Candes and Wakin [2008]], this section intends to plainly review its three key concepts: sparsity, which pertains to the signals of interest; incoherence, which pertains to the sensing modality; and reconstruction, which pertains to the signal recovery. To keep the concepts simple, all the signals hereafter are denoted as discrete-time vectors.

#### 2.2.1 Sparsity

Suppose we have an input vector  $\vec{s} \in R^N$  which can be expanded over an  $N \times N$  orthonormal matrix,  $\Psi = [\vec{\psi_1}, \vec{\psi_2}, \cdots, \vec{\psi_N}]$  as follows,

$$\vec{s} = \Psi \vec{\alpha} = \sum_{i=1}^{N} \alpha_i \vec{\psi}_i, \tag{2.1}$$

where  $\vec{\alpha} \in \mathbb{R}^N$  is the coefficient vector for  $\vec{s}$  in  $\Psi$  domain. When  $\vec{\alpha}$  only contains  $K \ll N$  non-zero entries,  $\vec{s}$  is defined as a K-sparse signal in  $\Psi$  domain. In a more practical case, if we keep the largest K entries in  $\vec{\alpha}$  and zero the rest N - K entries to make a vector of  $\vec{\alpha}_K$ ,  $\vec{s}$  can be approximated as a K-sparse signal,  $\vec{s}_K = \Psi \vec{\alpha}_K$  when  $\|\vec{s} - \vec{s}_K\|_{\ell_2}$  is negligible. As mentioned previously, the knowledge of  $\Psi$  depends on the target signals. For example, if the signal is sparse in the time/frequency/wavelet domain,  $\Psi$  can be an Identity/Inverse Discrete Fourier Transform (IDFT)/Inverse Discrete Wavelet Transform (IDWT) matrix.
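The K-sparse approximation described above can be illustrated with a short Python sketch. It keeps the K largest-magnitude coefficients of a signal in the frequency domain and zeros the rest. The two-tone test signal, the helper name `k_sparse_approx`, and the choice of a unitary DFT for $\Psi^{-1}$ are assumptions made only for illustration; they are not data or code from this work.

```python
import numpy as np

def k_sparse_approx(s, K):
    """Keep the K largest-magnitude DFT coefficients of s and zero the rest."""
    N = len(s)
    # alpha = Psi^{-1} s, with Psi taken as the unitary IDFT matrix (assumption).
    alpha = np.fft.fft(s) / np.sqrt(N)
    keep = np.argsort(np.abs(alpha))[-K:]      # indices of the K largest entries
    alpha_K = np.zeros_like(alpha)
    alpha_K[keep] = alpha[keep]
    s_K = np.fft.ifft(alpha_K) * np.sqrt(N)    # s_K = Psi alpha_K
    return alpha_K, s_K

N = 256
n = np.arange(N)
rng = np.random.default_rng(0)
# Hypothetical test signal: two on-bin tones plus a small amount of noise.
s = (np.cos(2 * np.pi * 10 * n / N)
     + 0.5 * np.sin(2 * np.pi * 37 * n / N)
     + 1e-3 * rng.standard_normal(N))

alpha_K, s_K = k_sparse_approx(s, K=4)
rel_err = np.linalg.norm(s - s_K.real) / np.linalg.norm(s)
print(f"non-zero coefficients: {np.count_nonzero(alpha_K)}, relative error: {rel_err:.2e}")
```

A real two-tone signal occupies four DFT bins (two per tone), so K = 4 captures almost all of its energy and the residual is set by the added noise.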

#### 2.2.2 Incoherence

To compress $\vec{s}$ into a small number of measurements $\vec{y} \in R^M$, an $M \times N$ sensing matrix, $\Phi = [\vec{\phi}_1; \vec{\phi}_2; \cdots; \vec{\phi}_M]$, is needed for a projection as follows,

$$\vec{y} = \Phi \vec{s} = \Phi \Psi \vec{\alpha}. \tag{2.2}$$

Since the locations of the key information in $\vec{\alpha}$ are unknown, this projection should ensure that $\vec{y}$ keeps all the information from $\vec{\alpha}$. To make this possible, the sensing matrix $\Phi$ is required to have a low coherence with the sparse representation matrix $\Psi$. Their coherence is defined as follows,

$$\mu(\Phi, \Psi) = \sqrt{N} \max_{m,n} \left| \left\langle \vec{\phi}_m, \vec{\psi}_n \right\rangle \right|, \qquad (2.3)$$

where  $1 \le m \le M$  and  $1 \le n \le N$ . As can be seen from (2.3), the coherence measures the largest correlation between any row vector of  $\Phi$  and any column vector of  $\Psi$ . The more correlated  $\Phi$  and  $\Psi$  are, the larger  $\mu(\Phi, \Psi)$  is. It can be derived that  $\mu \in [1, \sqrt{N}]$ .

CS requires a low-coherence pair to ensure its robustness. Intuitively speaking, the sensing matrix $\Phi$ projects $\vec{s}$ into a domain where $\vec{s}$ is not sparse, so as to spread out all the key information. If $\Phi$ is maximally incoherent with $\Psi$, all the entries in $A = \Phi \Psi$ will have the same amplitude of $1/\sqrt{N}$. Therefore, each measurement in $\vec{y}$ will contain equally contributed information from all the entries of $\vec{\alpha}$. The key information in $\vec{\alpha}$ is totally spread out and no information is missed. If $\vec{\phi}_m$ is correlated with $\vec{\psi}_n$, the amplitude of $A_{m,n}$ will be larger than that of other entries. Therefore, the $m^{th}$ measurement $y_m$ will carry a larger weight from the $n^{th}$ entry $\alpha_n$, which may cause a misinterpretation if $\alpha_n$ is not the key information.

Fortunately, the selection of $\Phi$ is not a difficult problem in practice. It has been proven that random matrices with independent identically distributed (i.i.d.) entries, such as Gaussian or $\pm 1$ binary entries, exhibit a very low coherence with any fixed representation matrix such as IDFT and IDWT [Candes et al. [2006]; Donoho [2006]]. The i.i.d. $\pm 1$ binary matrix (Bernoulli matrix) attracts the most attention from the research community, for the generation of pseudo-random $\pm 1$ binary sequences (PRBS) and the corresponding projection can be easily implemented in hardware.
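To make the coherence definition in (2.3) concrete, the following Python sketch evaluates $\mu(\Phi, \Psi)$ for three pairs: a random $\pm 1$ matrix against the IDFT basis, identity rows against the IDFT basis, and identity rows against the identity basis. The sizes N = 64 and M = 16, the random seed, and the helper name `coherence` are illustrative assumptions. The rows of $\Phi$ and the columns of $\Psi$ are normalized to unit length, which is assumed here so that the result falls in the range $[1, \sqrt{N}]$ stated above.

```python
import numpy as np

def coherence(Phi, Psi):
    """mu(Phi, Psi) = sqrt(N) * max |<phi_m, psi_n>| with unit-norm rows/columns."""
    N = Psi.shape[0]
    Phi_n = Phi / np.linalg.norm(Phi, axis=1, keepdims=True)   # normalize rows
    Psi_n = Psi / np.linalg.norm(Psi, axis=0, keepdims=True)   # normalize columns
    return np.sqrt(N) * np.max(np.abs(Phi_n @ Psi_n))

N, M = 64, 16
rng = np.random.default_rng(1)

idft = np.conj(np.fft.fft(np.eye(N))) / np.sqrt(N)   # unitary IDFT matrix
bernoulli = rng.choice([-1.0, 1.0], size=(M, N))     # i.i.d. +/-1 sensing matrix
spikes = np.eye(N)[:M]                               # first M rows of the identity

mu_random = coherence(bernoulli, idft)
mu_spike_fourier = coherence(spikes, idft)       # maximally incoherent pair
mu_spike_spike = coherence(spikes, np.eye(N))    # maximally coherent pair

print(f"Bernoulli vs IDFT: {mu_random:.2f}")
print(f"spikes vs IDFT:    {mu_spike_fourier:.2f}")
print(f"spikes vs spikes:  {mu_spike_spike:.2f}")
```

The last two cases hit the two ends of the range (1 and $\sqrt{N}$), while the random binary matrix lands in between, much closer to the lower end for typical draws.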

#### 2.2.3 Reconstruction

After acquiring the measurements  $\vec{y}$ , the next step is to recover the input signal  $\vec{s}$  based on (2.2). Since M < N, (2.2) is an under-determined system which can have many solutions. However, with the knowledge that  $\vec{\alpha}$  is sparse, there is a high probability that the sparsest solution is the correct solution. Once  $\vec{\alpha}$  is solved,  $\vec{s}$  can be recovered using (2.1). A common method to find a sparse solution to an under-determined system is the traditional  $\ell_1$  minimization method which can be summarized as a convex optimization problem:

$$\min_{\vec{\alpha}\in R^N} \|\vec{\alpha}\|_{\ell_1} \quad \text{subject to} \quad \vec{y} = \Phi \Psi \vec{\alpha}. \tag{2.4}$$

In real implementation,  $\vec{y}$  is the output of an ADC which contains quantization noise and thermal noise. For a noisy  $\vec{y}$ , (2.4) is modified to a problem with relaxed constraints:

$$\min_{\vec{\alpha}\in R^N} \|\vec{\alpha}\|_{\ell_1} \quad \text{subject to} \quad \|\vec{y} - \Phi\Psi\vec{\alpha}\|_{\ell_2} \le \epsilon. \tag{2.5}$$

Problem (2.5) is often called the LASSO after [Tibshirani [1994]]. Provided that  $\Phi$  is a random matrix with i.i.d. entries such as Gaussian or  $\pm 1$  binary entries, an exact solution to (2.5) with overwhelming probability requires,

$$M \ge CK \log(N/K),\tag{2.6}$$

where C is some constant related to $\mu(\Phi, \Psi)$. Many researchers have reported that $M = 4 \times K$ is a reasonable empirical estimate. Since K represents the information rate of a sparse signal, this lower bound shows that in a CS framework the ADC sampling rate is determined by the information rate rather than the signal bandwidth. [Candes and Wakin [2008]] further shows that the solution $\vec{\alpha}^*$ to (2.5) obeys,

$$\|\vec{\alpha}^* - \vec{\alpha}\|_{\ell_2} \le C_0 \|\vec{\alpha} - \vec{\alpha}_K\|_{\ell_1} / \sqrt{K} + C_1 \epsilon, \tag{2.7}$$

where  $C_0$  and  $C_1$  are some constants depending on each instance. As can be seen, the reconstruction error is bounded by the summation of two terms. The first term comes from the source itself and the second term comes from the measurement errors.

Besides $\ell_1$ convex optimization approaches, greedy methods are another common class of algorithms to recover sparse solutions. Greedy methods generally have a lower computational complexity than $\ell_1$ convex optimization at the cost of robustness and accuracy [Kim et al. [2012]]. Since the implementation of the reconstruction block is beyond the scope of this work, we implement it off-chip using MATLAB on a PC. A natural question is how complex the nonlinear reconstruction block is, and how much power and area are needed if it is implemented on-chip. This is still an active research area, and the answer depends on many factors such as the choice of the algorithm, the target performance, and the signal sparsity. Several research groups have shown that for a moderate post-reconstruction performance, the reconstruction block can be implemented on-chip with reasonable power consumption [Luo et al. [2012]; Xu et al. [2014]]. In addition, because of the purely digital nature of the reconstruction algorithm, the power and area cost will keep shrinking with process scaling. In general, CS can be viewed as an asymmetric compression scheme with economical encoding but relatively expensive recovery. Thus, CS is well suited for WSN applications, where front-end sensors are highly constrained in both energy and computational resources. Once the sensor signals are acquired and digitized, they can be saved or transmitted to a powerful back-end base station (or cloud) for digital signal processing without tight constraints on power and computational resources. The advantage of CS is that it automatically performs data compression at the front end, and thus its required data sensing and transmission rate is much lower than that of Nyquist-rate data acquisition.
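As a rough illustration of the greedy class of methods mentioned above, the sketch below implements orthogonal matching pursuit (OMP), one common greedy algorithm, in Python. It is not the $\ell_1$ solver of (2.5) and is not claimed to be the MATLAB routine used for the measurements in this work. The problem sizes (N = 512, M = N/4 = 128, K = 3), the unit-amplitude sparse vector, and $\Psi$ equal to the identity are all assumptions chosen to keep the example small and noise-free.

```python
import numpy as np

def omp(A, y, K):
    """Orthogonal matching pursuit: greedy K-sparse solution of y = A @ alpha."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(K):
        # Select the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    alpha = np.zeros(A.shape[1])
    alpha[support] = coef
    return alpha

rng = np.random.default_rng(2)
N, M, K = 512, 128, 3

# i.i.d. +/-1 sensing matrix with unit-norm columns; Psi is the identity here.
Phi = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)

alpha_true = np.zeros(N)
alpha_true[rng.choice(N, K, replace=False)] = rng.choice([-1.0, 1.0], size=K)

y = Phi @ alpha_true            # noise-free measurements, as in (2.2)
alpha_hat = omp(Phi, y, K)
err = np.linalg.norm(alpha_hat - alpha_true)
print(f"reconstruction error: {err:.2e}")
```

Each iteration costs one matrix-vector product and one small least-squares solve, which is the source of the lower complexity relative to a general convex solver.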

### 2.3 Compressive Sensing Frameworks

#### 2.3.1 State-of-The-Art

State-of-the-art CS frameworks can be mainly categorized into four classes: random demodulator (RD), random-modulation preintegrator (RMPI), modulated wideband converter (MWC), and non-uniform sampler (NUS). RD is the first proposed CS framework validated both in theory and hardware [Kirolos et al. [2006]; Laska et al. [2007]; Ragheb et al. [2008]; Tropp et al. [2010]]. As shown in Fig. 2.2(a), a random demodulator is composed of a mixer, a low-pass filter/integrator and a low-rate ADC. The basic principle is to demodulate the signal by multiplying it with a Nyquist-rate PRBS, which spreads the signal tones across the entire spectrum. Then the demodulated signal passes through a low-pass filter/integrator and a low-rate ADC samples it to capture the signal information. A prototype hardware of RD is implemented in [Ragheb et al. [2008]] by using discrete components to build an analog Gilbert mixer and an active Gm-C based integrator. To further reduce the ADC sampling rate and introduce more randomization in the sensing matrix $\Phi$, the RMPI architecture is proposed, which consists of a parallel bank of RDs driven by a common input (see Fig. 2.2(b)) [Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]; Gangopadhyay et al. [2014]]. However, compared to RD, RMPI not only consumes more area and power but also needs to address more issues such as synchronization among channels. MWC is another variant of RD, which has a very similar architecture to RMPI [Mishali and Eldar [2010]; Mishali et al. [2011]]. Nevertheless, MWC is used for blind acquisition of multi-band signals while RMPI deals with multi-tone signal settings. Therefore, signals are modeled and analyzed in a rather different fashion. A more detailed comparison can be found in [Lexa et al. [2012]]. As shown in Fig. 2.2(c), by directly using a PRBS to control the sampling of an ADC, NUS avoids the analog multiplication and the active integration existing in previous CS frameworks [Wakin et al. [2012]; Trakimas et al. [2013]]. This makes NUS an extremely simple and power-efficient architecture for CS. However, NUS has a strict requirement on the input signal. If a signal contains any short-duration pulse, NUS may not be able to detect it, thus limiting its use in many real applications.

Figure 2.2: State-of-the-art CS frameworks and proposed CS framework. (a) Random demodulator. (b) Random-modulation preintegrator/modulated wideband converter. (c) Non-uniform sampler. (d) Proposed CS framework.

#### 2.3.2 Proposed CS SAR ADC Architecture

Since the frequencies of most natural signals are on the order of kHz, the major constraints of WSNs for natural signals lie in area and power consumption rather than speed, meaning that RMPI and MWC are unnecessary. In addition, since NUS may result in information loss for many natural signals, RD becomes the best choice for these applications. Different from conventional RD architectures that require analog mixers and active integrators before a low-rate ADC [Kirolos et al. [2006]; Laska et al. [2007]; Ragheb et al. [2008]; Tropp et al. [2010]], we propose a fully-passive CS framework that directly embeds random demodulation into a conventional SAR ADC. As shown in Fig. 2.2(d), the proposed CS SAR ADC operates in discrete time rather than continuous time so that the continuous-time integration is replaced by a discrete-time summation. In the real implementation, both the multiplication and the summation are incorporated into the SC sampling network of a SAR ADC. In other words, the CS SAR ADC is a fully-passive, simple and power-efficient hardware realization of a random demodulator.

Figure 2.3: Circuit and timing diagram for the proposed 12-bit CS SAR ADC.

Fig. 2.3 shows the circuit and timing diagram for a 12-bit CS SAR ADC architecture. Although a single-ended version is shown here, the real design is differential. There are two major differences between a CS SAR ADC and a conventional SAR ADC. One is that the input signal $\vec{s}$ is multiplied with a PRBS $\vec{p}$ to become a randomized result $\vec{r}$ before being sampled. For a differential input signal, this multiplication is equivalent to changing the polarity of the signal, which can be easily implemented by four switches. The other difference is that quantization does not happen after every sampling but only once every four samplings. The four sampling cycles are denoted as $\phi_1 - \phi_4$ and the quantization cycle is denoted as $\phi_5$ in Fig. 2.3. $\phi_{1e} - \phi_{4e}$ are slightly advanced versions of $\phi_1 - \phi_4$ for bottom-plate sampling. Since the sampling power is usually negligible in a SAR ADC design, quantizing four times less often means that a CS SAR ADC can reduce the total power by almost four times compared to a conventional SAR ADC. As can be seen in Fig. 2.3, the SAR logic operates in an asynchronous fashion to further minimize quantization power [Chen and Brodersen [2006]; Yang et al. [2010]].

To explain the operation mechanism, the CS SAR capacitor array is divided into two segments: MSB, which consists of $C_1$-$C_4$, and LSB, which consists of $C_5$. Among them, $C_4$ and $C_5$ each represent a group of several capacitors for brevity of the figures. One redundant capacitor of 32C is included in $C_4$, which helps absorb the digital-to-analog converter (DAC) settling error [Murmann [2013]] and facilitates foreground calibration of MSB capacitor mismatches [Guo et al. [2013]]. The largest capacitor 512C is halved into $C_1$ and $C_2$ so that $C_1$-$C_4$ are all equal to 256C, which allows them to be used for four samplings with the same weight. Fig. 2.4 shows the DAC array configuration in $\phi_1$-$\phi_4$ and $\phi_5$, respectively. After the multiplication between $\vec{s}$ and $\vec{p}$, the CS SAR ADC operates as follows. The randomized result $\vec{r}$ is sampled onto $C_1$-$C_4$ consecutively from $\phi_1$ to $\phi_4$ in a bottom-plate sampling fashion [Quinn and van Roermund [2007]]. Once $\vec{r}$ is sampled, the value is held until $\phi_5$ goes high. Then all four sampled values are averaged on-the-fly and the asynchronous quantization starts. Note that $C_5$ keeps sampling 0 during $\phi_4$, which only causes a minor attenuation of the averaged result.

Figure 2.4: DAC array configuration of the proposed 12-bit CS SAR ADC in: (a) the sampling cycles $\phi_1$-$\phi_4$ and (b) the quantization cycle $\phi_5$.

Referring back to the CS theory, for an N-length input vector $\vec{s}$, the CS SAR ADC only outputs an M-length measurement vector $\vec{y}$, where M = N/4. The relationship between $\vec{s}$ and $\vec{y}$ forms an under-determined equation, which is shown in (2.8) with N = 8 as an example.

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} g_1p_1 & g_2p_2 & g_3p_3 & g_4p_4 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & g_1p_5 & g_2p_6 & g_3p_7 & g_4p_8 \end{bmatrix} \times \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \\ s_6 \\ s_7 \\ s_8 \end{bmatrix}. \tag{2.8}$$

Here $g_i$ represents the weight of each sampling, which is equal to $C_i / \sum_{j=1}^5 C_j$ where i = 1, 2, 3, 4. Without capacitor mismatch, all the $g_i$ are equal. Otherwise, they can be calibrated along with capacitor mismatch.
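The block-diagonal structure of (2.8) can be sketched in a few lines of Python. The helper name `cs_sar_matrix`, the example PRBS chips, and the numeric weight 0.24 are hypothetical values chosen for illustration; in particular, 0.24 only stands in for equal weights slightly below 1/4 (to mimic the attenuation caused by $C_5$) and is not a value taken from the design.

```python
import numpy as np

def cs_sar_matrix(prbs, weights):
    """Build the sensing matrix of (2.8): row m holds one group of
    len(weights) consecutive PRBS chips scaled by the sampling weights g_i."""
    prbs = np.asarray(prbs, dtype=float)
    g = np.asarray(weights, dtype=float)
    L = len(g)                       # samples averaged per conversion (4 here)
    M = len(prbs) // L               # number of measurements, M = N / L
    Phi = np.zeros((M, M * L))
    for m in range(M):
        Phi[m, m * L:(m + 1) * L] = g * prbs[m * L:(m + 1) * L]
    return Phi

# Hypothetical +/-1 chips p_1..p_8 and equal weights g_1..g_4 (assumed values).
p = [1, 1, -1, 1, -1, -1, 1, -1]
g = np.full(4, 0.24)

Phi = cs_sar_matrix(p, g)
s = np.arange(1.0, 9.0)              # example input s_1..s_8
y = Phi @ s                          # two measurements from eight samples
print(Phi)
print(y)
```

Because each row touches only four consecutive input samples, the matrix never needs to be stored in hardware: one PRBS chip per sampling cycle is enough to define it.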

### 2.4 Circuit Implementation and Analysis

The CS SAR ADC mainly consists of the following blocks: the clock generator, the mixer, the DAC array, the comparator and the asynchronous SAR logic. This section will introduce the detailed circuit design of each block one by one.

#### 2.4.1 Clock Generator

As shown in Fig. 2.5, the master clk is time-interleaved to generate $\phi_1$-$\phi_4$. A 4-stage shift register is initialized to [1,0,0,0] by a *reset* signal to start the phase counting. The shift register is triggered by the negative edge of clk so that the duty cycle of clk can completely pass through the transmission gates controlled by the shift register outputs $Q_1$-$Q_4$. With a *mode* signal switching the shift register input to $V_{dd}$, the CS SAR ADC can also work in a Nyquist-rate SAR mode for non-sparse signals. When all the flip-flops are loaded with 1, the phase counting stops and $\phi_1$-$\phi_4$ become the same as clk. In addition, the foreground calibration of DAC mismatches can also be conducted in this mode.

#### 2.4.2 Mixer

The mixer multiplies the signal with a PRBS, which essentially changes the polarity of the signal. Fig. 2.6(a) shows a direct passive implementation of the mixer using four switches. The problem with Fig. 2.6(a) is that the mixer is on the signal path to the whole DAC array. Large switches are required to pass the signal with high linearity. Fig. 2.6(b) proposes a better solution by combining the mixer switches with the sampling switches $\phi_1$-$\phi_4$ to make sure that each signal path only contains one switch. The additional hardware cost is two AND gates as shown in Fig. 2.6(b).

Figure 2.5: Clock generator circuit diagram.

Although previous CS works [Chen et al. [2011, 2012a]] also use four-switch passive mixers, they operate in the continuous-time domain. The power spectrum of a PRBS in the continuous-time domain is given by

$$\overline{P^2(f)} = \frac{2}{f_s} \left| \operatorname{sinc}\left(\frac{\pi f}{f_s}\right) \right|^2, \tag{2.9}$$

which is plotted in Fig. 2.7(a). As can be seen, the spectrum rolls off at high frequency, meaning that the signal tones at higher frequencies contribute less power to the randomized signal. Consequently, the reconstruction performance will become worse with increasing frequency, which has been demonstrated in the chip measurement results of [Chen et al. [2012a]]. By contrast, in our CS SAR ADC, the mixer essentially operates in the discrete-time domain, for the multiplied result is sampled before being averaged. If a PRBS has good correlation properties, the spectrum of the PRBS in the discrete-time domain should be nearly white, as shown in Fig. 2.7(b). Therefore, there is no performance degradation for high-frequency signal tones in our proposed CS SAR ADC architecture.

Figure 2.6: Passive mixer. (a) Direct implementation. (b) Improved implementation.

There are several classes of PRBSs which have been reported to have good correlation properties in communication studies, such as maximum-length, Gold, Kasami, and Hadamard sequences [Kim et al. [2012]]. The maximum-length sequences (M-sequences) are most widely used in previous CS works due to their simple generation. Although our design uses an external FPGA to feed a 1024-length M-sequence to the chip, the PRBS generation can be easily integrated on chip with negligible power and area consumption. For example, a $(2^m-1)$-length M-sequence only needs m flip-flops to implement. Another straightforward approach is to use an on-chip look-up table or a memory to store the PRBS. Even though the PRBS generation can be a limiting factor for RMPI or MWC architectures due to their requirement for many different PRBSs, it is not a problem here. Similar to the RD and NUS architectures, only one PRBS is required for a CS SAR ADC. The work of [Trakimas et al. [2013]] demonstrates that a 1024-length PRBS look-up table consumes less than 1/5 of the power and takes up 1/3 of the area of the core ADC.
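The statement that a $(2^m-1)$-length M-sequence needs only m flip-flops can be illustrated with a software model of a linear feedback shift register (LFSR). The sketch below is an assumption-laden example rather than the generator used in this design: the function name `m_sequence`, the seed, and the feedback taps (10, 7), a commonly tabulated maximal-length tap pair for a 10-stage register, are chosen for illustration. A 10-stage register yields 1023 chips; how the 1024-length sequence fed from the FPGA is derived is not specified here.

```python
def m_sequence(taps, m, seed=1):
    """Return one period (2**m - 1 chips) of a +/-1 M-sequence generated by an
    m-stage Fibonacci LFSR. `taps` lists the 1-indexed stages XORed for feedback."""
    state = [(seed >> i) & 1 for i in range(m)]   # m flip-flops
    if not any(state):
        raise ValueError("seed must be non-zero")
    chips = []
    for _ in range(2**m - 1):
        chips.append(1 if state[-1] else -1)      # output taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]           # shift by one stage
    return chips

prbs = m_sequence(taps=(10, 7), m=10)

# Two well-known M-sequence properties: near-perfect balance and a periodic
# autocorrelation of -1 at every non-zero lag.
balance = sum(prbs)
autocorr_lag1 = sum(prbs[i] * prbs[(i + 1) % len(prbs)] for i in range(len(prbs)))
print(len(prbs), balance, autocorr_lag1)
```

The flat off-peak autocorrelation is what makes the discrete-time spectrum of such a sequence nearly white.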

#### 2.4.3 DAC Array

By using the switching technique proposed in [Sanyal and Sun [2014]], the CS SAR ADC is able to use the 10-bit DAC array shown in Fig. 2.3 to realize 12-bit resolution, thus reducing the switching power by 4 times assuming the same unit capacitance. Fig. 2.8 shows a 1-bit DAC array example that gives 3-bit resolution. As can be seen, both sides of a differential DAC array are charged to an initial sequence $[gnd, V_{ref}]$. During every comparison cycle, only one side is switched to $V_{ref}/gnd$, which gives one more bit compared to the conventional SAR switching technique. The last unit capacitor uses $V_{cm}$ as another reference voltage to get the 3rd bit. The comparator input common-mode voltage $V_{cmi}$ finally converges to $V_{cm}$, which obviates the need for a special comparator.

Figure 2.7: Power spectrum of a PRBS in: (a) the continuous-time domain and (b) the discrete-time domain.



Figure 2.8: 1-bit DAC array for 3-bit resolution.

#### 2.4.4 Comparator

The comparator uses a strong-arm latch architecture as shown in Fig. 2.9. Since this architecture has no static biasing, the average power consumption is proportional to the conversion rate. Therefore, compared to the Nyquist-rate mode, the comparator power consumption is reduced by four times in the CS mode. Its simulated input-referred noise $\sigma$ is 258 $\mu$V and the offset $\sigma$ is 4.4 mV. Although the comparator offset is mostly static and will not affect the ADC linearity, it still needs to be compensated before reconstruction. Otherwise, it is turned into large noise by the reconstruction process. This phenomenon can be explained as,

$$\vec{y} = \Phi \vec{s} + \vec{V}_{os} = \Phi(\vec{s} + \Phi^+ \vec{V}_{os}), \qquad (2.10)$$

where  $\Phi^+$  is the pseudo-inverse matrix of  $\Phi$  and  $\vec{V}_{os}$  is the comparator offset. Since  $\Phi$  is a random matrix,  $\Phi^+$  can be easily proved to be a random matrix and thus  $\Phi^+\vec{V}_{os}$  becomes a noise portion of the input signal  $\vec{s}$ . Nevertheless, the multiplication between  $\vec{p}$  and  $\vec{s}$  inherently paves the way for an offset digital background calibration mechanism. By taking the mean of the ADC result, the offset can be easily estimated as,

$$V_{os} = \operatorname{mean}(\Phi\vec{s} + \vec{V}_{os}), \tag{2.11}$$

which will be subtracted from the measurement results  $\vec{y}$  before reconstruction.
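The mean-based offset estimate in (2.11) can be checked numerically. The following Python sketch is a behavioral model under several assumptions: an ideal equal weight of 1/4 per sample, an i.i.d. $\pm 1$ sequence in place of the actual PRBS, a hypothetical input with a DC term and a sine tone, and an offset equal to the simulated 4.4 mV $\sigma$ quoted above. Quantization and thermal noise are ignored.

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 2**20, 4                       # input length and samples per conversion
n = np.arange(N)

s = 0.2 + 0.3 * np.sin(2 * np.pi * 0.01 * n)   # hypothetical input with DC
p = rng.choice([-1.0, 1.0], size=N)            # random +/-1 chopping sequence
v_os = 4.4e-3                                  # static comparator offset (V)

# y = Phi s + V_os with equal weights of 1/4 (ideal, mismatch-free assumption).
y = 0.25 * (p * s).reshape(-1, L).sum(axis=1) + v_os

v_os_hat = y.mean()                   # offset estimate, as in (2.11)
y_cal = y - v_os_hat                  # offset-corrected measurements

print(f"true offset: {v_os * 1e3:.2f} mV, estimated: {v_os_hat * 1e3:.2f} mV")
```

The random sign flips push the mean of $\Phi\vec{s}$ toward zero even when the input has a DC component, so the average of the output converges to the offset. In this model the residual estimation error shrinks roughly with the square root of the number of measurements.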



Figure 2.9: Comparator architecture.

#### 2.4.5 Asynchronous SAR Logic

An *N*-bit synchronous SAR ADC relies on dividing a master clock into a signal tracking phase and *N* conversion phases. Since a CS SAR ADC also needs to divide the master clock into $\phi_1$-$\phi_5$, a synchronous implementation becomes much more complex. Therefore, we implement the SAR logic in an asynchronous fashion as shown in Fig. 2.10. Once the comparator finishes making a decision, a rdy signal is raised to trigger a sequencer which provides 13-phase clocks $sclk_1$-$sclk_{13}$. In a conventional asynchronous SAR ADC, the sequencer usually drives SR latches, switching logic and temporary bit caches to store the internal comparison results [Chen and Brodersen [2006]; Yang et al. [2010]]. In contrast, this work proposes to use strong-arm latches to store them, which greatly reduces the logic complexity. When $sclk_i$ is low, both outputs of the $i^{th}$ latch are reset to high. When $sclk_i$ is high, the $i^{th}$ latch makes a decision based on the current differential comparator outputs. Once the decision is made, the $i^{th}$-bit differential comparator results are stored in the $i^{th}$ latch until $sclk_i$ becomes low again. This operation exactly matches the switching scheme in Fig. 2.8. Therefore, strong-arm latches can be used here to directly drive the differential DAC. The rdy signal also triggers a delay line to self-clock the comparator. The delay line can be adjusted by external biasing voltages $V_{bp}$ and $V_{bn}$ to make sure that the DAC is completely settled before firing the comparator again.

Figure 2.10: Asynchronous SAR logic architecture.

### 2.5 Measurement Results

The proposed 12-bit CS SAR ADC is fabricated in a 0.13  $\mu$ m CMOS process, occupying an area of 0.2 mm<sup>2</sup>. Fig. 2.11 shows its die photo. The chip is designed at a power supply of 0.8 V and a sampling frequency of 1 MS/s. The total DAC capacitance is 2.1 pF × 2 with a unit capacitor of 2 fF. The DAC array is laid out in a segmented common-centroid way to minimize the capacitor mismatch due



Figure 2.11: Die photo of the fabricated CS SAR ADC.

to the parasitic capacitors of routing wires [Chen et al. [2014]]. The chip is tested in two modes: the Nyquist mode and the CS mode. In the Nyquist mode, the PRBS is set to be always 1, and the performance of the SAR ADC itself is measured. In the CS mode, the PRBS is generated by an external FPGA and fed into the chip after level shifting. Both discrete-tone signals and real-world speech signals are used to demonstrate the CS performance. We conclude this section with a performance comparison between this CS SAR ADC and state-of-the-art CS and ADC works.

### 2.5.1 Nyquist-Mode Measurement Results

This section demonstrates the measurement results of the CS SAR ADC operating in the Nyquist mode. At 0.8 V supply and 1 MS/s, the Nyquist-mode ADC consumes 19.2  $\mu$ W power, whose detailed breakdown is shown in Fig. 2.12(a).



Figure 2.12: Power breakdown of the CS SAR ADC chip at 0.8 V and 1 MS/s in: (a) the Nyquist mode and (b) the CS mode.



Figure 2.13: Measured static performance: (a) DNL and (b) INL.



Figure 2.14: Measured output spectra with a -3 dBFS Nyquist-rate input.

As can be seen, 58% of the power comes from the digital portion, which can be greatly reduced with technology scaling. To characterize the static performance, Fig. 2.13 shows the measured differential non-linearity (DNL) and integral non-linearity (INL) results. As can be seen, the worst INL and DNL errors are 1.6 LSB and 0.75 LSB, respectively. To characterize the dynamic performance, Fig. 2.14 shows the output spectra with a -3 dBFS Nyquist-rate input. The SNDR normalized to the full scale is 65.2 dB and the SFDR is 75.6 dB. The large harmonic tones are mainly due to the sampling switches, which have a low over-drive voltage at the 0.8 V supply; these tones diminish at low input frequencies. Fig. 2.15 further shows the measured SNDR/SFDR trends with different input frequencies and amplitudes. The peak SNDR is 65.5 dB. Combining all the metrics, the CS SAR ADC achieves a peak Walden figure-of-merit (FoM) of 12.5 fJ/conversion-step in the Nyquist mode.
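The Nyquist-mode FoM quoted above can be checked with a few lines of arithmetic. The sketch below assumes the standard conversion ENOB = (SNDR - 1.76)/6.02 and the usual Walden definition P/(2^ENOB × f_s); the function names are ours and purely illustrative.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR, using the (SNDR - 1.76) / 6.02 rule."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, sndr_db: float, fs_hz: float) -> float:
    """Walden FoM in joules per conversion step: P / (2^ENOB * fs)."""
    return power_w / (2.0 ** enob_from_sndr(sndr_db) * fs_hz)


# Nyquist-mode numbers quoted in the text: 19.2 uW, 65.5 dB peak SNDR, 1 MS/s.
fom_nyquist = walden_fom(power_w=19.2e-6, sndr_db=65.5, fs_hz=1e6)
print(f"ENOB = {enob_from_sndr(65.5):.2f} b, FoM = {fom_nyquist * 1e15:.1f} fJ/conversion-step")
```

With the measured power, peak SNDR and sampling rate, this reproduces the reported value of about 12.5 fJ/conversion-step.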



Figure 2.15: Measured SNDR/SFDR trends with: (a) different input frequencies and (b) different input amplitudes.

### 2.5.2 CS-Mode Measurement Results

This section demonstrates the measurement results of the CS SAR ADC operating in the CS mode. Fig. 2.12(b) shows the CS-mode power breakdown at 0.8 V and 1 MS/s. Since the CS SAR ADC only quantizes once every four-time sampling, the effective conversion rate is 250 kS/s. As can be seen, the reference

power and the comparator power scale down by exactly 4 times compared to the Nyquist-mode power in Fig. 2.12(a). The digital power scales down by slightly less than 4 times because the clock generation power does not scale. Therefore, the digital portion increases to 60% of the total power.

First, discrete-tone signals consisting of multiple sinusoids at different frequencies are used to measure the maximum sparsity K that the CS SAR ADC can handle. Note that if the sparsity of a discrete-tone signal is K, it only contains K/2 different frequencies due to the symmetry of the DFT. The length of the PRBS determines N, which is equal to 1024. With a CR of 4, the number of measurements taken here is M=256. According to the empirical value mentioned above, the CS SAR ADC should recover a signal with at least K=64 by using convex optimization approaches. For comparison, we choose one convex optimization approach named SL0 [Babaie-Zadeh [2010]] and one greedy method named OMP [Tropp and Gilbert [2007]] for signal recovery. The post-reconstruction (PR) SNDR results are presented in Fig. 2.16. For each K value, K/2 frequencies are randomly generated 20 times and the PR SNDR is averaged. Surprisingly, SL0 only shows a slight advantage over OMP, implying that greedy methods can be a more efficient choice for discrete-tone signals considering their low computational complexity. With 50 dB SNDR as the boundary, the maximum K that SL0 and OMP can recover is 84, which is beyond the empirical value of 64. Defining the bandwidth occupancy of a signal as K/N, the CS SAR ADC is able to compressively sense a signal whose bandwidth occupancy is smaller than 8.2%.
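For reference, the numbers in this paragraph follow from simple arithmetic; the subscript in $K_{max}$ is our shorthand for the measured maximum sparsity:

$$M = \frac{N}{CR} = \frac{1024}{4} = 256, \qquad \frac{M}{K} = \frac{256}{64} = 4, \qquad \frac{K_{max}}{N} = \frac{84}{1024} \approx 8.2\%, \qquad \frac{K_{max}}{2} = 42 \text{ tones}.$$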

As examples, Fig. 2.17 shows the time and frequency domain comparisons



Figure 2.16: 20-time average post-reconstruction SNDR versus the sparsity 2K with different reconstruction algorithms.

of the input signals and the SL0-reconstructed signals with K/2 = 1 and K/2 = 12, respectively. When K/2 = 1, the post-reconstruction SNDR reaches the peak value of 61 dB (see Fig. 2.16). Compared to the Nyquist mode, there is a 4.5 dB decrease in the peak SNDR. The reason is that the total signal power is reduced by around 6 dB due to the gain coefficients  $g_i$  in (2.8). However, the kT/C sampling noise is also scaled by  $g_i$ . Therefore, the post-reconstruction SNDR degrades by 4.5 dB in total rather than 6 dB compared to the Nyquist mode. According to (2.7), the reconstruction error consists of two terms: the error in the source itself and the measurement errors. If the source error dominates, the decrease in input signal power will not degrade the reconstruction performance, for any source error will also be scaled by



Figure 2.17: Time and frequency domain comparisons of the discrete-tone signals (in black) and the corresponding SL0-reconstructed signals (in red) with: (a) K/2 = 1 (b) K/2 = 12.

 $g_i$  along with the input signal. This is actually the usual case for most natural signal acquisition systems. Combining all the metrics, the CS SAR ADC achieves a peak Walden figure-of-merit (FoM) of 5.5 fJ/conversion-step in the CS mode, which is 2.3 times better than the Nyquist mode.

Next, we show the capability of the CS SAR ADC to compressively sense natural sparse signals. We take a 1-second long speech signal as an example. At a sampling rate of 16 kHz, the total length of the speech signal is 16000 samples. To reduce the computational complexity, the speech signal is divided into multiple 1024-length frames, each of which undergoes the CS process individually. Adjacent frames overlap by 50% and are windowed to smooth the edges. The reconstruction results are shown in Fig. 2.18. To evaluate the fidelity between the input signal  $\vec{s}$  and the reconstructed signal  $\vec{s}^*$ , we define the signal-to-reconstruction-error ratio (SRER) as,

$$SRER(\vec{s}, \vec{s}^*) = 20\log_{10}\left(\frac{\|\vec{s}\|_{\ell_2}}{\|\vec{s} - \vec{s}^*\|_{\ell_2}}\right). \qquad (2.12)$$

Based on (2.12), the SRER for the speech signal in Fig. 2.18 is 15.3 dB, which proves that the CS SAR ADC can also compressively sense natural sparse signals.
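The SRER in (2.12) is straightforward to evaluate numerically. The following is a minimal sketch on toy data (a sine wave and a copy with a 10% amplitude error), not the measured speech signal; the function name `srer_db` is ours.

```python
import math


def srer_db(s, s_rec):
    """SRER per (2.12): 20*log10(||s||_2 / ||s - s_rec||_2)."""
    signal_norm = math.sqrt(sum(x * x for x in s))
    error_norm = math.sqrt(sum((x - y) ** 2 for x, y in zip(s, s_rec)))
    return 20.0 * math.log10(signal_norm / error_norm)


# Toy data: the "reconstruction" is the input scaled by 0.9, so the error is 0.1 * s.
s = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
s_rec = [0.9 * x for x in s]
print(f"SRER = {srer_db(s, s_rec):.1f} dB")
```

Because the error here is exactly one tenth of the signal, the ratio of norms is 10 and the SRER is 20 dB. A perfect reconstruction would give a zero error norm, which this sketch does not guard against.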

### 2.5.3 Chip Performance Comparison

Although many CS frameworks have been proposed in recent years, very few of them have actually been implemented on chip. To the best of our knowledge, this work is the first fully-passive random demodulator implemented on chip. Table 2.1 summarizes its performance and compares it with state-of-the-art CS works with chip measurement results. Based on the RMPI architecture, [Chen et al. [2012a]] and [Gangopadhyay et al. [2014]] require multi-channel PRBS generation, analog multiplication and active integration, leading to a much larger area and power consumption than this work. Although the NUS work in [Trakimas et al. [2013]] is also fully-passive, it is not applicable to signals that are sparse in the time domain. Besides, this work achieves the highest ENOB and handles the largest maximum sparsity with a much better FoM.

According to Murmann's ADC survey [Murmann [2015]], [van Elzakker et al. [2008]] sets the record for 1 MS/s Nyquist-rate ADCs with a FoM of 4.4 fJ/conversion-step in a 65 nm CMOS process. Since 60% of the power of the proposed CS SAR ADC is consumed by digital circuits, its FoM can be further improved with technology scaling. Therefore, the CS SAR ADC is also very competitive with state-of-the-art low-power ADC works. More importantly, the CS SAR ADC compresses the data by four times, saving data storage/transmission power in the subsequent stages.



Figure 2.18: Comparisons of the 1-second long speech signal  $\vec{s}$  (in blue) and the corresponding SL0-reconstructed signal  $\vec{s}^*$  (in red) in: (a) the time domain and (b) the frequency domain. The error signal (in blue) is  $\vec{s} - \vec{s}^*$  in the time domain.

| Design | [Chen et al. [2012a]] | [Gangopadhyay et al. [2014]] | [Trakimas et al. [2013]] | This work |
|---|---|---|---|---|
| Architecture | RMPI | RMPI | NUS | RD |
| Need OTAs | Yes | Yes | No | No |
| On-chip ADC | No | Yes | Yes | Yes |
| On-chip PRBS generator | Yes | Yes | No | No |
| CMOS technology | 90 nm | 130 nm | 90 nm | 130 nm |
| Supply | 1 V | 0.9 V | 0.9 V | 0.8 V |
| No. of channels | 8 | 64 | 1 | 1 |
| Area | 0.92 mm<sup>2</sup> | 6 mm<sup>2</sup> | 0.15 mm<sup>2</sup> | 0.2 mm<sup>2</sup> |
| Bandwidth | 500 MHz | 1 kHz | 10 MHz | 500 kHz |
| Effective conversion rate | 360 MS/s (CR=2.8) | 0.5 kS/s (CR=4) | 5 MS/s (CR=4) | 250 kS/s (CR=4) |
| Maximum occupancy | 4% | 5% | 4% | 8.2% |
| Resolution | N/A | 10 b | 10 b | 12 b |
| Peak PR SNDR (ENOB) | 40 dB (6.4 b) | 40.6 dB (6.5 b) | 43 dB (6.9 b) | 61 dB (9.8 b) |
| Power | 54 mW | 1.8 $\mu$W | 175 $\mu$W | 5 $\mu$W |
| Peak FoM [/conv-step] | 639 fJ | 9.9 pJ | 73.3 fJ | 5.5 fJ |

$FoM = Power/(2^{ENOB} \times 2 \times BW)$

Table 2.1: Comparison with state-of-the-art CS works
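The FoM row of Table 2.1 can be recomputed from the power, ENOB and bandwidth rows with the formula given under the table. The sketch below does this in plain Python; the dictionary keys are shortened labels of ours, and the inputs are the rounded table entries.

```python
def cs_fom(power_w, enob_bits, bw_hz):
    """FoM as defined under Table 2.1: Power / (2^ENOB * 2 * BW), in J/conv-step."""
    return power_w / (2.0 ** enob_bits * 2.0 * bw_hz)


# (power in W, ENOB in bits, bandwidth in Hz), copied from Table 2.1.
designs = {
    "Chen 2012a": (54e-3, 6.4, 500e6),
    "Gangopadhyay 2014": (1.8e-6, 6.5, 1e3),
    "Trakimas 2013": (175e-6, 6.9, 10e6),
    "This work": (5e-6, 9.8, 500e3),
}
foms = {name: cs_fom(*row) for name, row in designs.items()}
for name, fom in foms.items():
    print(f"{name}: {fom * 1e15:.1f} fJ/conv-step")
```

The first three entries reproduce the tabulated values. The last one comes out slightly above the tabulated 5.5 fJ because the ENOB is rounded to 9.8 b here, whereas the unrounded value derived from 61 dB is closer to 9.84 b.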

# Chapter 3

# Multi-Channel Compressive Sensing SAR ADC

# 3.1 Background

Chapter 2 demonstrates that CS can be embedded into a single SAR ADC to reduce the sampling rate for a sparse signal, thus reducing the ADC power consumption. In many wireless sensing applications, multiple ADCs are commonly required to sense multi-channel signals. For instance, ECG diagnostics need multiple sensing electrodes on a patient's skin to observe the heart muscle activity [Rincon et al. [2009]]. Another example is the capsule endoscope, which uses an RF sensor array to track the precise location of the capsule when an image is taken [Pourhomayoun et al. [2012]]. In brain machine interfaces (BMI), the number of channels can be up to 100 to observe the simultaneous activity of many neurons in specific regions of the brain [Harrison et al. [2007]].

Certainly, the proposed CS SAR ADC in Chapter 2 can be used to relax the

<sup>&</sup>lt;sup>0</sup>This chapter is a partial reprint of the publications: 1) Wenjuan Guo, Youngchun Kim, Ahmed Tewfik, and Nan Sun. Ultra-low power multichannel data conversion with a single SAR ADC for mobile sensing applications. In *IEEE Custom Integr. Circuits Conf.*, pages 1–4, Sept 2015. 2) Wenjuan Guo, Youngchun Kim, Arindam Sanyal, Ahmed Tewfik, and Nan Sun. A single SAR ADC converting multi-channel sparse signals. In *IEEE Int. Symp. Circuits Syst.*, pages 2235–2238, May 2013. I would like to thank Dr. Nan Sun and Dr. Ahmed Tewfik for their supervision on the publications. I would like to thank Dr. Youngchun Kim for his contribution on developing the reconstruction algorithms. I would like to thank Dr. Arindam Sanyal for his advice and helping me with layout.

power constraint for each ADC. However, besides the power constraint, hardware cost is another important limitation for modern wireless sensors. To reduce the hardware cost, time multiplexing and frequency multiplexing are two conventional ways to perform multi-channel A/D conversion with fewer ADCs [Harrison et al. [2007]; Borna and Najafi [2014]]. In time multiplexing, all input channels are converted sequentially according to a multiplexing plan (Fig. 3.1(a)). However, while one channel is being converted, the information of the other channels is not available. Therefore, the effective sampling rate per channel is given by the ADC sampling rate divided by the number of channels. This rate decreases as the number of channels increases, which is undesirable for energy-efficient sampling. In frequency multiplexing, the analog signals are modulated so that they occupy non-overlapping frequency bands, and the sum of the modulated signals is digitized (Fig. 3.1(b)). Its main drawback is that the required ADC sampling rate increases linearly with the number of channels, leading to a large power consumption. Another alternative is to modulate the multi-channel signals with an orthogonal matrix such as the Walsh-Hadamard matrix and use a single ADC to digitize the orthogonal superposition of the channels (Fig. 3.1(c)) [Majidzadeh et al. [2013]]. Since an orthogonal matrix is a square matrix, the single ADC still needs to sample the same amount of data from all the channels. Therefore, the ADC sampling rate is not relaxed and the power consumption is still high.

To address both the power constraint and the hardware constraint, we extend the single-channel CS SAR ADC in Chapter 2 to a multi-channel CS SAR ADC architecture. The proposed multi-channel CS SAR ADC uses a single SAR ADC



Figure 3.1: Conventional multi-channel A/D conversion with fewer ADCs: (a) time multiplexing, (b) frequency multiplexing, and (c) Walsh-Hadamard Coding.

sampled at the Nyquist rate of one channel to convert multi-channel sparse signals simultaneously. Compared to conventional multi-channel ADCs, the proposed architecture not only saves power and hardware cost, but also avoids problems such as timing skew, offset mismatch, and gain mismatch among channels.

Although in recent years CS has been actively exploited to reconstruct a

single-channel signal at a sub-Nyquist rate, very few previous works cover its feasibility for multi-channel ADCs, except the compressive multiplexer (CMUX) from [Slavinsky et al. [2011]]. Compared to CMUX, our architecture is more hardware-efficient, for we do not need an additional resistor bank to do the averaging among channels but integrate everything into a single SAR ADC. To the best of our knowledge, our work is the first to demonstrate chip measurement results for CS-based multi-channel ADCs.

In many multi-channel data sensing systems such as multi-lead ECG sensing, the signals of different channels actually come from the same source, but are contaminated with different noise, or have different amplitudes and phases when measured at different locations. Therefore, all channel signals can be highly correlated. In this case, instead of reconstructing each channel signal separately, all channel signals can be reconstructed jointly, thus greatly reducing the computational complexity of signal recovery. In this work, we investigate a joint greedy pursuit algorithm called Simultaneous Orthogonal Matching Pursuit (SOMP) [Tropp et al. [2005]]. Both discrete-tone signals and multi-lead ECG signals are used to verify the effectiveness of this algorithm in our proposed system. Besides reducing the computational complexity, chip measurement results show that SOMP can also greatly improve the reconstruction performance for highly-correlated input signals compared to the conventional OMP [Tropp and Gilbert [2007]].

This chapter is organized as follows. First, the proposed multi-channel CS SAR ADC architecture is introduced. Second, the conventional OMP procedure is described to recover the input signals, laying a foundation for the explanation

of SOMP. The circuit implementation of a 12-bit 4-channel CS-based ADC is then presented. The chapter concludes with the chip measurement results and performance comparisons with prior single-channel CS works.

### **3.2 Proposed Architecture**

Fig. 3.2 gives the overall architecture of our proposed scheme. Initially, M-channel input signals are sampled and randomized after multiplying with M pseudo-random binary sequences of plus and minus ones (±1) (PRBS). Next, the sampled values of randomized inputs are averaged. The single ADC converts the average result to digital sequences, from which a reconstruction algorithm separates and restores all channels' input signals.



Figure 3.2: Proposed system architecture with M channels.

As in the single-channel CS SAR ADC in Chapter 2, we effectively integrate the multiplication blocks, the sample-and-hold (S/H) circuits, and the averaging block within a single SAR ADC architecture. The number of channels M can also be easily extended to any power of 2 without additional hardware cost. Mathematically speaking, the  $m^{th}$  channel input signal,  $\vec{s}_m \in \mathbb{R}^N$ , can be represented as (3.1) when expanded over the N-point discrete Fourier transform (DFT) basis,  $\Phi \in \mathbb{C}^{N \times N}$ .  $\Phi$  consists of N column vectors,  $\vec{\varphi}_n = \frac{1}{\sqrt{N}} \exp\{-j2\pi k(n-1)/N\}$ , where  $n = 1, 2, \dots, N$ , and  $k = 0, \pm 1, \dots, \pm \lceil N/2 - 1 \rceil$ .

$$\vec{s}_m = \Phi \vec{\alpha}_m = \sum_{n=1}^N \alpha_{m,n} \vec{\varphi}_n, \qquad (3.1)$$

where  $\alpha_{m,n}$  is the  $n^{th}$  coefficient of  $\vec{s}_m$  corresponding to  $\vec{\varphi}_n$ .

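 $\vec{r}_m$  is the result of  $\vec{s}_m$  mixed with the  $m^{th}$  channel PN sequence,  $\vec{p}_m$ , where  $\otimes$  denotes element-wise multiplication,

$$\vec{r}_m = \vec{p}_m \otimes \vec{s}_m = \left[p_{m,1} s_{m,1},\ p_{m,2} s_{m,2},\ \cdots,\ p_{m,N} s_{m,N}\right]^T, \qquad (3.2)$$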

where  $p_{m,n}$  and  $s_{m,n}$  are the  $n^{th}$  coefficients of  $\vec{p}_m$  and  $\vec{s}_m$ , respectively.

After the multiplication, all the modulated signals are averaged to give  $\vec{y}$  as,

$$\vec{y} = \frac{1}{M} \sum_{m=1}^{M} \vec{r}_m. \qquad (3.3)$$

Plugging (3.1) and (3.2) into (3.3),  $\vec{y}$  can be rewritten as,

$$\vec{y} = \frac{1}{M} \sum_{m=1}^{M} \sum_{n=1}^{N} \alpha_{m,n} (\vec{p}_m \otimes \vec{\varphi}_n) = \sum_{m=1}^{M} \sum_{n=1}^{N} \alpha_{m,n} \vec{b}_{m,n}, \qquad (3.4)$$

where  $\vec{b}_{m,n} = \frac{1}{M} \vec{p}_m \otimes \vec{\varphi}_n$ . With  $\psi_m = \left[\vec{b}_{m,1}, \vec{b}_{m,2}, \cdots, \vec{b}_{m,N}\right], \Psi = [\psi_1, \psi_2, \cdots, \psi_M],$ and  $\vec{\alpha}^T = [\vec{\alpha}_1^T, \vec{\alpha}_2^T, \cdots, \vec{\alpha}_M^T]$ , (3.4) can be simplified to,

$$\vec{y} = \sum_{m=1}^{M} \psi_m \vec{\alpha}_m = \Psi \vec{\alpha}. \qquad (3.5)$$

To make (3.5) easier to understand, Fig. 3.3 shows its graphic interpretation. As can be seen,  $\Psi$  is a known  $N \times W$  matrix, and  $\vec{\alpha}$  is an unknown  $W \times 1$  vector, where  $W = M \times N$ .  $\vec{y}$  is an  $N \times 1$  vector measured by the ADC. This forms an under-determined equation in a CS framework. To solve this equation, we can apply classical linear/convex optimization approaches, or greedy approaches [Kim et al. [2012]]. Once  $\vec{\alpha}$  is accurately estimated, the input signals can be reconstructed based on (3.1). The reconstructed signals are represented as  $\vec{s}_m^*$  in Fig. 3.2.
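To make the measurement model concrete, the following sketch builds a toy instance and checks numerically that averaging the PRBS-modulated channels, (3.1) to (3.3), gives the same vector as the matrix form $\Psi\vec{\alpha}$ in (3.5). The sizes (N = 8, M = 2), the coefficient values, the random seed and the DFT index convention are assumptions chosen only for illustration.

```python
import cmath
import random

random.seed(0)
N, M = 8, 2  # toy sizes; real frames are much longer

# phi[n][t]: t-th sample of the n-th DFT basis vector (index convention assumed).
phi = [[cmath.exp(-2j * cmath.pi * n * t / N) / N ** 0.5 for t in range(N)]
       for n in range(N)]

# Sparse coefficient vectors alpha_m (one nonzero entry per channel here).
alpha = [[0j] * N for _ in range(M)]
alpha[0][1] = 1.0 + 0j
alpha[1][3] = 0.5j

# One +/-1 PRBS per channel.
p = [[random.choice((-1, 1)) for _ in range(N)] for _ in range(M)]

# Path 1: synthesize s_m (3.1), modulate element-wise (3.2), average (3.3).
s = [[sum(alpha[m][n] * phi[n][t] for n in range(N)) for t in range(N)]
     for m in range(M)]
y_direct = [sum(p[m][t] * s[m][t] for m in range(M)) / M for t in range(N)]

# Path 2: build Psi column by column, b_{m,n} = (1/M) p_m (element-wise) phi_n,
# with column (m, n) stored at position m*N + n, then form y = Psi alpha (3.5).
columns = [[p[m][t] * phi[n][t] / M for t in range(N)]
           for m in range(M) for n in range(N)]
alpha_stacked = [a for m in range(M) for a in alpha[m]]
y_matrix = [sum(columns[j][t] * alpha_stacked[j] for j in range(M * N))
            for t in range(N)]

max_diff = max(abs(a - b) for a, b in zip(y_direct, y_matrix))
print(f"Psi is {N} x {len(columns)}; max |y_direct - y_matrix| = {max_diff:.2e}")
```

The two paths agree to within floating-point rounding, and the shape of $\Psi$ (N rows, W = M × N columns) makes the under-determined nature of the problem explicit.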



Figure 3.3: Under-determined equation in the proposed CS framework.

## **3.3 Introduction to OMP and SOMP**

As we mentioned in Chapter 2, the condition for CS is that  $\vec{\alpha}$  is a sparse vector. If  $\vec{\alpha}$  only contains K non-zero elements, the measurement vector  $\vec{y} = \Psi \vec{\alpha}$  can be regarded as a linear combination of K columns of  $\Psi$ . The idea behind OMP is to find the right K columns in a greedy fashion. The first iteration picks the column of  $\Psi$  that is most correlated with  $\vec{y}$ . Then we subtract its contribution from  $\vec{y}$  and get a residual vector  $\vec{e}^1$ . The second iteration repeats the steps above on  $\vec{e}^1$ . It is hoped that the right set of columns can be found after K iterations. The detailed OMP procedure can be described as:

1. Initialization

   The iteration count q = 0. The residual  $\vec{e}^0 = \vec{y}$ . The chosen column set  $\Psi^0 = \emptyset$  and its corresponding index set  $\Lambda^0 = \emptyset$ .

2. While the stopping criterion is not met

   (a) q = q + 1.

   (b) Find the index of the column with the maximum correlation to the residual:

   $$i^{q} = \operatorname*{argmax}_{j} \left| \left\langle \vec{b}_{j}, \vec{e}^{q-1} \right\rangle \right|, j \in \{1, 2, \cdots, W\} - \Lambda^{q-1}.$$

   Note that  $\vec{b}_j$  is the  $j^{th}$  column of  $\Psi$ . Different from (3.4), here we use a one-dimensional index to denote the column location.

   (c) Update the chosen column set and its corresponding index set:

   $$\Psi^q = \left[\Psi^{q-1}, \vec{b}_{i^q}\right], \Lambda^q = \Lambda^{q-1} \cup i^q.$$

   (d) Solve a least-squares problem to obtain a new signal estimate:

   $$\vec{\alpha}^q = \operatorname*{argmin}_{\vec{\alpha}} \|\vec{y} - \Psi^q \vec{\alpha}\|_{\ell_2}.$$

   (e) Subtract the contribution of the new approximation and calculate the new residual:

   $$\vec{y}^q = \Psi^q \vec{\alpha}^q, \vec{e}^q = \vec{y} - \vec{y}^q.$$

   (f) Check the stopping criterion.

3. The final solution  $\vec{\alpha}^*$  consists of nonzero entries at the index set  $\Lambda^q$ , whose values are equal to  $\vec{\alpha}^q$ , and zero entries at the index set  $\{1, 2, \dots, W\} - \Lambda^q$ .
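The steps above can be sketched in plain Python. This is a minimal illustration, not the reconstruction code used for the measurements: it assumes a real-valued toy dictionary (a 16 × 16 identity concatenated with a normalized Hadamard matrix) instead of the complex PRBS-modulated DFT dictionary, uses a fixed iteration count as the stopping criterion, and solves the least-squares step through the normal equations. All function names are ours.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve a small square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def omp(cols, y, n_iter):
    """Greedy recovery of a sparse alpha from y = Psi alpha; cols[j] is column j of Psi."""
    residual, support, coef = y[:], [], []
    for _ in range(n_iter):
        # Step 2(b): column most correlated with the residual, excluding chosen indices.
        i = max((j for j in range(len(cols)) if j not in support),
                key=lambda j: abs(dot(cols[j], residual)))
        support.append(i)  # Step 2(c)
        # Step 2(d): least squares on the chosen columns via the normal equations.
        gram = [[dot(cols[a], cols[b]) for b in support] for a in support]
        coef = solve(gram, [dot(cols[a], y) for a in support])
        # Step 2(e): subtract the current approximation to get the new residual.
        approx = [sum(c * cols[j][t] for c, j in zip(coef, support)) for t in range(len(y))]
        residual = [yt - at for yt, at in zip(y, approx)]
    alpha = [0.0] * len(cols)  # Step 3: zero everywhere except on the support.
    for c, j in zip(coef, support):
        alpha[j] = c
    return alpha


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = [[1.0]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


# Toy dictionary: 16 identity columns followed by 16 unit-norm Hadamard columns.
identity = [[1.0 if t == j else 0.0 for t in range(16)] for j in range(16)]
cols = identity + [[v / 4.0 for v in row] for row in hadamard(16)]

# 2-sparse ground truth: 1.0 on column 3 and -0.7 on column 21.
y = [1.0 * cols[3][t] - 0.7 * cols[21][t] for t in range(16)]
alpha_hat = omp(cols, y, n_iter=2)
print({j: round(a, 6) for j, a in enumerate(alpha_hat) if abs(a) > 1e-9})
```

For this low-coherence dictionary and a 2-sparse input, two iterations recover both the support and the coefficient values. In practice the stopping criterion would more likely be a residual-energy threshold or a known bound on the sparsity.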

Since OMP only finds one column in every iteration, the number of iterations q should be at least the maximum sparsity of the input signal  $K_{max}$ . With  $q = K_{max}$ , the computational complexity of the OMP algorithm is  $O(K_{max}NW)$ , which is dominated by Step 2.(b). According to [Tropp and Gilbert [2007]], the computational complexity of solving the  $\ell_1$  minimization problem in (2.4) can reach

 $O(N^2W^{3/2})$ . Since  $W \gg N$  and  $N \gg K_{max}$ , OMP has a much lower computational complexity than convex optimization approaches, which is also a common merit of most greedy methods. However, convex optimization approaches do not require prior knowledge of  $K_{max}$  and usually give a more accurate and robust result.

When all channel signals are highly correlated, their spectra tend to look similar, implying that  $\vec{\alpha}_1, \vec{\alpha}_2, \dots, \vec{\alpha}_M$  have non-zero entries at similar locations. In this case, SOMP is more efficient than OMP: it breaks  $\vec{\alpha}$  into  $\vec{\alpha}_1, \vec{\alpha}_2, \dots, \vec{\alpha}_M$  and finds a common index set indicating the locations of the nonzero entries in all of them. The detailed SOMP procedure can be described as:

1. Initialization

   The iteration count q = 0. The residual  $\vec{e}^0 = \vec{y}$ . The common index set  $\Lambda^0 = \emptyset$  and its corresponding column set for each channel  $\psi_1^0 = \emptyset, \psi_2^0 = \emptyset, \cdots, \psi_M^0 = \emptyset$ .

2. While the stopping criterion is not met

   (a) q = q + 1.

   (b) Find a common index for all the channels whose corresponding columns have the maximum sum correlation to the residual:

   $$i^{q} = \operatorname*{argmax}_{j} \sum_{m=1}^{M} \left| \left\langle \vec{b}_{m,j}, \vec{e}^{q-1} \right\rangle \right|, j \in \{1, 2, \cdots, N\} - \Lambda^{q-1}.$$

   Note that  $\vec{b}_{m,j}$  is the  $[(m-1) \times N + j]^{th}$  column of  $\Psi$ . Same as (3.4), here we use a two-dimensional index to denote the column location.

   (c) Update the common index set and its corresponding column set for each channel:

   $$\Lambda^{q} = \Lambda^{q-1} \cup i^{q},$$

   $$\psi_{1}^{q} = \left[\psi_{1}^{q-1}, \vec{b}_{1,i^{q}}\right], \psi_{2}^{q} = \left[\psi_{2}^{q-1}, \vec{b}_{2,i^{q}}\right], \cdots, \psi_{M}^{q} = \left[\psi_{M}^{q-1}, \vec{b}_{M,i^{q}}\right].$$

   (d) Combine all the channels' column sets and solve a least-squares problem to obtain a new signal estimate:

   $$\Psi^q = [\psi_1^q, \psi_2^q, \cdots, \psi_M^q], \vec{\alpha}^q = \operatorname*{argmin}_{\vec{\alpha}} \|\vec{y} - \Psi^q \vec{\alpha}\|_{\ell_2}.$$

   (e) Subtract the contribution of the new approximation and calculate the new residual:

   $$\vec{y}^q = \Psi^q \vec{\alpha}^q, \vec{e}^q = \vec{y} - \vec{y}^q.$$

   (f) Check the stopping criterion.

3. The final solution  $\vec{\alpha}^{*T} = [\vec{\alpha}_1^{*T}, \vec{\alpha}_2^{*T}, \cdots, \vec{\alpha}_M^{*T}]$ . Each  $\vec{\alpha}_m^*$  consists of nonzero entries at the index set  $\Lambda^q$ , whose values are equal to  $\vec{\alpha}_m^q$ , and zero entries at the index set  $\{1, 2, \cdots, N\} - \Lambda^q$ .
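The step that distinguishes SOMP from OMP is the joint index selection in Step 2.(b). A minimal sketch of that step alone is given below; the data layout (`psi[m][j]` holding column j of channel m's sub-matrix) and the tiny real-valued example are assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def somp_select(psi, residual, chosen):
    """Step 2(b) of SOMP: the common index j maximizing sum_m |<b_{m,j}, e>|.

    psi[m][j] is column j of channel m's sub-matrix psi_m;
    chosen is the set of indices already selected.
    """
    n_cols = len(psi[0])
    candidates = [j for j in range(n_cols) if j not in chosen]
    return max(candidates,
               key=lambda j: sum(abs(dot(psi[m][j], residual)) for m in range(len(psi))))


# Toy example with M = 2 channels and 3 candidate indices (columns of length 3).
e0, e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
psi = [[e0, e1, e2],   # channel 1 sub-matrix
       [e1, e2, e0]]   # channel 2 sub-matrix
residual = [3.0, 0.5, 2.8]

first = somp_select(psi, residual, chosen=set())
second = somp_select(psi, residual, chosen={first})
print(first, second)
```

Here the summed correlations for indices 0, 1 and 2 are 3.5, 3.3 and 5.8, so index 2 is selected first and index 0 second. The remaining steps (least squares and residual update) proceed as in OMP, applied to the M columns added per iteration.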

As can be seen, the major difference between SOMP and OMP occurs in Step 2.(b). By maximizing the sum of each channel's absolute correlations, SOMP aims to find a column set which contributes the most energy to as many of the input signals as possible [Tropp et al. [2005]]. Combining all the channels, M columns are chosen simultaneously from  $\Psi$  in every iteration. Therefore, the required number of iterations q can be reduced to  $K_{max}/M$ . In other words, SOMP reduces the computational complexity by M times compared to OMP.

## **3.4** Circuit Implementation

To verify our proposed architecture, a 12-bit SAR ADC capable of simultaneously converting 4-channel sparse signals was designed using a 0.13  $\mu$ m CMOS process. Although the actual design is fully-differential, a single-ended version is shown in Fig. 3.4 for simplicity.  $V_{refn}, V_{refp}$ , and  $V_{cm}$  represent the negative reference, positive reference, and common-mode voltage, respectively. In our design,  $V_{refn}$  is the ground voltage. Therefore, in some places of the manuscript, the positive reference voltage ( $V_{refp}$ ) is also described as the reference voltage ( $V_{ref}$ ).

In Fig. 3.4, each multiplication block is directly implemented using 4 switches, for the multiplication between a differential signal and a PRBS is equivalent to changing the polarity of the signal. As in the single-channel CS SAR ADC in Chapter 2, the comparator is a strong-arm latch, whose details are shown in Fig. 2.9. The major difference between the multi-channel CS SAR ADC and the single-channel CS SAR ADC is that the DAC array needs to simultaneously sample 4-channel input signals. This simultaneous operation also

simplifies the SAR logic, which can be easily implemented in a synchronous fashion. Besides, a novel low-power switching technique is proposed, which not only preserves the 2-extra-bit advantage from [Sanyal and Sun [2014]] but also avoids the common-mode voltage variation. This section focuses mainly on the implementation details of these new techniques.



Figure 3.4: Proposed circuit diagram of a CS-based 12-bit 4-channel SAR ADC. The switches controlled by the sampling clock,  $\phi_1$ , the calibration signal, cal, and the digital outputs,  $d\langle 11:0 \rangle$ , are labeled in green, blue and red, respectively.

### 3.4.1 Low Power Switching Without Common-Mode Voltage Variation

A novel low-power switching technique is proposed. Compared to the conventional switching technique, the proposed technique provides 2 extra bits, thus reducing the switching power by 4 times assuming the same unit capacitance. Fig. 3.5 shows an example in which a 1-bit DAC array gives 3-bit resolution. The proposed technique builds upon the switching scheme in [Sanyal and Sun [2014]], but maintains a constant comparator common-mode voltage by first discharging the 2 capacitors of the current bit to Gnd before firing the comparator (to  $V_{cm}$  for the last unit capacitor). With the proposed switching technique, a 10-bit DAC array is used here to achieve 12-bit resolution.
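As a rough check of the factor of 4, consider the total array capacitance. The comparison below assumes, for illustration only, that a conventional N-bit binary-weighted array uses $2^N$ unit capacitors $C_u$ and that the switching energy scales approximately with the total array capacitance:

$$C_{conv} = 2^{12} C_u = 4096\,C_u, \qquad C_{prop} = 2^{10} C_u = 1024\,C_u, \qquad \frac{C_{conv}}{C_{prop}} = 2^{2} = 4.$$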



Figure 3.5: Proposed switching technique for a 3-bit SAR ADC.

### 3.4.2 DAC Arrangement

The linearity of a SAR ADC system is limited by the capacitor mismatch of the digital-to-analog converter (DAC). To deal with this issue, the whole 10-bit DAC array is divided into two segments, which we call the MSB DAC and the LSB DAC (see

Fig. 3.4). The error induced by the LSB DAC capacitor mismatch is less than the quantization noise, so it is negligible. Therefore, we only need to calibrate the MSB DAC capacitor mismatch. The LSB DAC is used to estimate the mismatch error of the MSB DAC. At the same time, the MSB DAC incorporates a redundant capacitor ( $2^4C$ ) whose value is equal to the sum of all the capacitors in the LSB DAC. This capacitor provides the redundancy required by the calibration. It also facilitates the sampling of the input signals, for they only need to be sampled on the MSB DAC. The LSB DAC is set to sample  $V_{cm}$ .

Fig. 3.6 shows the steps for calibrating the mismatch between the largest capacitor  $(2^9C)$  and the remaining MSB capacitors. For brevity, here we use -1, 1, and 0 to denote  $V_{refn}, V_{refp}$ , and  $V_{cm}$ , respectively. As can be seen,  $\{1,-1,-1,-1,-1,-1,-1\}$  is first sampled on the MSB DAC. After sampling, we force the MSB DAC to connect to  $\{-1,1,1,1,1,1,1\}$ . If this capacitor has no mismatch, the comparator's input voltage should be 0. If there is mismatch, the comparator's input voltage is non-zero and is converted to a digital sequence by the LSB DAC. Based on the calibration equations in [Lee et al. [1984]], we similarly calibrate the other capacitors in the MSB DAC.
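The calibration principle can be illustrated with a charge-conservation sketch. It assumes a seven-capacitor MSB DAC ($2^9C$ down to $2^4C$ plus the redundant $2^4C$), a floating top plate after sampling, and no parasitics or LSB DAC loading; these simplifications and the function name are ours, not a description of the actual circuit.

```python
def top_plate_shift(caps, sampled, forced):
    """Top-plate voltage after switching the bottom plates from `sampled` to `forced`.

    Charge conservation on a floating top plate gives
    Vx = sum(C_i * (forced_i - sampled_i)) / sum(C_i),
    in the same normalized units as the codes (-1, 0, 1 for V_refn, V_cm, V_refp).
    """
    return sum(c * (f - s) for c, s, f in zip(caps, sampled, forced)) / sum(caps)


sampled = [1, -1, -1, -1, -1, -1, -1]
forced = [-1, 1, 1, 1, 1, 1, 1]

# Ideal MSB DAC in unit capacitors: the largest capacitor equals the sum of the rest.
ideal_caps = [512, 256, 128, 64, 32, 16, 16]
shift_ideal = top_plate_shift(ideal_caps, sampled, forced)

# Hypothetical mismatch: the largest capacitor is 2 units too big.
mismatched_caps = [514, 256, 128, 64, 32, 16, 16]
shift_mismatch = top_plate_shift(mismatched_caps, sampled, forced)

print(shift_ideal, shift_mismatch)
```

In the ideal case the two contributions cancel and the comparator input stays at zero. With the hypothetical mismatch, a small negative residue proportional to the capacitor error appears, which is the quantity the LSB DAC digitizes.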

### 3.4.3 4-Channel Sampling

In the sampling phase, all the inputs are sampled on the MSB DAC while the LSB DAC samples  $V_{cm}$ . In order to sample 4 channels simultaneously, we halve the largest capacitor ( $2^9C$ ) and evenly distribute the MSB DAC among the 4 channels, as illustrated in Fig. 3.7. As can be seen, the largest capacitor in the MSB DAC ( $2^9C$ )



Figure 3.6: Calibration steps for the largest capacitor  $2^9C$ .

is halved into two  $2^{8}C$  capacitors to sample the  $1^{st}$  and  $2^{nd}$  channels. The next  $2^{8}C$  samples the  $3^{rd}$  channel and the remaining MSB capacitors sample the  $4^{th}$  channel. The number of channels can also be extended to any power of 2 by further halving more capacitors. When the bottom-plate sampling switch is closed, the 4-channel averaging is completed on-the-fly, avoiding the integrators required by many single-channel CS works [Chen et al. [2011, 2012a]; Yoo et al. [2012a,b]].
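The capacitor allocation described above gives every channel the same total capacitance, which is why charge sharing produces a plain average. The sketch below checks this under simplifying assumptions of ours: an MSB DAC made of $2^8C$, $2^8C$, $2^8C$, $2^7C$, $2^6C$, $2^5C$, $2^4C$ and the redundant $2^4C$, no LSB DAC loading, and arbitrary example input voltages and PRBS signs.

```python
# MSB DAC after halving 2^9 C, in unit capacitors, and the channel each one samples.
msb_caps = [256, 256, 256, 128, 64, 32, 16, 16]
channel_of_cap = [0, 1, 2, 3, 3, 3, 3, 3]

# Total capacitance assigned to each of the 4 channels.
per_channel = [0, 0, 0, 0]
for cap, ch in zip(msb_caps, channel_of_cap):
    per_channel[ch] += cap

# Example inputs (arbitrary values) and PRBS signs for one sampling instant.
inputs = [0.30, -0.10, 0.25, 0.05]
prbs = [1, -1, -1, 1]

# Capacitance-weighted combination seen after charge sharing vs. a plain average.
weighted = sum(per_channel[m] * prbs[m] * inputs[m] for m in range(4)) / sum(per_channel)
plain = sum(prbs[m] * inputs[m] for m in range(4)) / 4

print(per_channel, weighted, plain)
```

Each channel ends up with 256 unit capacitors, so the weighted and plain averages coincide. Extending to 8 channels would follow the same pattern by halving further capacitors.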



Figure 3.7: DAC configuration for 4-channel sampling.

By integrating all the blocks into a single SAR ADC, we not only save power, signal bandwidth, and area, but also avoid problems such as timing skew, offset mismatch, and gain mismatch among channels. There is no timing skew because the sampling of all channels is controlled by a single bottom-plate sampling switch. The use of one comparator ensures that there is no offset mismatch. Since the gain mismatch is due to capacitor mismatch, it can be compensated along with the capacitor calibration. More importantly, the overall ADC offset can be easily calibrated by taking advantage of the inherent multiplication between the input and the PRBS. We only need to take the mean of the ADC output to obtain the offset, because $V_{os} = mean(\vec{p} \otimes \vec{s} + V_{os})$. Removing the offset is necessary to ensure reconstruction accuracy, as otherwise the PRBS turns it into large noise.
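The offset estimate $V_{os} = mean(\vec{p} \otimes \vec{s} + V_{os})$ can be checked numerically with a minimal single-channel sketch. It assumes a zero-mean $\pm 1$ sequence, with Python's seeded pseudo-random generator standing in for the on-chip PRBS. The test tone, offset value and sample count are arbitrary illustrative choices.

```python
import math
import random


def estimate_offset(num_samples=20000, offset=0.05, seed=1):
    """Average the modulated output; the +/-1 sequence averages the signal away."""
    rng = random.Random(seed)
    total = 0.0
    for n in range(num_samples):
        s = 0.5 * math.sin(2 * math.pi * 0.0137 * n)   # illustrative input signal
        p = rng.choice((-1, 1))                        # stand-in for one PRBS chip
        total += p * s + offset                        # ADC output including offset
    return total / num_samples


estimated = estimate_offset()   # close to the injected 0.05 offset
```

The estimate converges only statistically, so its accuracy improves with the number of averaged samples.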

### 3.4.4 Synchronous Clock Generation and SAR Logic

Unlike the single-channel CS SAR ADC in Chapter 2, which requires 4 time-interleaving phases, the multi-channel CS SAR ADC has exactly the same operation timing as a conventional SAR ADC. All the channels are sampled simultaneously in the sampling phase $\phi_1$. After sampling, the channels are averaged on-the-fly and the conversion phases start. Therefore, the SAR logic can be easily implemented synchronously. Fig. 3.8 shows the timing diagram of the 12-bit 4-channel CS SAR ADC. As can be seen, a 4-bit ripple counter divides the master clock into 16 cycles. The 1<sup>st</sup> cycle is used for the sampling phase $\phi_1$. *lat* is the clock signal for the comparator: when it is high, the comparator makes a decision, and when it is low, the comparator is reset. After sampling, the DAC array is switched to an initial sequence {1,-1,-1,-1,-1,-1} (see Fig. 3.5) to be ready for the first-bit comparison. The 2<sup>nd</sup> cycle is used for the DAC to settle to the initial sequence. The *lat* signal starts at the $3^{rd}$ cycle and ends at the $15^{th}$ cycle, producing 13 digital outputs, one of which is a redundant bit. The final cycle, *Output*, is used to store the digital outputs. Fig. 3.9 shows the synchronous SAR logic architecture. Compared to the asynchronous SAR logic architecture in Fig. 2.10, the major difference is that the comparator is clocked by a synchronous *lat* signal rather than self-clocked.
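The 16-cycle schedule described above can be summarized in a short sketch. The role names are labels of ours rather than signal names from the chip.

```python
def cycle_role(cycle):
    """Role of each of the 16 master-clock cycles (1-indexed)."""
    if not 1 <= cycle <= 16:
        raise ValueError("the 4-bit counter defines cycles 1 to 16")
    if cycle == 1:
        return "sample"        # sampling phase phi_1
    if cycle == 2:
        return "dac_settle"    # DAC settles to the initial sequence
    if cycle <= 15:
        return "compare"       # lat is active, one decision per cycle
    return "output"            # digital outputs are stored


schedule = [cycle_role(c) for c in range(1, 17)]
comparisons = schedule.count("compare")   # 13 decisions, one of them redundant
```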



Figure 3.8: Proposed 12-bit 4-channel CS SAR ADC timing diagram.

## 3.5 Measurement Results

The proposed 12-bit 4-channel CS SAR ADC was fabricated in a 0.13  $\mu$ m CMOS process, occupying an area of 0.39 mm<sup>2</sup>. Fig. 3.10 shows its die photo and



Figure 3.9: Synchronous SAR logic architecture.

layout. The chip is designed for a 0.8 V supply and a sampling frequency of 1 MS/s. The total DAC capacitance is 2.1 pF $\times$ 2 with a unit capacitor of 2 fF. The chip is tested in two modes: the Nyquist mode and the CS mode. In the Nyquist mode, the PRBS is always set to 1 and the performance of the SAR ADC itself is measured. In the CS mode, the PRBS is generated by an external FPGA and fed into the chip after level shifting. To fully demonstrate the CS performance of the chip, two classes of input scenarios are investigated: 1) all the channels are independent and 2) all the channels are highly correlated. In scenario 1), both discrete-tone signals and real-world ECG signals are tested. The effectiveness of the SOMP algorithm is demonstrated. We conclude this section with a performance comparison between the 4-channel CS SAR ADC and prior single-channel CS works.



Figure 3.10: Chip die photo and layout.

### 3.5.1 Nyquist-Mode Measurement Results

To verify the performance of the SAR ADC itself, we first apply no PRBS and set all $\vec{p}_m$ to 1. With all the channels sharing the same input signal, the 4-channel CS SAR ADC works the same as a conventional SAR ADC. Fig. 3.11(a) shows that the measured SFDR and SNDR are 76 dB and 66 dB, respectively, up to the Nyquist rate. Fig. 3.11(b) shows the SFDR and SNDR trends with different input amplitudes. To show that the ADC can sense 4-channel input signals simultaneously, Fig. 3.12 shows the measured output spectrum when -3 dBFS sinusoidal waves at 100.016 kHz, 200.016 kHz, 300.016 kHz, and 400.016 kHz are applied to the four channels, respectively. Since the ADC output is the average of the four channels, its spectrum should contain 4 tones at these frequencies, each at around -15 dBFS. Operating at 0.8 V and 1 MS/s, the ADC consumes 34 $\mu$W in total. Fig. 3.13 shows the detailed power breakdown. As can be seen, around 74% of the power comes from the digital portion, which can be greatly reduced by process scaling. Combining all the metrics, the ADC achieves a FoM of 20.4 fJ/conversion-step in the Nyquist mode.
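The expected level of each tone in Fig. 3.12 follows from the 4-channel averaging, which scales every channel's amplitude by 1/4. The short sketch below works through this arithmetic; the helper name is ours.

```python
import math


def averaged_tone_level_dbfs(input_dbfs, num_channels):
    """Level of one channel's tone after averaging over `num_channels` channels."""
    return input_dbfs - 20 * math.log10(num_channels)


level = averaged_tone_level_dbfs(-3.0, 4)   # about -15 dBFS
```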

### 3.5.2 CS-Mode Measurement Results

Next, we apply the PRBS and test the chip in the CS mode. Fig. 3.14 shows the block diagram of the test bench. The 4-channel input signals come from four arbitrary waveform generators (AWGs). The 4-channel PRBSs come from the FPGA. A logic analyzer captures the chip's digital outputs and sends them to a PC, where all channel signals are reconstructed. We have also recorded live demonstrations that show how we test the performance of the ADC and reconstruct single-tone signals, multi-tone signals, and music signals. The video can be found at the following link: http://youtu.be/AWraL5m-X9A.

#### 3.5.2.1 Discrete-Tone Signals

Figure 3.11: Measured ADC performance without PRBS. (a) SFDR & SNDR vs. input frequency, and (b) SFDR & SNDR vs. input amplitude.

![](_page_78_Figure_0.jpeg)

Figure 3.12: Measured output spectra when inputting a 100.016 kHz, 200.016 kHz, 300.016 kHz, and 400.016 kHz -3 dBFS sinusoidal wave to each channel, respectively.

![](_page_78_Figure_2.jpeg)

Figure 3.13: Power breakdown.

![](_page_79_Picture_0.jpeg)

Figure 3.14: Test bench diagram.

First, discrete-tone signals consisting of multiple sinusoidal waveforms at different frequencies are used to measure the maximum sparsity the CS SAR ADC can deal with. Since the CS SAR ADC has four channels, we define the total channel occupancy as $\sum_{m=1}^{4} K_m/(N/2)$, where $K_m$ is the number of frequencies in the $m^{th}$-channel signal and $N$ is the length of the PRBS. Here we use 4-channel 512-length PRBSs. To simplify the testing, each channel has the same number of frequencies. In scenario 1), when all channel signals are independent, the frequencies of each channel signal are randomly generated. With different channel occupancies and reconstruction algorithms, we measure the 4-channel average post-reconstruction SNDR. Two convex optimization methods (CVX and SL0) and two greedy algorithms (OMP and CoSaMP) are tested [Kim et al. [2012]]. For the greedy methods, $K_{max}$ for each channel is set to 100. As shown in Fig. 3.15, CVX and SL0 achieve the peak SNDR of 66 dB and the maximum occupancy of 41% (26 tones per channel), respectively. Fig. 3.16 shows the measured time-domain and frequency-domain results of the input signals $s_m$ and the reconstructed signals $\hat{s}_m$ via the SL0 method in the single-tone case and the 26-tone case, respectively.
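The total channel occupancy defined above can be evaluated directly. The sketch below uses the 512-length PRBS with the 26-tone case and the 21-tone case reported in this section; the function name is ours.

```python
def total_channel_occupancy(tones_per_channel, prbs_length):
    """Sum of K_m over all channels divided by N/2."""
    return sum(tones_per_channel) / (prbs_length / 2)


independent = total_channel_occupancy([26] * 4, 512)   # 104 / 256, about 41%
correlated = total_channel_occupancy([21] * 4, 512)    # 84 / 256, about 33%
```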

To demonstrate the effectiveness of the SOMP algorithm, we also test scenario 2) when all channel signals are highly correlated. To simplify the testing, we let all the channels share the same signal. With different channel occupancies, the measured 4-channel average post-reconstruction SNDR is shown in Fig. 3.17. As can be seen, for highly-correlated input signals, SOMP shows much better performance than OMP, achieving the max occupancy of 33% (21 tones per channel).

![](_page_80_Figure_0.jpeg)

Figure 3.15: Measured post-reconstruction SNDR with different total channel occupancies and reconstruction algorithms when all channel signals are independent.

#### 3.5.2.2 Real-World Sparse Signals

Besides discrete-tone signals, we also test the capability of the chip to convert real-world sparse signals. First, we use the 3 Frank lead ECG signals from the PTB diagnostic ECG database. Sampled at 1 kS/s, the 3-lead ECG signals are fed into three input channels of the chip. Since multi-lead ECG signals are highly correlated, we use the SOMP algorithm to reconstruct them. Fig. 3.18 shows the reconstruction results in the time domain, and Fig. 3.19 shows them in the frequency domain. Note that $K_{max}$ for each channel is set to 100 in the SOMP method. Therefore, in the half DFT spectrum, the reconstructed signals $\hat{s}_m$ contain only 50 non-zero entries, which match relatively well with the largest 50 entries of the input signals' spectra. According to equation (2.12), the SRERs

![](_page_81_Figure_0.jpeg)

Figure 3.16: Measured time-domain (left) and frequency-domain (right) results of the input signals $s_m$ (blue) and reconstructed signals $\hat{s}_m$ (red) via the SL0 method in the single-tone case (upper) and 26-tone case (lower).

for each channel are 12.5 dB, 12.8 dB and 9.3 dB.

To investigate the scenario in which all channel signals are independent, we also test 4-channel, 1 s long speech signals from different sources. At a sampling rate of 16 kHz, each channel's speech signal contains 16000 samples. To reduce the computational complexity, the ADC output is divided into multiple 512-length frames, each of which undergoes the CS process individually. All the frames are

![](_page_82_Figure_0.jpeg)

Figure 3.17: Measured post-reconstruction SNDR with OMP and SOMP when all channel signals are highly correlated.

50% overlapped with each other and windowed to smooth the edges. Fig. 3.20 shows the reconstruction results via the SL0 method. Except for weak signals beyond 4 kHz that are buried in the noise floor, $\hat{s}_m$ matches $s_m$ well. According to equation (2.12), the SRERs of the four channels are 14.5 dB, 12.1 dB, 14.3 dB, and 14.1 dB.
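The frame-based processing can be sketched as a standard overlap-add scheme. The window type is not specified in the text, so the periodic Hann window below is an assumption, chosen because its 50%-overlapped copies sum to one. The per-frame CS reconstruction is replaced by an identity stand-in, and the test tone is arbitrary.

```python
import math

FRAME = 512
HOP = FRAME // 2   # 50% overlap between neighbouring frames


def hann(n_points):
    """Periodic Hann window; copies shifted by half a frame sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points)
            for n in range(n_points)]


def split_frames(signal):
    """Cut the signal into 512-sample frames that overlap by 50%."""
    starts = range(0, len(signal) - FRAME + 1, HOP)
    return [(s, signal[s:s + FRAME]) for s in starts]


def overlap_add(frames, length):
    """Window each (already reconstructed) frame and add it back in place."""
    window = hann(FRAME)
    out = [0.0] * length
    for start, frame in frames:
        for i, value in enumerate(frame):
            out[start + i] += window[i] * value
    return out


# 1 s of a 440 Hz tone at 16 kHz as an illustrative stand-in for speech.
signal = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
frames = split_frames(signal)            # each frame would be reconstructed separately
rebuilt = overlap_add(frames, len(signal))
```

With this window the interior of the signal is recovered exactly when each frame is reconstructed perfectly. Only the first and last half frames, which are covered by a single window, are attenuated.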

Since the proposed ADC can simultaneously convert 4 channels, its power per channel is only 1/4 of the total power, leading to an effective FoM per channel of 5.1 fJ/conversion-step. This 4× power saving is enabled by CS. Table 3.1 summarizes the chip performance. Since our work is the first to demonstrate chip measurement results for a CS-based multi-channel ADC, Table 3.1 makes a comparison with prior single-channel CS works. As can be seen, our work achieves

![](_page_83_Figure_0.jpeg)

Figure 3.18: Measured time-domain results of 3 Frank lead ECG signals via the SOMP method. $s_m$ (blue) are the input signals, $\hat{s}_m$ (red) are the reconstructed signals, and $\hat{s}_m - s_m$ (black) are their differences.

at least 20 dB higher post-reconstruction SNDR  $(SNDR_{PR})$  and 14-times better FoM. With the capability of converting multi-channel signals simultaneously, our work can also deal with a much higher bandwidth occupancy than other works.

![](_page_84_Figure_0.jpeg)

Figure 3.19: Measured frequency-domain results of 3 Frank lead ECG signals via the SOMP method. $s_m$ (blue) are the input signals, and $\hat{s}_m$ (red) are the reconstructed signals.

![](_page_85_Figure_0.jpeg)

Figure 3.20: Measured time-domain (left) and frequency-domain (right) results of the 4-channel 1s-long speech signals via the SL0 method.  $s_m$  (blue) are the input signals, and  $\hat{s}_m$  (red) are the reconstructed signals.

| Design | [Gangopadhyay et al. [2014]] | [Trakimas et al. [2013]] | This work |
|---|---|---|---|
| CMOS technology [nm] | 130 | 90 | 130 |
| **ADC performance** | | | |
| Supply [V] | 0.9 | 0.9 | 0.8 |
| Sampling rate [MS/s] | 0.002 | 5.5 | 1 |
| Resolution [bit] | 10 | 10 | 12 |
| SNDR [dB] | 40.6 | 57.6 | 66 |
| ENOB [bit] | 6.5 | 9.3 | 10.7 |
| **Post-reconstruction performance (compression ratio (CR) = 4)** | | | |
| Peak SNDR<sub>PR</sub> [dB] | 40.6 | 43 | 66 |
| Peak ENOB<sub>PR</sub> [bit] | 6.5 | 6.9 | 10.7 |
| Max occupancy | 5% | 4% | 41% |
| Power [$\mu$W] | 1.8 | 175 | 34 |
| Area [mm<sup>2</sup>] | 6 | 0.15 | 0.39 |
| FoM<sub>CS</sub> [fJ/step] | 9900 | 73.3 | 5.1 |

$FoM_{CS} = Power/2^{ENOB_{PR}}/f_s/CR$

Table 3.1: Comparison with state-of-the-art CS works
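The figures of merit quoted for this work can be reproduced from the "This work" column of Table 3.1 with the $FoM_{CS}$ definition given under the table. The sketch below is a plain arithmetic check; the function names are ours, and the rounded ENOB of 10.7 bits from the table is used as the input.

```python
def walden_fom(power_w, enob_bits, fs_hz):
    """Energy per conversion step: Power / (2^ENOB * f_s)."""
    return power_w / (2 ** enob_bits * fs_hz)


def fom_cs(power_w, enob_pr_bits, fs_hz, compression_ratio):
    """FoM_CS = Power / 2^ENOB_PR / f_s / CR."""
    return walden_fom(power_w, enob_pr_bits, fs_hz) / compression_ratio


nyquist_fom = walden_fom(34e-6, 10.7, 1e6)       # about 20.4 fJ/conversion-step
per_channel_fom = fom_cs(34e-6, 10.7, 1e6, 4)    # about 5.1 fJ/conversion-step
```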

# **Chapter 4**

# **Noise Shaping SAR ADC**

## 4.1 Background

For medium-resolution applications, the SAR ADC is very popular due to its high power efficiency in nanometer technologies. However, as the target resolution goes beyond 10 bits, its efficiency quickly diminishes because of the tight requirement on comparator noise. Moreover, the exponentially growing capacitor DAC array not only costs large chip area and power, but is also difficult to drive. For high-resolution applications, the $\Delta\Sigma$ ADC is a more widely used architecture. Taking advantage of oversampling and noise shaping, it can use a low-resolution quantizer to reach high resolution. Nevertheless, it requires OTAs, which are power hungry and do not scale well.

Recently, there have been emerging efforts in the research community to develop hybrid ADC architectures that combine the merits of SAR and $\Delta\Sigma$ ADCs [Fredenburg and Flynn [2012]; Chen et al. [2015]]. The first noise-shaping (NS) SAR ADC was published in [Fredenburg and Flynn [2012]]; its architecture is shown in Fig. 4.1. As can be seen, it still needs an active OTA-based integrator to realize a 1<sup>st</sup>-order noise transfer function (NTF) with a zero at 0.64. It also requires a FIR DAC that introduces extra noise and increases chip area. Later, a fully-passive 1<sup>st</sup>-order NS SAR ADC was published in [Chen et al. [2015]]. As shown in Fig. 4.2, it obviates the need for any OTA, but its noise-shaping performance is very limited, as its NTF zero is located at 0.5 rather than 1. Moreover, its input signal is attenuated by a factor of 2 during normal conversion, leading to a 6-dB penalty in SNR or quadrupled analog power for the same SNR. In addition, it requires 2 times more capacitance, increasing chip area.

![](_page_88_Figure_1.jpeg)

Figure 4.1: Noise shaping SAR ADC architecture proposed in [Fredenburg and Flynn [2012]].

By contrast, this chapter proposes a novel NS SAR architecture that is simple, robust, and low power. Similar to [Chen et al. [2015]], it does not require any OTA. It realizes an NTF zero at 0.75, which is closer to 1 and therefore achieves better noise shaping than [Fredenburg and Flynn [2012]] (z = 0.64) and [Chen et al. [2015]] (z = 0.5). Fig. 4.3 compares the NTF magnitudes with zeros at 0.5, 0.64, and 0.75. As can be seen, z = 0.75 achieves around 3 dB more in-band attenuation than z = 0.64 and 6 dB more in-band attenuation than z = 0.5.
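The in-band attenuation figures can be approximated by evaluating a single-zero NTF at DC. The sketch below assumes a pure FIR form $1 - z_0 z^{-1}$ for all three zero locations, ignoring any poles in the prior designs, so it only illustrates the effect of the zero position.

```python
import cmath
import math


def ntf_magnitude_db(zero, omega):
    """|1 - zero * e^{-j*omega}| in dB for a single-zero FIR NTF."""
    return 20 * math.log10(abs(1 - zero * cmath.exp(-1j * omega)))


dc_gain = {z: ntf_magnitude_db(z, 0.0) for z in (0.5, 0.64, 0.75)}
gain_vs_064 = dc_gain[0.64] - dc_gain[0.75]   # about 3.2 dB
gain_vs_05 = dc_gain[0.5] - dc_gain[0.75]     # about 6.0 dB
```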

![](_page_89_Figure_0.jpeg)

Charge conservation at $V_{res}$ node: $V_{in}[k] + V_{res}[k-1] = D_{out}[k] + 2V_{res}[k]$

Comparator noise: $V_{res}[k] + V_{res}[k-1]/2 = Q[k]$

z-domain transfer function: $D_{out}(z) = V_{in}(z) + \frac{2(1-z^{-1}/2)}{1+z^{-1}/2}Q(z)$

Figure 4.2: Noise shaping SAR ADC architecture proposed in [Chen et al. [2015]].

Furthermore, the proposed architecture does not cause any signal attenuation and requires less capacitance than [Chen et al. [2015]]. More importantly, it can be easily extended to $2^{nd}$-order noise shaping. To the best of our knowledge, this work is the first to propose a fully-passive $2^{nd}$-order NS SAR ADC architecture in the literature. The proposed NS SAR ADC shapes the quantization noise, comparator noise, and DAC noise together, with minimum modification to the original SAR ADC. It allows the use of a low-resolution DAC and relaxes the requirement on comparator noise, making it possible to reach high resolution and high power efficiency simultaneously.

This chapter is organized as follows. First, the proposed NS SAR ADC

![](_page_90_Figure_0.jpeg)

Figure 4.3: NTF magnitude comparisons with zeros at 0.5, 0.64 and 0.75.

architecture for $1^{st}$-order noise shaping is introduced and analyzed. Second, with the design fabricated in a 0.13 $\mu$m CMOS process, the chip measurement results are presented. Third, the proposed $1^{st}$-order NS SAR ADC is extended to $2^{nd}$ order. To validate its effectiveness, a $2^{nd}$-order NS SAR ADC is designed in a 40 nm CMOS process. The chapter concludes with its SPICE simulation results.

## 4.2 Proposed 1<sup>st</sup>-Order NS SAR ADC Architecture

Fig. 4.4 shows the architecture of the proposed $1^{st}$-order NS SAR ADC. Compared to conventional SAR operation, two more clock cycles, $\phi_{ns_0}$ and $\phi_{ns_1}$, are added. Before the $\phi_{ns_0}$ cycle, the SAR ADC performs the normal conversion. Unlike [Chen et al. [2015]], no capacitor is connected to the $V_{res}$ node during normal conversion, and thus the signal attenuation problem is avoided. To realize $1^{st}$-order noise shaping, the key is to integrate the residue voltage $V_{res}$ and feed it back to the comparator input. During the $\phi_{ns_0}$ cycle, a small capacitor, $C_2 = C/3$, is merged with the DAC capacitor, $C_1 = C$, to sample the residue voltage $V_{res}$. At the end of the $\phi_{ns_0}$ cycle, $C_2$ carries $0.75V_{res}$. In the following $\phi_{ns_1}$ cycle, $C_2$ shares its charge with another capacitor, $C_3 = C$, effectively realizing a passive integration. The voltage integrated on $C_3$ is labelled $V_{int}$ and is fed back to the comparator input. The comparator now has two input paths, one connected to $V_{res}$ and the other connected to $V_{int}$. However, passive integration has the limitation that only a fraction of $V_{res}$ is integrated, which degrades the noise-shaping performance. It may seem that OTAs are still required to provide gain to compensate for the attenuation of $V_{res}$. Fortunately, because the comparator's result is a 1-bit sign, only a relative gain between $V_{int}$ and $V_{res}$ is required, which can be realized by simply sizing the comparator input transistors accordingly. As shown in Fig. 4.4, to provide a gain of 4 on the $V_{int}$ path for a proper NTF, we size its input transistors 4 times larger than those of the $V_{res}$ path. The cost is that the total noise from the comparator input pair increases by about 4 times when referred to the $V_{res}$ path. Fortunately, the in-band comparator noise is significantly attenuated by noise shaping. After the $\phi_{ns_1}$ cycle, the charge on $C_2$ is cleared in the next $\phi_s$ cycle so that it is ready to sample the new residue voltage.

In the real implementation, a mode signal is used to pull $V_{int}$ down to ground so that the SAR ADC can be easily reconfigured to the conventional mode for Nyquist-rate applications. Additionally, foreground calibration of DAC mismatch can be conducted in the Nyquist mode.
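The two charge-sharing steps can be followed numerically with a minimal sketch. It assumes ideal switches and capacitors, with $C_1 = C_3 = C$ and $C_2 = C/3$ as in the text; the helper name and the unit residue are illustrative.

```python
def share(c_a, v_a, c_b, v_b):
    """Common voltage after two capacitors are connected (charge conservation)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


C = 1.0
C1, C2, C3 = C, C / 3, C

v_res = 1.0                                # illustrative residue on the DAC
v_c2 = share(C1, v_res, C2, 0.0)           # phi_ns0: cleared C2 merges with the DAC
v_int = share(C2, v_c2, C3, 0.0)           # phi_ns1: C2 shares with an empty C3
v_int_next = share(C2, v_c2, C3, v_int)    # next period with the same residue
```

Each $\phi_{ns_1}$ step keeps 3/4 of the previous $V_{int}$ and adds 3/16 of the residue, which is the attenuated integration that the comparator path gain of 4 compensates.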

![](_page_92_Figure_0.jpeg)

Figure 4.4: Proposed 1<sup>st</sup>-order NS SAR ADC architecture.

To provide a better understanding of the proposed NS SAR architecture, Fig. 4.5 shows the general signal flow diagram assuming $C_1 = C_3 = C$, $C_2 = a/(1-a)C$, and an integration path gain of $g$. As can be seen from the NTF in the $D_{out}$ equation, there is a zero located at $(1-a)$ and a pole located at $(1-a)(1-ga)$. When $g = 1/a$, the pole vanishes and only the zero remains. As mentioned earlier, in this design we choose $a = 1/4$ and $g = 4$, giving an NTF of $(1-0.75z^{-1})$. We can also choose $g > 4$ to get a negative pole, which helps improve the NS performance. However, since the comparator noise (or power) increases with $g$, the overall benefit is limited. With $a = 1/4$, $C_2$ is $C/3$, and consequently only 4/3 times more capacitance is required for the 1<sup>st</sup>-order NS. Note that the NTF is completely set by the component ratios $a$ and $g$ and is thus insensitive to PVT variations. To ensure stability, the pole needs to be within the unit circle. The stability condition is shown in Fig. 4.5. With $a = 1/4$, the condition $|(1-a)(1-ga)| < 1$ gives $-4/3 < g < 28/3$, so $g = 4$, which is determined by the comparator input transistor ratio, is very far from the unstable boundary. Therefore, the proposed NS SAR architecture is highly robust.
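The NTF of $(1-0.75z^{-1})$ can be checked with a behavioral time-domain sketch of the loop. The model is a simplification of ours: the SAR conversion is replaced by an ideal uniform quantizer acting on $V_{in} + gV_{int}$, the passive integrator follows the charge-sharing update for $C_2 = a/(1-a)C$, and all noise sources are omitted. The step size and input sequence are arbitrary.

```python
import random


def simulate_first_order(vin, a=0.25, g=4.0, lsb=1 / 64):
    """Behavioral loop: passive residue integration plus a gain-g feedback path."""
    v_int = 0.0
    d_out, q_err = [], []
    for v in vin:
        x = v + g * v_int                  # comparator-referred input
        d = lsb * round(x / lsb)           # ideal quantizer (modeling assumption)
        q_err.append(d - x)                # quantization error Q[k]
        d_out.append(d)
        v_res = v - d                      # residue left on the DAC
        v_int = (1 - a) * v_int + a * (1 - a) * v_res   # two charge-sharing steps
    return d_out, q_err


def pole(a, g):
    """Pole location (1 - a)(1 - g*a) stated for the general NTF."""
    return (1 - a) * (1 - g * a)


rng = random.Random(0)
vin = [rng.uniform(-0.5, 0.5) for _ in range(200)]
d_out, q = simulate_first_order(vin)

# With g = 1/a the output error should equal Q[k] - 0.75 * Q[k-1].
shaped = [q[k] - 0.75 * (q[k - 1] if k else 0.0) for k in range(len(q))]
max_dev = max(abs((d - v) - s) for d, v, s in zip(d_out, vin, shaped))
```

In this model the deviation stays at floating-point level for $g = 1/a$, while `pole(0.25, 10)` has a magnitude above 1, illustrating a gain beyond the stable range.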

![](_page_93_Figure_1.jpeg)

Figure 4.5: General signal flow diagram of the proposed  $1^{st}$ -order NS SAR ADC assuming  $C_1 = C_3 = C$ ,  $C_2 = a/(1-a)C$ , and the integration path gain of g.

With $g = 1/a$, Fig. 4.6 further investigates the non-ideal effects, including thermal noise and DAC mismatch errors, in the signal flow. $n_1$ is the $kT/C$ sampling noise, which adds directly to the input signal. $n_2$ is the noise voltage on $C_2$ at the end of $\phi_{ns_0}$, while $n_3$ is the noise voltage on $C_3$ at the end of $\phi_{ns_1}$. Fig. 4.6 also shows the noise power of $n_2$ and $n_3$. With $a = 1/4$, $\overline{n_2^2} = 9kT/4C$ and $\overline{n_3^2} = kT/4C$. As shown in the $D_{out}$ equation, $n_1$ and $n_2$ pass through directly without being shaped, but $n_3$ is $1^{st}$-order shaped. The comparator noise $n_4$, the DAC noise $n_5$, and the quantization noise $Q$ are added at the same location and are all $1^{st}$-order shaped.
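The two noise powers are consistent with the standard result for two capacitors joined by a noisy switch, where the noise left on one capacitor is $kT \cdot C_{other}/(C_{self}(C_{self}+C_{other}))$. Using this expression here is our assumption about how the values in Fig. 4.6 arise. The sketch below evaluates it in units of $kT/C$ with exact fractions.

```python
from fractions import Fraction


def shared_noise_factor(c_self, c_other):
    """Noise power on c_self, in units of kT/C, after charge sharing with c_other.
    Capacitances are given in units of C."""
    return c_other / (c_self * (c_self + c_other))


C1, C2, C3 = Fraction(1), Fraction(1, 3), Fraction(1)
n2_power = shared_noise_factor(C2, C1)   # on C2 after merging with the DAC: 9/4
n3_power = shared_noise_factor(C3, C2)   # on C3 after sharing with C2: 1/4
```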

![](_page_94_Figure_0.jpeg)

Figure 4.6: Non-ideal effects in the proposed 1<sup>st</sup>-order NS SAR ADC.

Another interesting merit of the proposed NS SAR ADC is its simplified digital DAC mismatch calibration. For conventional multi-bit  $\Delta\Sigma$  ADCs, in order to completely remove the DAC mismatch error in the digital domain, we need to accurately extract not only the DAC mismatch percentage but also the DAC mismatch error transfer function (ETF), as the ETF may not be exactly 1 due to PVT variations. As a result, special techniques such as inserting a binary pseudo-random test signal [Kauffman et al. [2014]] are required to measure the ETF. By contrast, the ETF in the NS SAR ADC is always 1 for any NTF under any PVT variation. The key reason is that the quantizer and the feedback DAC use the same capacitor array in a NS SAR ADC. It is different from conventional multi-bit  $\Delta\Sigma$  ADCs whose DAC and quantizer are unrelated. As shown in Fig. 4.6,  $\varepsilon_1$  represents the quantizer error due to capacitor mismatch, and  $\varepsilon_2$  represents the feedback mismatch error. Since they are from the same origin in the NS SAR, it is easy to derive that  $\varepsilon_2(z)=-\varepsilon_1(z)$ . As a result, the ETF is 1 regardless of the values of a and g (see the

![](_page_95_Figure_0.jpeg)

Figure 4.7: Chip die photo and layout.

eqn. in Fig. 4.6). Even though capacitor mismatches exist, the result is equivalent to an NS SAR that uses a non-binary DAC array. As long as the capacitor mismatches are estimated, we can fully remove them in the digital domain. In this design, we first reconfigure the NS ADC in the conventional Nyquist SAR mode and apply classic foreground calibration techniques [Lee et al. [1984]] to estimate the DAC mismatch errors.

## 4.3 Chip Measurement Results for the Proposed 1<sup>st</sup>-Order NS SAR ADC

As a proof of concept, a prototype $1^{st}$-order NS SAR ADC is fabricated in a 0.13 $\mu$m CMOS process. Fig. 4.7 shows its die photo and layout. The core area is 0.13 mm<sup>2</sup>. The DAC array is 10-bit with a total capacitance of 2.1 pF×2 and a

![](_page_96_Figure_0.jpeg)

Figure 4.8: Power breakdown.

![](_page_96_Figure_2.jpeg)

Figure 4.9: Measured output spectra.

unit capacitance of 2 fF. The sampling frequency is 2 MS/s. At a 1.2 V supply, the chip consumes 60 $\mu$W, 63% of which comes from the digital portion. Fig. 4.8 shows the detailed power breakdown. Fig. 4.9 shows the measured output spectrum with a 95.37 kHz, -2 dBFS sinusoidal input. At an OSR of 8, the SNDR and SFDR are 74 dB and 95 dB, respectively. Fig. 4.10 shows the measured SNR/SNDR with different input amplitudes. Fig. 4.11 shows the measured SNDR and Schreier FoM (FoM<sub>S</sub>) trends with different OSRs. As shown in Fig. 4.11(a), when the OSR is doubled, the SNDR increases by 6 dB, which matches the NTF of $(1 - 0.75z^{-1})$. Since doubling the OSR also halves the bandwidth, equation (1.2) gives a net $FoM_S$ increase of 3 dB per OSR doubling. As shown in Fig. 4.11(b), when the OSR is 8, the chip achieves a FoM<sub>S</sub> of 167 dB.

Table 4.1 summarizes the chip performance and compares it with previous NS SAR ADC works. As can be seen, this work reaches the highest ENOB and the best Schreier FoM even in an older process. Since the proposed NS SAR architecture is nearly as simple as a conventional SAR ADC, its power will be greatly reduced with CMOS scaling.

![](_page_98_Figure_0.jpeg)

Figure 4.10: Measured SNR/SNDR with different input amplitudes.

![](_page_98_Figure_2.jpeg)

Figure 4.11: With different OSRs: (a) Measured SNDR and (b) Schreier FoM.

| Design | [Fredenburg and Flynn [2012]] | [Chen et al. [2015]] | This work |
|---|---|---|---|
| Technology (nm) | 65 | 65 | 130 |
| Supply (V) | 1.2 | 0.8 | 1.2 |
| Resolution (bit) | 8 | 8 | 10 |
| Sampling rate (MS/s) | 90 | 50 | 2 |
| OSR | 4 | 4 | 8 |
| Bandwidth (MHz) | 11 | 6.25 | 0.125 |
| Power ($\mu$W) | 806 | 120.7 | 61 |
| SNDR (dB) | 62 | 58 | 74 |
| ENOB (bit) | 10 | 9.35 | 12 |
| $FoM_{S}$ (dB) | 163 | 165 | 167 |

$FoM_S = SNDR + 10\log_{10}(BW/Power)$

Table 4.1: Comparison with prior NS SAR ADC works
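The Schreier FoM entries of Table 4.1 follow from the SNDR, bandwidth and power rows using the $FoM_S$ definition given with the table. The sketch below reproduces them; the ENOB helper uses the standard $(SNDR - 1.76)/6.02$ relation, and the function names are ours.

```python
import math


def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """FoM_S = SNDR + 10*log10(BW / Power)."""
    return sndr_db + 10 * math.log10(bandwidth_hz / power_w)


def enob_bits(sndr_db):
    """Effective number of bits from SNDR."""
    return (sndr_db - 1.76) / 6.02


# (SNDR in dB, bandwidth in Hz, power in W) taken from Table 4.1.
designs = {
    "Fredenburg and Flynn [2012]": (62, 11e6, 806e-6),
    "Chen et al. [2015]": (58, 6.25e6, 120.7e-6),
    "This work": (74, 0.125e6, 61e-6),
}
fom = {name: schreier_fom_db(*values) for name, values in designs.items()}
```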

## 4.4 Proposed 2<sup>nd</sup>-Order NS SAR ADC Architecture

The proposed 1<sup>st</sup>-order NS SAR ADC architecture can also be extended to 2<sup>nd</sup> order, as shown in Fig. 4.12. As can be seen, compared to the 1<sup>st</sup>-order NS SAR ADC, the 2<sup>nd</sup>-order NS SAR ADC requires one more integrating capacitor, $C_4 = C$, one more 16× input path to the comparator, and one more noise-shaping cycle, $\phi_{ns_2}$. After the small capacitor $C_2$ dumps $V_{res}$ onto $C_3$ during $\phi_{ns_1}$, $C_2$ further dumps the new voltage $V_{int_1}$ onto $C_4$. The integrated voltage on $C_4$ is labelled $V_{int_2}$ and is connected to the 16× comparator input path. Unlike the 1<sup>st</sup>-order architecture in Fig. 4.4, we do not directly add the noise-shaping cycles $\phi_{ns_1}$ and $\phi_{ns_2}$ after $\phi_{ns_0}$. Instead, $\phi_{ns_1}$ and $\phi_{ns_2}$ are moved to the front of the whole period so that the SAR ADC is not slowed down. As can be seen, $\phi_{ns_1}$ happens at the same time as the sampling cycle $\phi_s$ and $\phi_{ns_2}$ happens in the next

cycle, which is also used for the initial DAC settling in a conventional SAR ADC. Referring to Fig. 3.8, $\phi_{ns_0}$ can also serve as the *Output* cycle that stores the digital outputs. Therefore, the 2<sup>nd</sup>-order NS SAR ADC can work as fast as a conventional SAR ADC.

![](_page_100_Figure_1.jpeg)

Figure 4.12: Proposed 2<sup>nd</sup>-order NS SAR ADC architecture.

Fig. 4.13 shows the general signal flow diagram assuming $C_1 = C_3 = C_4 = C$, $C_2 = a/(1-a)C$, a 1<sup>st</sup> integration path gain of $g_1$, and a 2<sup>nd</sup> integration path gain of $g_2$. Although the $D_{out}$ equation looks complicated, with $g_1 = 1/a$ and $g_2 = 1/a^2$ it simplifies to contain only two zeros at the same location $(1 - a)$. With $a = 1/4$, we get $g_1 = 4$ and $g_2 = 16$, giving a 2<sup>nd</sup>-order NTF of $(1 - 0.75z^{-1})^2$.
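The 2<sup>nd</sup>-order NTF can be checked with a behavioral sketch in the same spirit as the 1<sup>st</sup>-order loop. The model is our simplification: an ideal uniform quantizer stands in for the SAR conversion, $C_2$ shares with $C_3$ and then with $C_4$ under ideal charge conservation, and noise is omitted. The step size and input sequence are arbitrary.

```python
import random


def simulate_second_order(vin, a=0.25, lsb=1 / 64):
    """Behavioral loop with two passive integrators and gains 1/a and 1/a^2."""
    g1, g2 = 1 / a, 1 / a ** 2
    v_int1 = v_int2 = 0.0
    d_out, q_err = [], []
    for v in vin:
        x = v + g1 * v_int1 + g2 * v_int2   # comparator-referred input
        d = lsb * round(x / lsb)            # ideal quantizer (modeling assumption)
        q_err.append(d - x)                 # quantization error Q[k]
        d_out.append(d)
        v_res = v - d                       # residue left on the DAC
        v_int1 = (1 - a) * v_int1 + a * (1 - a) * v_res   # C2 shares with C3
        v_int2 = (1 - a) * v_int2 + a * v_int1            # C2 then shares with C4
    return d_out, q_err


rng = random.Random(0)
vin = [rng.uniform(-0.5, 0.5) for _ in range(200)]
d_out, q = simulate_second_order(vin)


def past(seq, k, delay):
    """Value `delay` samples before index k, or zero before the sequence starts."""
    return seq[k - delay] if k >= delay else 0.0


# (1 - 0.75 z^-1)^2 = 1 - 1.5 z^-1 + 0.5625 z^-2
shaped = [q[k] - 1.5 * past(q, k, 1) + 0.5625 * past(q, k, 2)
          for k in range(len(q))]
max_dev = max(abs((d - v) - s) for d, v, s in zip(d_out, vin, shaped))
```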

With  $g_1 = 4$  and  $g_2 = 16$ , Fig. 4.14 investigates the non-ideal effects

![](_page_101_Figure_0.jpeg)

Figure 4.13: General signal flow diagram of the proposed $2^{nd}$-order NS SAR ADC assuming $C_1 = C_3 = C_4 = C$, $C_2 = a/(1-a)C$, the $1^{st}$ integration path gain of $g_1$ and the $2^{nd}$ integration path gain of $g_2$.

including thermal noise and DAC mismatch errors in the signal flow. $n_1$ is the $kT/C$ sampling noise, which adds directly to the input signal. $n_2$ is the noise voltage on $C_2$ at the end of $\phi_{ns_0}$, while $n_3$ is the noise voltage on $C_3$ at the end of $\phi_{ns_1}$. $n_4$ is the noise voltage on $C_2$ at the end of $\phi_{ns_1}$, while $n_5$ is the noise voltage on $C_4$ at the end of $\phi_{ns_2}$. Fig. 4.14 also shows the noise power of $n_2$, $n_3$, $n_4$, and $n_5$. As shown in the $D_{out}$ equation, $n_1$ passes through directly without being shaped. One part of $n_2$ is not shaped, while the other part of $n_2$ is $1^{st}$-order shaped together with $n_4$. $n_3$ and $n_5$ are $2^{nd}$-order shaped. The comparator noise $n_6$, the DAC noise $n_7$, and the quantization noise $Q$ are added at the same location and are all $2^{nd}$-order shaped.

![](_page_102_Figure_0.jpeg)

Figure 4.14: Non-ideal effects in the proposed  $2^{nd}$ -order NS SAR ADC.

## 4.5 SPICE Simulation Results for the Proposed 2<sup>nd</sup>-Order NS SAR ADC

To validate its effectiveness, a prototype $2^{nd}$-order NS SAR ADC is designed in a 40 nm CMOS process. The DAC array is 9-bit. To reduce the in-band thermal noise and achieve a higher ENOB, we increase the unit capacitance by around 4 times compared to the $1^{st}$-order NS SAR ADC design, giving a total capacitance of 4.1 pF×2. The sampling frequency is 10 MS/s. At a 1.1 V supply, the design consumes 95 $\mu$W. Note that the digital power usually increases by 3-4 times after chip fabrication due to routing parasitic capacitances. For a fair comparison, we increase the digital power by 4 times and show the detailed power breakdown in Fig. 4.15.

Fig. 4.16 shows the simulated 256-point DFT output spectrum with a 117 kHz $(3/256 \times 10 \text{ MHz})$ full-scale sinusoidal input. At an OSR of 16, the SNDR and SFDR are 87.7 dB and 90.4 dB, respectively. Fig. 4.17 shows the simulated SNDR, Schreier FoM (FoM<sub>S</sub>), and Walden FoM (FoM<sub>W</sub>) trends with different OSRs. As shown in Fig. 4.17(a), when the OSR is doubled, the SNDR increases by 10 dB, which matches the NTF of $(1 - 0.75z^{-1})^2$. Fig. 4.17(b) shows that the design achieves a FoM<sub>S</sub> of 181 dB and a FoM<sub>W</sub> of 12.5 fJ/conversion-step.

Table 4.2 summarizes the design performance and compares it with the $1^{st}$-order NS SAR ADC design and state-of-the-art $\Delta\Sigma$ ADC work. As can be seen, with $2^{nd}$-order noise shaping, a 9-bit SAR ADC is able to achieve 14-bit ENOB at an OSR of 16. Compared to the $1^{st}$-order NS SAR ADC design, the $2^{nd}$-order design improves the FoM<sub>S</sub> by 14 dB and reduces the FoM<sub>W</sub> by 4.8 times. According to [de la Rosa et al. [2015]], [Sukumaran and Pavan [2014]] is one of the best $\Delta\Sigma$ ADC works in the literature, with the highest $FoM_S$ of 182.3 dB. However, its ADC power is dominated by analog power, which is not amenable to technology scaling. Based on (1.2), with the supply voltage doubled and the current unchanged, the analog power doubles, reducing the $FoM_S$ by 3 dB. At the same time, the signal power increases by 4 times, increasing the SNDR by 6 dB. Therefore, a high supply voltage in an old process actually helps improve the $FoM_S$. Nevertheless, the $FoM_W$ will not be improved, because in the $FoM_W$ the doubled power counts as a 6 dB (one-bit) penalty rather than 3 dB, which cancels the SNDR gain. By contrast, our work achieves both a high $FoM_S$ and a low $FoM_W$, showing that the proposed 2<sup>nd</sup>-order NS SAR ADC is able to reach high resolution and high power efficiency simultaneously. Besides, since the proposed design is almost as simple and fast as a conventional SAR ADC, it also achieves a much wider bandwidth and occupies a much smaller area than [Sukumaran and Pavan [2014]].

![](_page_104_Figure_0.jpeg)

Figure 4.15: Power breakdown.
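The supply-scaling argument in the comparison with [Sukumaran and Pavan [2014]] can be written out using the FoM definitions given with Table 4.2. This is a sketch under stated assumptions only: the supply goes from $V_{DD}$ to $2V_{DD}$, the current and bandwidth stay fixed, the noise floor is unchanged, and 6 dB of SNDR is counted as one bit of ENOB. The primed symbols denote values after doubling the supply and are introduced here purely for illustration.

$$
\begin{aligned}
P' &= 2P, \qquad \mathrm{SNDR}' \approx \mathrm{SNDR} + 6\ \mathrm{dB}, \qquad \mathrm{ENOB}' \approx \mathrm{ENOB} + 1, \\
\mathrm{FoM}_S' &= \mathrm{SNDR}' + 10\log_{10}\frac{\mathrm{BW}}{P'} \approx \mathrm{FoM}_S + 6\ \mathrm{dB} - 3\ \mathrm{dB} = \mathrm{FoM}_S + 3\ \mathrm{dB}, \\
\mathrm{FoM}_W' &= \frac{P'}{2^{\mathrm{ENOB}'} \cdot 2 \cdot \mathrm{BW}} \approx \frac{2P}{2 \cdot 2^{\mathrm{ENOB}} \cdot 2 \cdot \mathrm{BW}} = \mathrm{FoM}_W.
\end{aligned}
$$

Under these assumptions the Schreier FoM gains about 3 dB from the higher supply, while the Walden FoM is unchanged.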

![](_page_104_Figure_3.jpeg)

Figure 4.16: Simulated output spectra.

![](_page_105_Figure_0.jpeg)

Figure 4.17: With different OSRs: (a) Simulated SNDR and (b) Schreier FoM and Walden FoM.

| Design               | 1<sup>st</sup>-order NS SAR ADC | 2<sup>nd</sup>-order NS SAR ADC | [Sukumaran and Pavan [2014]] |
|----------------------|---------------------------------|---------------------------------|------------------------------|
| Technology (nm)      | 130                             | 40                              | 180                          |
| Supply (V)           | 1.2                             | 1.1                             | 1.8                          |
| Resolution (bit)     | 10                              | 9                               | 1                            |
| Sampling rate (MS/s) | 2                               | 10                              | 6.144                        |
| OSR                  | 8                               | 16                              | 128                          |
| Bandwidth (kHz)      | 125                             | 312.5                           | 24                           |
| Power ($\mu$W)       | 61                              | 155                             | 280                          |
| SNDR (dB)            | 74                              | 87.7                            | 98.2                         |
| ENOB (bit)           | 12                              | 14.3                            | 16                           |
| FoM<sub>S</sub> (dB) | 167                             | 181                             | 182.3                        |
| FoM<sub>W</sub> (fJ/step) | 59.6                       | 12.5                            | 88                           |

$\mathrm{FoM}_S = \mathrm{SNDR} + 10\log_{10}(\mathrm{BW}/\mathrm{Power})$, $\mathrm{FoM}_W = \mathrm{Power}/(2^{\mathrm{ENOB}} \cdot 2 \cdot \mathrm{BW})$

Table 4.2: Performance summary and comparison
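As a quick cross-check of Table 4.2, the sketch below recomputes ENOB, FoM<sub>S</sub>, and FoM<sub>W</sub> for the two NS SAR ADC designs from the tabulated SNDR, bandwidth, and power, using the FoM definitions given with the table. The ENOB relation (SNDR − 1.76)/6.02 is the standard conversion and is an assumption here, since the table does not state it; all function and variable names are illustrative.

```python
from math import log10


def enob_bits(sndr_db: float) -> float:
    """Effective number of bits from SNDR (assumed standard conversion)."""
    return (sndr_db - 1.76) / 6.02


def fom_schreier_db(sndr_db: float, bw_hz: float, power_w: float) -> float:
    """Schreier FoM: SNDR + 10*log10(BW / Power), in dB."""
    return sndr_db + 10 * log10(bw_hz / power_w)


def fom_walden_fj(sndr_db: float, bw_hz: float, power_w: float) -> float:
    """Walden FoM: Power / (2^ENOB * 2 * BW), in fJ per conversion step."""
    return power_w / (2 ** enob_bits(sndr_db) * 2 * bw_hz) * 1e15


# SNDR, bandwidth, and power as listed in Table 4.2.
designs = {
    "1st-order NS SAR": dict(sndr_db=74.0, bw_hz=125e3, power_w=61e-6),
    "2nd-order NS SAR": dict(sndr_db=87.7, bw_hz=312.5e3, power_w=155e-6),
}

results = {
    name: (enob_bits(d["sndr_db"]), fom_schreier_db(**d), fom_walden_fj(**d))
    for name, d in designs.items()
}

for name, (enob, fom_s, fom_w) in results.items():
    print(f"{name}: ENOB = {enob:.1f} bit, "
          f"FoM_S = {fom_s:.1f} dB, FoM_W = {fom_w:.1f} fJ/step")
```

The recomputed values should agree with the tabulated ENOB, FoM<sub>S</sub>, and FoM<sub>W</sub> for both designs to within rounding.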

# Chapter 5

# Conclusion

This thesis mainly presents two techniques to improve the performance of a SAR ADC. One is compressive sensing (CS), which can be used to reduce the sampling rate and power of a SAR ADC. The other is noise shaping (NS), which can be used to increase the resolution of a SAR ADC.

To validate the effectiveness of CS, two chips are fabricated. Chip 1 is a single-channel CS SAR ADC architecture for low-power wireless sensors. Compared to previous CS works, the proposed CS SAR ADC does not require dedicated CS encoders and directly embeds random demodulation into a conventional SAR ADC. The circuit needs only minor modifications to a SAR ADC architecture and can be easily reconfigured between the Nyquist mode and the CS mode. In the CS mode, the CS SAR ADC quantizes the average of every four samples, thus reducing the ADC power by four times. To the best of our knowledge, this work provides the first fully-passive hardware realization of a random-demodulation-based CS framework. Chip 2 is a multi-channel CS SAR ADC capable of converting four channel signals simultaneously at the Nyquist rate of one channel. The hardware is almost as simple as a single SAR ADC. When all channel signals are highly correlated, the signal reconstruction complexity can be greatly reduced. To the best of our knowledge, this work is the first to demonstrate chip measurement results for CS-based multi-channel ADCs. At 0.8 V and 1 MS/s, both chips achieve an effective Walden FoM of around 5 fJ/conversion-step in a 0.13 $\mu$m CMOS process. The ADC power is dominated by the digital power, which can be further reduced with technology scaling.

To validate the effectiveness of NS, a 1<sup>st</sup>-order NS SAR ADC is fabricated in a 0.13 $\mu$m CMOS process. The proposed NS SAR ADC architecture is simple, robust, and low power. Through a passive integrator and a 2-path comparator, quantization noise, comparator noise, and DAC noise are shaped with an NTF of $(1-0.75z^{-1})$. Unlike conventional multi-bit $\Delta\Sigma$ ADCs, both the NTF and the ETF of DAC mismatches are immune to PVT variations. At 1.2 V and 2 MS/s, the chip consumes 61 $\mu$W. The SNDR increases by 6 dB with each doubling of the OSR. At an OSR of 8, the SNDR is 74 dB and the Schreier FoM is 167 dB. The 1<sup>st</sup>-order NS SAR ADC can also be easily extended to $2^{nd}$-order noise shaping by adding one more passive integrator and one more path to the comparator. A 2<sup>nd</sup>-order NS SAR ADC is designed and simulated in a 40 nm CMOS process. To the best of our knowledge, this work is the first fully-passive 2<sup>nd</sup>-order NS SAR ADC in the literature. Assuming a fourfold increase of the digital power after fabrication, the design can achieve a Schreier FoM of 181 dB and a Walden FoM of 12.5 fJ/conversion-step within a bandwidth of 312.5 kHz, realizing high resolution and high power efficiency simultaneously. The design is to be fabricated in the near future.

# **Appendix 1**

## **List of Publications**

#### **1.1 Patents**

1. Nan Sun and **Wenjuan Guo**. Fully-passive reconfigurable noise-shaping SAR ADCs. *Filed for U.S. provisional patent protection on 11/04/2015*.

#### **1.2 Published Papers**

1. **Wenjuan Guo**, Youngchun Kim, Ahmed Tewfik, and Nan Sun. Ultra-low power multi-channel data conversion with a single SAR ADC for mobile sensing applications. In *IEEE Custom Integr. Circuits Conf.*, pages 1–4, Sept. 2015.
2. **Wenjuan Guo**, Tsedeniya Abraham, Steven Chiang, Chintan Trehan, Masahiro Yoshioka, and Nan Sun. An area- and power-efficient $I_{ref}$ compensation technique for voltage-mode R-2R DACs. *IEEE Trans. Circuits Syst. II, Exp. Briefs*, 62(7):656–660, Jul. 2015.
3. **Wenjuan Guo**, Youngchun Kim, Arindam Sanyal, Ahmed Tewfik, and Nan Sun. A single SAR ADC converting multi-channel sparse signals. In *IEEE Int. Symp. Circuits Syst.*, pages 2235–2238, May 2013.
4. Youngchun Kim, **Wenjuan Guo**, B. Vikrham Gowreesunker, Nan Sun, and Ahmed H. Tewfik. Multi-channel sparse data conversion with a single analog-to-digital converter. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):470–481, Sept. 2012.

#### **1.3 Submitted Papers**

1. **Wenjuan Guo** and Nan Sun. A 9.8b-ENOB 5.5fJ/conversion-step fully-passive compressive sensing SAR ADC for low-power wireless sensors. *Submitted to 2016 IEEE Symp. VLSI Circuits Dig.*
2. **Wenjuan Guo**, Gang Yuan, and Nan Sun. A 12b-ENOB 61μW noise-shaping SAR ADC with a passive integrator. *Submitted to 2016 IEEE Symp. VLSI Circuits Dig.*

#### **1.4 Papers in Preparation**

1. **Wenjuan Guo** and Nan Sun. A 9.8b-ENOB 5.5fJ/conversion-step fully-passive compressive sensing SAR ADC for low-power wireless sensors. (*Journal*)
2. **Wenjuan Guo**, Youngchun Kim, Ahmed Tewfik, and Nan Sun. Ultra-low power multi-channel data conversion with a single SAR ADC for wireless sensing networks. (*Journal*)

#### **1.5 Miscellaneous**

1. **Wenjuan Guo**, Youngchun Kim, Ahmed Tewfik, and Nan Sun, 2015 ISSCC Student Research Preview session.

## **Bibliography**

- J. Fredenburg and M. Flynn. A 90MS/s 11MHz bandwidth 62dB SNDR noise-shaping SAR ADC. In *IEEE ISSCC Dig. Tech. Papers*, pages 468–470, Feb 2012.
- Z. Chen, M. Miyahara, and A. Matsuzawa. A 9.35-ENOB, 14.8 fJ/conv.-step fully-passive noise-shaping SAR ADC. In *IEEE Symp. VLSI Circuits Dig.*, pages C64–C65, June 2015.
- J.M. de la Rosa, R. Schreier, K.-P. Pun, and S. Pavan. Next-generation delta-sigma converters: Trends and perspectives. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 5(4):484–499, Dec 2015.
- N. Verma and A.P. Chandrakasan. An ultra low energy 12-bit rate-resolution scalable SAR ADC for wireless sensor nodes. *IEEE J. Solid-State Circuits*, 42(6):1196–1205, June 2007.
- P.J.A. Harpe, C. Zhou, Yu Bi, N.P. van der Meijs, Xiaoyan Wang, K. Philips, G. Dolmans, and H. de Groot. A 26 $\mu$W 8 bit 10 MS/s asynchronous SAR ADC for low energy radios. *IEEE J. Solid-State Circuits*, 46(7):1585–1595, July 2011.
- M.D. Plumbley, T. Blumensath, L. Daudet, R. Gribonval, and M.E. Davies. Sparse representations in audio and music: From coding to source separation. *Proc. IEEE*, 98(6):995–1005, Jun. 2010.
- M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. *IEEE Trans. Image Process.*, 15(12):3736–3745, Dec. 2006.
- E.G. Allstot, A.Y. Chen, A.M.R. Dixon, D. Gangopadhyay, and D.J. Allstot. Compressive sampling of ECG bio-signals: Quantization noise and sparsity considerations. In *IEEE Biomed. Circuits Syst. Conf.*, pages 41–44, Nov. 2010.

- S. Kirolos, J. Laska, M. Wakin, M. Duarte, D. Baron, T. Ragheb, Y. Massoud, and R. Baraniuk. Analog-to-information conversion via random demodulation. In *IEEE Dallas/CAS Workshop on Design Appl. Integr. Softw.*, pages 71–74, Oct. 2006.
- J.N. Laska, S. Kirolos, M.F. Duarte, T.S. Ragheb, R.G. Baraniuk, and Y. Massoud. Theory and implementation of an analog-to-information converter using random demodulation. In *IEEE Int. Symp. Circuits Syst.*, pages 1959–1962, May 2007.
- T. Ragheb, J.N. Laska, H. Nejati, S. Kirolos, R.G. Baraniuk, and Y. Massoud. A prototype hardware for random demodulation based compressive analog-to-digital conversion. In *IEEE Midwest Symp. Circuits Syst.*, pages 37–40, Aug. 2008.
- J.A. Tropp, J.N. Laska, M.F. Duarte, J.K. Romberg, and R.G. Baraniuk. Beyond Nyquist: Efficient sampling of sparse bandlimited signals. *IEEE Trans. Inf. Theory*, 56(1):520–544, Jan. 2010.
- X. Chen, Z. Yu, S. Hoyos, B.M. Sadler, and J. Silva-Martinez. A sub-Nyquist rate sampling receiver exploiting compressive sensing. *IEEE Trans. Circuits Syst. I, Reg. Papers*, 58(3):507–520, Mar. 2011.
- X. Chen, E.A. Sobhy, Z. Yu, S. Hoyos, J. Silva-Martinez, S. Palermo, and B.M. Sadler. A sub-Nyquist rate compressive sensing data acquisition front-end. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):542–551, Sept. 2012a.
- J. Yoo, S. Becker, M. Monge, M. Loh, E. Candes, and A. Emami-Neyestanak. Design and implementation of a fully integrated compressed-sensing signal acquisition system. In *IEEE Int. Conf. Acoust., Speech, Signal Process.*, pages 5325–5328, Mar. 2012a.
- J. Yoo, C. Turnes, E.B. Nakamura, C.K. Le, S. Becker, E.A. Sovero, M.B. Wakin, M.C. Grant, J. Romberg, A. Emami-Neyestanak, and E. Candes. A compressed sensing parameter extraction platform for radar pulse signal acquisition. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):626–638, Sept. 2012b.
- D. Gangopadhyay, E.G. Allstot, A.M.R. Dixon, K. Natarajan, S. Gupta, and D.J. Allstot. Compressed sensing analog front-end for bio-sensor applications. *IEEE J. Solid-State Circuits*, 49(2):426–438, Feb. 2014.

- M. Mishali and Y.C. Eldar. From theory to practice: Sub-Nyquist sampling of sparse wideband analog signals. *IEEE J. Sel. Topics Signal Process.*, 4(2):375–391, Apr. 2010.
- M. Mishali, Y.C. Eldar, O. Dounaevsky, and E. Shoshan. Xampling: Analog to digital at sub-Nyquist rates. *IET Circuits, Devices Syst.*, 5(1):8–20, Jan. 2011.
- M. Kim, G. Ahn, P.K. Hanumolu, Sang-Hyeon Lee, Sang-Ho Kim, Seung-Bin You, Jae-Whui Kim, G.C. Temes, and Un-Ku Moon. A 0.9 V 92 dB double-sampled switched-RC delta-sigma audio ADC. *IEEE J. Solid-State Circuits*, 43(5):1195–1206, May 2008.
- K. Pun, S. Chatterjee, and P.R. Kinget. A 0.5-V 74-dB SNDR 25-kHz continuous-time delta-sigma modulator with a return-to-open DAC. *IEEE J. Solid-State Circuits*, 42(3):496–507, March 2007.
- S. Rao, K. Reddy, B. Young, and P.K. Hanumolu. A deterministic digital background calibration technique for VCO-based ADCs. *IEEE J. Solid-State Circuits*, 49(4):950–960, April 2014.
- M. Park and M.H. Perrott. A 78 dB SNDR 87 mW 20 MHz bandwidth continuous-time $\Delta\Sigma$ ADC with VCO-based integrator and quantizer implemented in 0.13 $\mu$m CMOS. *IEEE J. Solid-State Circuits*, 44(12):3344–3358, Dec 2009.
- J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami. Internet of Things (IoT): A vision, architectural elements, and future directions. *Future Generation Comput. Syst.*, 29(7):1645–1660, 2013.
- E.J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. *IEEE Trans. Inf. Theory*, 52(2):489–509, Feb. 2006.
- D.L. Donoho. Compressed sensing. *IEEE Trans. Inf. Theory*, 52(4):1289–1306, Apr. 2006.
- V. Karkare, S. Gibson, and D. Markovic. A 130-μW, 64-channel neural spike-sorting DSP chip. *IEEE J. Solid-State Circuits*, 46(5):1214–1222, May 2011.

- N. Verma, A. Shoeb, J. Bohorquez, J. Dawson, J. Guttag, and A.P. Chandrakasan. A micro-power EEG acquisition SoC with integrated feature extraction processor for a chronic seizure detection system. *IEEE J. Solid-State Circuits*, 45(4):804–816, Apr. 2010.
- M. Verhelst and A. Bahai. Where analog meets digital: Analog-to-information conversion and beyond. *IEEE Solid State Circuits Mag.*, 7(3):67–80, Sept. 2015.
- M. Wakin, S. Becker, E. Nakamura, M. Grant, E. Sovero, D. Ching, Juhwan Yoo, J. Romberg, A. Emami-Neyestanak, and E. Candes. A nonuniform sampler for wideband spectrally-sparse environments. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):516–529, Sept. 2012.
- M. Trakimas, R. D'Angelo, S. Aeron, T. Hancock, and S. Sonkusale. A compressed sensing analog-to-information converter with edge-triggered SAR ADC core. *IEEE Trans. Circuits Syst. I, Reg. Papers*, 60(5):1135–1148, May 2013.
- W. Guo, Y. Kim, A. Sanyal, A. Tewfik, and N. Sun. A single SAR ADC converting multi-channel sparse signals. In *IEEE Int. Symp. Circuits Syst.*, pages 2235–2238, May 2013.
- W. Guo, Y. Kim, A. Tewfik, and N. Sun. Ultra-low power multi-channel data conversion with a single SAR ADC for mobile sensing applications. In *IEEE Custom Integr. Circuits Conf.*, 2015.
- F. Chen, A.P. Chandrakasan, and V.M. Stojanovic. Design and analysis of a hardware-efficient compressed sensing architecture for data compression in wireless sensors. *IEEE J. Solid-State Circuits*, 47(3):744–756, Mar. 2012b.
- E.J. Candes and M.B. Wakin. An introduction to compressive sampling. *IEEE Signal Process. Mag.*, 25(2):21–30, Mar. 2008.
- R. Tibshirani. Regression shrinkage and selection via the lasso. *J. Roy. Stat. Soc. Ser. B*, 58:267–288, 1994.
- Y. Kim, W. Guo, B.V. Gowreesunker, N. Sun, and A.H. Tewfik. Multi-channel sparse data conversion with a single analog-to-digital converter. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):470–481, Sept. 2012.

- C. Luo, M.A. Borkar, A.J. Redfern, and J.H. McClellan. Compressive sensing for sparse touch detection on capacitive touch screens. *IEEE J. Emerg. Sel. Topics Circuits Syst.*, 2(3):639–648, Sept 2012.
- J. Xu, E. Rohani, M. Rahman, and Gwan Choi. Signal reconstruction processor design for compressive sensing. In *IEEE Int. Symp. Circuits Syst.*, pages 2539–2542, June 2014.
- M.A. Lexa, M.E. Davies, and J.S. Thompson. Reconciling compressive sampling systems for spectrally sparse continuous-time signals. *IEEE Trans. Signal Process.*, 60(1):155–171, Jan. 2012.
- S.-W.M. Chen and R.W. Brodersen. A 6-bit 600-MS/s 5.3-mW asynchronous ADC in 0.13-μm CMOS. *IEEE J. Solid-State Circuits*, 41(12):2669–2680, Dec. 2006.
- J. Yang, T.L. Naing, and R.W. Brodersen. A 1 GS/s 6 bit 6.7 mW successive approximation ADC using asynchronous processing. *IEEE J. Solid-State Circuits*, 45(8):1469–1478, Aug 2010.
- B. Murmann. On the use of redundancy in successive approximation A/D converters. In *Int. Conf. Sampling Theory Appl.*, Jul. 2013.
- Patrick J. Quinn and Arthur H.M. van Roermund. *Switched-Capacitor Techniques for High-Accuracy Filter and ADC Design*. Springer Publishing Company, Incorporated, 1st edition, 2007.
- A. Sanyal and N. Sun. An energy-efficient low frequency-dependence switching technique for SAR ADCs. *IEEE Trans. Circuits Syst. II, Exp. Briefs*, 61(5):294–298, May 2014.
- L. Chen, A. Sanyal, J. Ma, and N. Sun. A 24-µW 11-bit 1-MS/s SAR ADC with a bidirectional single-side switching technique. In *European Solid-State Circuits Conf.*, pages 219–222, Sept. 2014.
- M. Babaie-Zadeh. Smoothed L0 (SL0) algorithm for sparse decomposition, 2010. URL http://ee.sharif.edu/~SLzero/.

- J.A. Tropp and A.C. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. *IEEE Trans. Inf. Theory*, 53(12):4655–4666, Dec 2007.
- B. Murmann. ADC performance survey 1997-2015, 2015. URL http://web.stanford.edu/~murmann/adcsurvey.html.
- M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. Klumperink, and B. Nauta. A 1.9μW 4.4fJ/Conversion-step 10b 1MS/s charge-redistribution ADC. In *IEEE ISSCC Dig. Tech. Papers*, pages 244–610, Feb 2008.
- F. Rincon, N. Boichat, V. Barbero, N. Khaled, and D. Atienza. Multi-lead wavelet-based ECG delineation on a wearable embedded sensor platform. In *Computers in Cardiology*, pages 289–292, Sept 2009.
- M. Pourhomayoun, M. Fowler, and Zhanpeng Jin. A novel method for medical implant in-body localization. In *IEEE Int. Conf. Eng. Med. Biol. Soc.*, pages 5757–5760, Aug 2012.
- R.R. Harrison, P.T. Watkins, R.J. Kier, R.O. Lovejoy, D.J. Black, B. Greger, and F. Solzbacher. A low-power integrated circuit for a wireless 100-electrode neural recording system. *IEEE J. Solid-State Circuits*, 42(1):123–133, Jan 2007.
- A. Borna and K. Najafi. A low power light weight wireless multichannel microsystem for reliable neural recording. *IEEE J. Solid-State Circuits*, 49(2):439–451, Feb 2014.
- V. Majidzadeh, A. Schmid, and Y. Leblebici. A 16-channel, 359 $\mu$W, parallel neural recording system using Walsh-Hadamard coding. In *IEEE Custom Integr. Circuits Conf.*, pages 1–4, Sept 2013.
- J.P. Slavinsky, J.N. Laska, M.A. Davenport, and R.G. Baraniuk. The compressive multiplexer for multi-channel compressive sensing. In *IEEE Int. Conf. Acoust., Speech, Signal Process.*, pages 3980–3983, May 2011.
- J.A. Tropp, A.C. Gilbert, and M.J. Strauss. Simultaneous sparse approximation via greedy pursuit. In *IEEE Int. Conf. Acoust., Speech, Signal Process.*, volume 5, pages v/721–v/724, March 2005.

- H.S. Lee, D. Hodges, and P.R. Gray. A self-calibrating 15 bit CMOS A/D converter. *IEEE J. Solid-State Circuits*, 19(6):813–819, Dec 1984.
- J.G. Kauffman, P. Witte, M. Lehmann, J. Becker, Y. Manoli, and M. Ortmanns. A 72 dB DR, CT $\Delta\Sigma$ modulator using digitally estimated, auxiliary DAC linearization achieving 88 fJ/conv-step in a 25 MHz BW. *IEEE J. Solid-State Circuits*, 49(2):392–404, Feb 2014.
- A. Sukumaran and S. Pavan. Low power design techniques for single-bit audio continuous-time delta sigma ADCs using FIR feedback. *IEEE J. Solid-State Circuits*, 49(11):2515–2525, Nov 2014.

## Vita

Wenjuan Guo received the B.S. degree from the Department of Microelectronics and Nanoelectronics at Tsinghua University in 2011. While at Tsinghua, she achieved excellent academic performance, winning several scholarships. She is now pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering at the University of Texas at Austin. She interned in the DAC team of Texas Instruments from June 2013 to May 2014. She was awarded the Texas Instruments Fellowship twice from 2014 to 2016. Her current research is focused on data converter design combined with signal processing and bio-medical applications.

email address: wjguo@utexas.edu

This dissertation was typeset with LaTeX<sup>†</sup> by the author.

<sup>†</sup>LaTeX is a document preparation system developed by Leslie Lamport as a special version of Donald Knuth's TeX Program.