Copyright by Linxiao Shen 2019

The Dissertation Committee for Linxiao Shen certifies that this is the approved version of the following dissertation:

# Design Techniques for Ultra-Low-Power Sensor Interface Circuits and Systems in Nano-Scale CMOS Technologies

Committee:

Nan Sun, Supervisor

Michael Orshansky

David Z. Pan

Nanshu Lu

Scott Willingham

# Design Techniques for Ultra-Low-Power Sensor Interface Circuits and Systems in Nano-Scale CMOS Technologies

by

Linxiao Shen,

#### DISSERTATION

Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of

### DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT AUSTIN

December 2019

Dedicated to my parents.

## Acknowledgments

First and foremost, I would like to express my sincere thanks to my advisor, Professor Nan Sun, for his generous guidance, support, and excellent mentorship throughout my Ph.D. pursuit. He is a wise and patient leader and supervisor. He not only introduced me to biomedical sensor interfaces, power-efficient ADC/LNA design, and analog circuit automation and gave me valuable suggestions, but also provided me with many opportunities to gain exposure to various projects and collaborations. He is also a knowledgeable person who brought a very rigorous attitude to my research. As the best leader and supervisor I can imagine, he taught me through his words, his actions, and his passion. He taught me how to find critical research problems and how to tackle those challenging problems in smart, efficient, and elegant ways. He even helped me refine my papers line by line. Through countless discussions with him, my way of thinking has become more and more professional. It is my greatest privilege to have him as my Ph.D. supervisor.

I would like to thank Professor Nanshu Lu for her generous and invaluable guidance at the start of my research on biomedical sensor interface circuits. She was always willing to help me with detailed technical suggestions and guidance. It was a great pleasure to work with her and her group on the exciting stretchable patch-based biomedical measurement system/platform. I express great appreciation to the other committee members for their help with this dissertation. I am fortunate to have received technical wisdom from Professor David Z. Pan. It was a great experience to work and hold discussions with his productive group and its brilliant members on the automation projects. I also want to thank Prof. Michael Orshansky and Dr. Scott Willingham for the helpful discussions and their comments on my dissertation.

My work and life at UT Austin have benefited tremendously from intellectual and personal interactions with many people, who always gave me valuable suggestions at the right time. When I first stepped onto the UT Austin campus and everything around me felt new, Wenjuan Guo led me through. I clearly remember the valuable suggestions she gave me on both my Ph.D. journey and my life. I also want to express my thanks to Sungjin Hong, who is not only a knowledgeable researcher but also an amazing team leader. I will never forget that it was he who spent time on my first tape-out and design reviews. Moreover, he was so responsible that he served as the team captain for all my 180-nm tape-outs. I would also like to thank Abhishek Mukherjee for rigorous technical discussions. The late nights in EER when only we were left fighting tape-out deadlines have become beautiful memories that I will always keep. Last but not least, I would like to express my thanks to Xiangxing Yang, Wei Shi, Wenda Zhao, Yi Shen, and Zhelu Li for their broad knowledge. The discussions with them always brought me new angles and broadened my horizons.

I also appreciate my colleagues during my internships: Dr. Scott Willingham, Dr. Jinwen Xiao, Dr. Dazhi Wei, Dr. Srikanth Govindavajulu, and Gang Yuan. They gave me a lot of precious advice on working in both industry and academia.

I have been extremely fortunate to work in such a competitive research group, and I would like to thank all of its members: Shaolan Li, for his sharp intuition, broad knowledge, and insightful discussions; Long Chen, Arindam Sanyal, Kareem Ragab, Jeonggoo Song, and Yeonam Yoon, for their generous help and unbelievable ability to build up so much prior work and useful experience for me to step on; Xiyuan Tang, Miguel Gandara, Chen-Kai Hsu, Ahmet Faruk Budak, Mantian Zhang, Sudeep Mishra, Ruocheng Wang, Ruicong Chen, Manzur Rahman, Ran Zheng, Xin Xin, Jiaxin Liu, Yi Zhong, Dengquan Li, and Yanlong Zhang, for their useful advice, assistance, support, and inspiration.

Very special thanks go to all my good friends (including but not limited to Xiaodan Xi, Yucen Fang, Yu Lu, Lingyuan Gao, Weikang Du, Zhaohe Dai, Jinsong Liu, Hao Liu, Keren Zhu, Ge Li, Yuxin Wang, Heechai Kang, and Hyoyoung Jeong). They have deeply enriched my graduate school life with their friendship. Nothing could efface my memory of the time we spent together in Gregory Gym, on the basketball court, and across the whole UT campus.

Finally, my deepest gratitude goes to my parents and my whole family. Their unconditional love is the fundamental source of my happiness. Their support and encouragement accompany me whenever I am down. This thesis is dedicated to them. Portions of this work were supported by NSF, NIH, and the University Continuing Fellowship.

# Design Techniques for Ultra-Low-Power Sensor Interface Circuits and Systems in Nano-Scale CMOS Technologies

Publication No.

Linxiao Shen, Ph.D.

The University of Texas at Austin, 2019

Supervisor: Nan Sun

In recent decades, the Internet of Things (IoT) has flourished as a result of improvements in circuit design and manufacturing techniques. Moreover, the emergence of 5G technologies further accelerates its growth. Autonomous wireless sensors and their networks have been one of the most prevalent and important research topics of the past decades. Although researchers have been pushing the state of the art of sensor readout toward ever higher power and area efficiency, the results remain insufficient to meet modern requirements, especially considering that the number of sensors is growing dramatically and a large portion of them are battery-less devices. Thus, maintaining high resolution and low noise while achieving high power and area efficiency has been one of the major challenges for sensor readout circuit design in recent years. This thesis proposes several novel power- and area-saving techniques for the fundamental building blocks: 1) the inverter-stacking technique for LNAs; 2) the tail-less inverter-stacking technique for LNAs; and 3) the CT-SAR-assisted two-step SAR ADC with attenuated kT/C noise.

The first work presents a highly power-efficient amplifier. By stacking inverters and splitting the capacitor feedback network, the proposed amplifier achieves six-fold current reuse, thereby significantly boosting the transconductance and lowering the noise without increasing the current consumption. A novel biasing scheme is devised to ensure robust operation under a 1 V supply. A prototype in 180 nm CMOS has 5.5 $\mu$V<sub>rms</sub> noise within a 10 kHz bandwidth while consuming only 0.25 $\mu$W of power, leading to a noise efficiency factor (NEF) of 1.07, which is the best among reported amplifiers.
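As a rough sanity check on the figures quoted above, the following sketch evaluates the commonly used NEF definition, $\mathrm{NEF} = V_{rms,in}\sqrt{2 I_{tot} / (\pi \cdot U_T \cdot 4kT \cdot BW)}$. It assumes that the 0.25 $\mu$W is drawn entirely from the 1 V supply and that the temperature is 300 K; neither is stated explicitly here, and the function name is illustrative only.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def noise_efficiency_factor(v_rms_in, i_total, bandwidth_hz, temp_k=300.0):
    """NEF = v_rms_in * sqrt(2 * I_tot / (pi * U_T * 4kT * BW))."""
    u_t = K_B * temp_k / Q_E  # thermal voltage kT/q, about 25.9 mV at 300 K
    return v_rms_in * math.sqrt(
        2.0 * i_total / (math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz)
    )


# Noise, bandwidth, and power are the quoted figures.
# The 1 V supply for the current estimate and 300 K are assumptions.
supply_v = 1.0
power_w = 0.25e-6
nef = noise_efficiency_factor(5.5e-6, power_w / supply_v, 10e3)
print(f"NEF ~ {nef:.2f}")
```

Under these assumptions the result lands close to the reported 1.07; a small gap is expected from rounding of the quoted noise and power figures and from the assumed temperature.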

The second work presents a low-noise capacitively-coupled instrumentation amplifier featuring better-than-bipolar power efficiency. The tail-less structure removes the tail current source, which lowers the supply voltage to 0.6 V and thus significantly reduces the power consumption. Compared with other recently reported front-end amplifiers, it achieves the best trade-off between power consumption and input-referred noise (IRN). AC coupling and current-mode biasing are employed to enhance its PVT robustness. In addition, several other design techniques are used, including ripple reduction based on AC coupling with optimized gain allocation and CMRR enhancement based on CM pre-filtering. The prototype, fabricated in a 180-nm CMOS process, achieves an integrated input-referred noise of 1.38 $\mu$V<sub>rms</sub> within an 8-kHz bandwidth. With one global 0.6-V supply, the prototype consumes 2.7 $\mu$W of total power, leading to a power efficiency factor (PEF) of 0.96. The peak CMRR and PSRR are measured to be 84 dB and 78 dB, respectively, which validates the performance enhancement techniques with the pseudo-differential input stage.
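The quoted PEF can be cross-checked with a short calculation. The sketch below uses the common definitions $\mathrm{NEF} = V_{rms,in}\sqrt{2 I_{tot} / (\pi \cdot U_T \cdot 4kT \cdot BW)}$ and $\mathrm{PEF} = \mathrm{NEF}^2 \cdot V_{DD}$, and assumes a temperature of 300 K and that the whole 2.7 $\mu$W is drawn from the 0.6-V supply. The function names are illustrative.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def noise_efficiency_factor(v_rms_in, i_total, bandwidth_hz, temp_k=300.0):
    """NEF = v_rms_in * sqrt(2 * I_tot / (pi * U_T * 4kT * BW))."""
    u_t = K_B * temp_k / Q_E  # thermal voltage kT/q
    return v_rms_in * math.sqrt(
        2.0 * i_total / (math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz)
    )


def power_efficiency_factor(v_rms_in, power_w, supply_v, bandwidth_hz, temp_k=300.0):
    """PEF = NEF^2 * VDD, with I_tot estimated as P / VDD."""
    nef = noise_efficiency_factor(v_rms_in, power_w / supply_v, bandwidth_hz, temp_k)
    return nef ** 2 * supply_v


# Quoted figures: 1.38 uVrms in 8 kHz, 2.7 uW from a 0.6 V supply (300 K assumed).
pef = power_efficiency_factor(1.38e-6, 2.7e-6, 0.6, 8e3)
print(f"PEF ~ {pef:.2f}")
```

Because PEF weights NEF squared by the supply voltage, lowering the supply to 0.6 V improves PEF directly even when the NEF itself is above 1.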

The third work presents a two-step analog-to-digital converter (ADC) that operates its 1st-stage successive approximation register (SAR) ADC in the continuous-time (CT) domain. It avoids the front-end sample-and-hold (S/H) circuit and its associated sampling noise. Hence, the proposed ADC allows the input capacitor size to be substantially reduced without incurring a large sampling noise penalty. With input AC coupling, the 1st-stage CT SAR can simultaneously perform input tracking and SAR quantization. Its conversion error is minimized by accelerating the SAR speed and providing redundancy. A floating inverter-based (FIB) dynamic amplifier (DA) is used as the inter-stage amplifier and acts as a low-pass filter for the 1st-stage residue. To verify the proposed techniques, a 13-bit prototype ADC is built in a 40-nm CMOS process. Its input capacitor is only 120 fF, which is over 20 times smaller than what would be needed in a classic Nyquist ADC with the S/H circuit. Operating at 2 MS/s, it achieves 72-dB SNDR at the Nyquist rate while consuming only 25 $\mu$W of power and 0.01 mm<sup>2</sup> of area.
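To illustrate why the capacitor size matters in a conventional sampled front end, the sketch below evaluates the textbook sampling noise $\sqrt{kT/C}$ for the quoted 120 fF capacitor and for a capacitor 20 times larger, which is the lower bound implied by the comparison above. It assumes a single sampling capacitor at 300 K and is not a model of the proposed ADC, whose first stage avoids this sampling noise.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise_vrms(cap_farads, temp_k=300.0):
    """RMS sampling noise sqrt(kT/C) of a single sampling capacitor."""
    return math.sqrt(K_B * temp_k / cap_farads)


c_small = 120e-15        # input capacitor quoted in the text
c_large = 20 * c_small   # lower bound implied by "over 20 times smaller"

noise_small = ktc_noise_vrms(c_small)  # what a conventional S/H would see with 120 fF
noise_large = ktc_noise_vrms(c_large)

print(f"sqrt(kT/C) at 120 fF: {noise_small * 1e6:.0f} uVrms")
print(f"sqrt(kT/C) at 2.4 pF: {noise_large * 1e6:.0f} uVrms")
print(f"ratio: {noise_small / noise_large:.2f}")
```

The ratio equals $\sqrt{20}$, so shrinking the capacitor by 20 times in a conventional sampled design would raise the rms sampling noise by roughly 4.5 times, or about 13 dB in noise power.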

# Table of Contents

- Acknowledgments
- Abstract
- List of Tables
- List of Figures
- Chapter 1. Introduction
  - 1.1 Ubiquitous Sensing
  - 1.2 Prior Architectures
  - 1.3 Overview of the Proposed Techniques
  - 1.4 Organization
- Chapter 2. Review of Biomedical Sensing Front-End Circuits
  - 2.1 Low-Noise Amplifiers
  - 2.2 Analog-to-Digital Converters
- Chapter 3. Inverter Stacking Technique: First Prototypes
  - 3.1 Introduction
  - 3.2 Low-Noise Amplifier Design: Challenges
  - 3.3 Proposed Inverter-Stacking Techniques
  - 3.4 Proposed Inverter Stacking Amplifier: Stack-2 Version
    - 3.4.1 Core Schematic of Proposed Inverter Stacking Amplifier
    - 3.4.2 Small-Signal Gain, Input-Referred Noise, Offset, CMRR, and PSRR Analyses
    - 3.4.3 Bias Voltage Generation
    - 3.4.4 Closed-loop Configuration with Split Capacitor Feedback
  - 3.5 Proposed Inverter Stacking Amplifier: Stack-3 Version
  - 3.6 Circuit Implementation
  - 3.7 Measurement Results
  - 3.8 Comparison to Other LNA works
  - 3.9 Summary
- Chapter 4. Inverter Stacking Technique: Second Prototypes
  - 4.1 Introduction
  - 4.2 Proposed Chopping Tail-less Inverter Stacking Input stage
    - 4.2.1 Concept
    - 4.2.2 Biasing network
    - 4.2.3 Signal gain, input referred noise
    - 4.2.4 Trade-off discussion
  - 4.3 Three-stage dominant-pole compensated amplifier
  - 4.4 Design details and discussions
    - 4.4.1 CMRR enhancement
    - 4.4.2 Offset and ripple reduction
  - 4.5 Measurement Results
  - 4.6 Comparison to Other LNA works
  - 4.7 Conclusion
- Chapter 5. CT-SAR-assisted kT/C noise-free Nyquist ADC Design: Third Prototype
  - 5.1 Introduction
  - 5.2 Low-Noise Amplifier Design: Challenges
  - 5.3 Low-Noise Amplifier Design: Architectures
    - 5.3.1 Inverter-Based Architectures
    - 5.3.2 Orthogonal Current-reuse Architectures
  - 5.4 Proposed Two-Step ADC with 1st-Stage CT-SAR
    - 5.4.1 Topology overview
    - 5.4.2 CT SAR conversion error and mitigation
    - 5.4.3 Inter-stage amplifier operation with a time-varying $V_{res}$
  - 5.5 Prototype ADC Implementation
  - 5.6 Noise Analysis of the Proposed ADC
  - 5.7 Measurement Results
  - 5.8 Comparison to Other LNA works
  - 5.9 Summary
- Chapter 6. Conclusion and Future Directions
  - 6.1 Conclusion
  - 6.2 Future Directions
- Appendices
- Appendix A. List of Publications
  - A.1 Patent
  - A.2 Conference Papers
  - A.3 Journal Papers
  - A.4 Journal Papers Under Review
  - A.5 Journal Papers In preparations
- Appendix B. My Appendix #2
  - B.1 The First Section
- Bibliography
- Vita

# List of Tables

- 3.1 Devices Geometry of Stack-2 Amplifier
- 3.2 Devices Geometry of Stack-3 Amplifier
- 3.3 Performance Summary and Comparison with State-of-the-art LNAs
- 4.1 Performance Summary and Comparison with State-of-the-art LNAs
- 5.1 Noise Budgeting of the Proposed CT-SAR ADC
- 5.2 Performance Summary and Comparison with State-of-the-art ADCs

# List of Figures

- 1.1 Internet of Things Devices and sensor interface circuits
- 3.1 (a) Fully differential common source amplifier. (b) inverter-based amplifier
- 3.2 Natural inverter stacking topology.
- 3.3 Stack-2 inverter stacking amplifier schematic
- 3.4 Common-mode circuit analysis
- 3.5 Replica based bias circuit.
- 3.6 Bias branch and the CMFB circuit
- 3.7 Block diagram of the closed-loop amplifier with split capacitor feedback
- 3.8 Offset averaging model
- 3.9 Block diagram of the proposed closed-loop amplifier with parasitic capacitors.
- 3.10 Stack-3 inverter stacking amplifier schematic
- 3.11 Block diagram of stack-N inverter stacking amplifier
- 3.12 Simulated NEF across corners
- 3.13 (a) Die photo of stack-2 and (b) stack-3
- 3.14 Measured AC transfer of amplifier.
- 3.15 Measured amplifier input referred noise PSD
- 3.16 Measured NEF versus temperature
- 3.17 Measured NEF of the stack-2 amplifier versus power supply.
- 3.18 Measured NEF of the stack-3 amplifier versus power supply
- 3.19 Closed-loop amplifier NEF survey
- 4.1 Closed-loop amplifier PEF survey
- 4.2 Tail-less inverter-stacking input stage signal path
- 4.3 (a) Replica based biasing scheme for the tail-less inverter stacking input stage. (b) CM pre-filtering biasing loop for top amplifier. (c) CM pre-filtering biasing loop for bottom amplifier.
- 4.4 (a) Transfer function from NMOS outside the loop. (b) Transfer function from the PMOS in the loop.
- 4.5 Block diagram of the three-stage dominant-pole compensated amplifier.
- 4.6 Schematic of the second stage
- 4.7 (a) Schematic of output stage
- 4.8 Block diagram of the single-ended half-circuit of the fully differential closed-loop OTA for CMRR analysis.
- 4.9 (a) Block diagram of the single-ended half-circuit of the CM pre-filtering loop. (b) Transfer function of the CM pre-filtering loop.
- 4.10 Offset and ripple reduction analysis
- 4.11 Chip Microphotograph
- 4.12 Measured OTA transfer functions: gain transfer function, CM-DM conversion transfer function, and supply-DM transfer function.
- 4.13 Measured OTA input referred noise
- 4.14 Measured OTA THD at different input amplitudes.
- 4.15 Measured performances versus temperature.
- 4.16 Measured performances versus supply voltage
- 5.1 (a) Conventional DT two-step SAR ADC; (b) Proposed CT-SAR-assisted two-step SAR ADC
- 5.2 Block diagrams of CT pipeline ADC with (a) delay mismatch; (b) negative delay added to the slow path; (c) positive delay added to the fast path; and (d) proposed CT-SAR with shortened delay and built-in redundancy.
- 5.3 Architectural block diagram of (a) the proposed CT-SAR-assisted two-step SAR ADC; (b) 1-st stage CT-SAR ADC; and (c) example waveforms for key nodes.
- 5.4 Input and DAC output example waveform for (a) conventional DT-SAR ADC; (b) large $E_{slope}$ for CT-SAR ADC by simply removing S/H circuits; (c) recovered residue voltage with accelerated SAR conversion; (d) recovered residue voltage with built-in redundancy.
- 5.5 SNR versus $T_{SAR}$ and number of conversion cycles $(N)$
- 5.6 Comparison with SNR degradation with/without redundancy.
- 5.7 (a) DA schematic; (b) timing diagram; (c) window function $h(t)$; (d) DA response $H(\omega)$; (e) equivalent time-domain amplified signal point.
- 5.8 Proposed CT-SAR-assisted two-step ADC: (a) top-level schematic; (b) dynamic SAR logic; (c) timing diagram.
- 5.9 Maximum residue voltage $\max(V_{res})$ as a function of input signal frequency and amplitude.
- 5.10 Proposed floating inverter based (FIB) dynamic amplifier (DA): (a) schematic; (b) timing diagram and waveforms for key circuit nodes.
- 5.11 Proposed FIB-DA: (a) simulated $g_m(t)$; (b) frequency response.
- 5.12 Noise linear model for (a) conventional DT-SAR ADC; (b) proposed CT-SAR ADC.
- 5.13 Chip photo
- 5.14 Measured spectrum with (a) 100-kHz and (b) 950-kHz input
- 5.15 Measured SNDR and SFDR versus input frequency
- 5.16 Measured SNDR and SFDR versus input amplitude

## Chapter 1

## Introduction

### 1.1 Ubiquitous Sensing

Recent decades have witnessed significant advances in the Internet of Things (IoT). With the development of CMOS integrated circuit design and manufacturing techniques, as well as the emergence of 5G technology, the IoT has become one of the hottest topics in both academia and industry. The IoT is the network of physical devices, vehicles, home appliances, and other items embedded with electronics, software, sensors, actuators, and network connectivity, which enables these objects to connect and exchange data [1,2].

The radio connection can be Bluetooth low-energy (BLE), Wi-Fi, cellular, or any other wireless standard. The majority of today's devices are powered by a battery, either a 3-V coin cell or a rechargeable battery. These batteries usually do not contain enough energy to supply the IoT devices for their entire lifetime, so they must be replaced or recharged on a regular basis. That might be acceptable today with only a few devices per person, but with the expected fast growth of the IoT, it will no longer be practical in the near future. Several approaches address this problem, including reducing the power consumption of IoT devices by orders of magnitude and using alternative power sources.

Autonomous wireless sensor networks have been a prevailing research topic during the past few years. A wide range of promising applications could be realized based on these networks in areas such as health care, security, and logistics. Advances in power-efficient and area-efficient circuit design techniques have drawn a great deal of interest in biomedical applications. As the interface between the biomedical sensors and the digital processors, the analog front-end is one of the most critical building blocks of a biomedical system. The main challenges come from the properties of biomedical signals. The amplitudes of most biomedical signals, including ECG, EEG, EMG, and PPG, range from tens of  $\mu$V to several mV, which poses a very stringent noise requirement. Therefore, low-noise and low-power analog front-ends are essential to extract high-resolution biomedical signals.

In recent decades, the Internet of Things (IoT) has grown rapidly as a result of improvements in circuit design and manufacturing techniques, and emerging 5G technologies further accelerate its growth. Autonomous wireless sensors and their networks have been among the most important research topics of the past decades. Although researchers have been pushing the state of the art of sensor readout toward ever higher power and area efficiency, the results remain insufficient to meet modern requirements, especially considering that the number of sensors is growing dramatically and a large portion of them are battery-less devices. Thus, maintaining high resolution and low noise while achieving high power and area efficiency has been one of the major challenges for sensor readout circuit design in recent years.



### **1.2** Prior Architectures

Figure 1.1: Internet of Things Devices and sensor interface circuits.

Fig. 1.1 shows the traditional block diagram for the sensor front-end system. In such a traditional configuration, an instrumentation amplifier (IA) is usually connected to the sensor to pick the signal up. Given sufficient amplification and filtering, the signal will be sent into the analog-to-digital converter, and converted into digital format. All the following digital signal processing will be performed based on the digital representation of the signals.

Signals vary depending on the environment where the sensor is used and on the type of signal. Usually, the raw signals from the sensor are weak and the signal quality is poor. Taking biomedical signals as examples, Table I presents the amplitudes and frequency bands of different kinds of biomedical signals. Most bio-signals are in the  $\mu V$  range and are usually accompanied by very strong common-mode (CM) interference (e.g., strong 60-Hz AC coupling through the power grid). All these signal properties pose stringent requirements on the low-noise amplifier (LNA) and make it the bottleneck of the system. Generally speaking, the LNA should have a large common-mode rejection ratio (CMRR) to tolerate the large CM interference accompanying the weak signal. Additionally, it should have ultra-low noise, so that the signal can be clearly picked out in the presence of large background noise. Among all kinds of noise, 1/f noise is the most critical one and can become the dominant noise source if the amplifier is not designed well. Furthermore, the properties of the electrode should also be taken into consideration. The large offset due to the electrode itself and the large artifacts due to motion or other environmental influences are two of the main performance killers in most biomedical applications. If no design techniques are applied, these two types of interference can be large enough to saturate the system. Last but not least, there is also interference from the supply. Thus, a high power supply rejection ratio (PSRR) is also important for a real implementation. Simultaneously meeting all these requirements while maintaining good power efficiency is not a trivial task. Therefore, it is important to look for a way to design a power-efficient LNA by making full use of the signal properties.

Another important and power-hungry building block is the analog-to-digital converter (ADC). The ADC is an electronic integrated circuit that transforms a signal from analog to digital form. Since most real-world signals are analog, the ADC is an essential component, not only in IoT sensor interfaces but also in a variety of modern electronic devices, providing a bridge between the analog world of transducers and the digital world of signal processing and data handling. The optimum choice of ADC depends on the target application. However, with the development of design techniques and the available CMOS technologies, the successive approximation register (SAR) ADC has become the ADC type that achieves the best power efficiency. To achieve good performance, a front-end sample-and-hold (S/H) circuit is usually required to provide an unchanging sampled version of the signal for the subsequent analog signal processing operations (e.g., quantization, subtraction, and amplification). However, the front-end S/H introduces unwanted sampling noise, which poses a fundamental SNR limit for the ADC. This sampling noise is typically suppressed passively by increasing the capacitor size [3-15]. For example, the total differential sampling capacitance needs to be greater than 2.1 pF to achieve a sampling-noise-limited SNR of 80 dB with a 2.5-V peak-to-peak differential signal swing, and it has to be quadrupled for every 1-bit increase in resolution.
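The 2.1-pF figure quoted above can be reproduced with a short calculation. The sketch below assumes a full-scale sine-wave input, a temperature of 300 K, and a differential sampling noise power of 2kT/C, where C is the capacitance on each side. These assumptions and the function name are illustrative and are not stated in the text.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def total_diff_sampling_cap(snr_db, vpp_diff, temp_k=300.0):
    """Total differential sampling capacitance (F) for a kT/C-limited SNR.

    Assumes a full-scale sine input and a differential noise power of
    2kT/C_side, with one capacitor of C_side on each input.
    """
    signal_power = (vpp_diff / 2.0) ** 2 / 2.0          # sine-wave power, V^2
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    c_side = 2.0 * K_BOLTZMANN * temp_k / noise_power   # per-side capacitance
    return 2.0 * c_side                                 # both sides together

c80 = total_diff_sampling_cap(80.0, 2.5)
c86 = total_diff_sampling_cap(80.0 + 6.02, 2.5)  # one extra bit is about 6.02 dB
print(f"80 dB: {c80 * 1e12:.2f} pF, one more bit: {c86 / c80:.2f}x larger")
```

Under these assumptions the result is slightly above 2.1 pF, and each additional bit of resolution multiplies the required capacitance by about four.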

Additionally, it should be noted that an entire ADC system consists not only of the ADC core itself, but also of the peripheral circuits. Achieving ultra-high power efficiency for the ADC core alone is not sufficient to guarantee system-level power efficiency, especially in SAR ADC design. Both the input driver and the reference buffer are power hungry and are becoming the power bottleneck of the entire system. Their power consumption can be one order of magnitude higher than that of the ADC core. Since the power consumption of both the ADC core and the peripheral buffering circuits scales with the size of the capacitive digital-to-analog converter (CDAC), it is important to look for a way to design a high-resolution SAR ADC with a relatively small CDAC, and thereby achieve high power and area efficiency at both the ADC level and the system level.

### **1.3** Overview of the Proposed Techniques

This thesis proposes several novel power- and area-saving techniques for the fundamental building blocks: 1) the inverter-stacking technique for LNAs; 2) the tail-less inverter-stacking technique for LNAs; and 3) the CT-SAR-assisted two-step SAR ADC with attenuated kT/C noise.

The first work presents a highly power-efficient amplifier. By stacking inverters and splitting the capacitor feedback network, the proposed amplifier achieves 6-time current reuse, thereby significantly boosting the transconductance and lowering noise but without increasing the current consumption. A novel biasing scheme is devised to ensure robust operation under 1 V supply. A prototype in 180 nm CMOS has 5.5  $\mu$ V<sub>rms</sub> noise within 10 kHz BW while consuming only 0.25  $\mu$ W power, leading to a noise efficiency factor (NEF) of 1.07, which is the best among reported amplifiers.

The second work presents a low-noise capacitively-coupled instrumentation amplifier featuring better-than-bipolar power efficiency. The tail-less structure removes the tail current source, reducing the supply voltage to 0.6 V and thus significantly reducing the power consumption. Compared with other recently reported front-end amplifiers, it achieves the best trade-off between power consumption and input-referred noise. AC coupling and current-mode biasing are employed to enhance its PVT robustness. In addition, several other design techniques are used, including ripple reduction based on AC coupling with optimized gain allocation and CMRR enhancement based on CM pre-filtering. The prototype, fabricated in a 180-nm CMOS process, achieves an integrated input-referred noise of 1.38  $\mu V_{rms}$  within an 8-kHz bandwidth. With one global 0.6-V supply voltage, the prototype consumes 2.7  $\mu W$  of total power, leading to a power efficiency factor (PEF) of 0.96. The peak CMRR and PSRR are measured to be 84 dB and 78 dB, respectively, which validates the performance enhancement techniques with the pseudo-differential input stage.

The third work presents a two-step analog-to-digital converter (ADC) that operates its 1st-stage successive approximation register (SAR) ADC in the continuous-time (CT) domain. It avoids the front-end sample-and-hold (S/H) circuit and its associated sampling noise. Hence, the proposed ADC allows the input capacitor size to be substantially reduced without incurring a large sampling noise penalty. With input AC coupling, the 1st-stage CT SAR can simultaneously perform input tracking and SAR quantization. Its conversion error is minimized by accelerating the SAR speed and providing redundancy. A floating inverter-based (FIB) dynamic amplifier (DA) is used as the inter-stage amplifier and acts as a low-pass filter for the 1st-stage residue. To verify the proposed techniques, a 13-bit prototype ADC is built in a 40-nm CMOS process. Its input capacitor is only 120 fF, which is over 20 times smaller than what would be needed in a classic Nyquist ADC with the S/H circuit. Operating at 2 MS/s, it achieves 72-dB SNDR at the Nyquist rate while consuming only 25  $\mu$W of power and 0.01 mm<sup>2</sup> of area.

### **1.4** Organization

Three prototypes were taped out to validate the proposed techniques. Chapter 2 introduces the basic design challenges of low-noise amplifiers. It also describes the first inverter-stacking amplifier technique and its prototype in a 180-nm CMOS process. Chapter 3 presents a more advanced tail-less inverter-stacking amplifier prototype in a 180-nm CMOS process. This work is based on the original design principle but enables low-supply-voltage operation. Chapter 4 introduces the kT/C-noise-free ADC architecture and its implementation in a 40-nm CMOS process to attain a power- and area-efficient ADC.

## Chapter 2

# Review of Biomedical Sensing Front-End Circuits

#### 2.1 Low-Noise Amplifiers

The overall noise of a sensor read-out circuit is typically dominated by the front-end amplifier. For a given amplifier topology, there exists a fundamental trade-off between noise and power. Thus, to suppress the noise below a certain target, it is necessary to consume a sufficiently large amount of power. In addition, the amplifier power does not decrease with technology scaling, as it is noise limited rather than technology limited. As a result, for low-noise sensor applications, the front-end amplifier usually takes up a significant portion of the overall system power budget [16], [17], [18], [19]. Therefore, it is highly desirable to develop design techniques that can relax this tight noise and power tradeoff. Reducing amplifier power while keeping the same noise level is crucial for a wide range of power and energy constrained applications. For example, in the Internet-of-Things (IoT) era, to ensure a long lifetime without battery replacement, the power of the amplifier in the sensor node needs to be ultralow [20], [21]. Similarly, biomedical implants have a stringent requirement on the amplifier power due to limited battery size as well as safety concerns regarding heat dissipation [22], [23], [24], [25]. A more power-efficient front-end amplifier is always desired.

### 2.2 Analog-to-Digital Converters

The ADC is another power-hungry building block, so reducing its power consumption is equally important. The stringent power budget and the need for impulse-mode operation make the successive approximation register (SAR) ADC a perfect candidate. Owing to its highly digital, switching-intensive operation, the SAR ADC is a very scaling-friendly architecture: it benefits from scaled technologies that provide faster devices and smaller parasitics, and can thus be pushed to high-resolution applications (>13 bit). State-of-the-art high-resolution SAR ADCs can achieve very good power efficiency (<20 fJ/conv-step). The main performance limitation is the large capacitive DAC (CDAC). Two factors prevent the CDAC from shrinking: a) mismatch between unit capacitors; and b) the fundamental kT/C sampling noise. Many techniques deal with mismatch in high-resolution SAR ADCs, including foreground calibration, split-ADC calibration, and averaging in time. The former issue is solvable with these techniques, whereas the latter usually sets the fundamental lower bound on the DAC size. Traditionally, a larger sampling capacitor is used to suppress thermal noise. Each extra bit costs a 4X increase in capacitor size, which increases the power consumption of both the input driver and the reference buffer. As an example, a conventional design targeting 80-dB SNR requires a 4-pF sampling capacitor, trading power and area to meet the noise specification. The target here is to design a more area- and power-efficient SAR ADC.

## Chapter 3

## **Inverter Stacking Technique: First Prototypes**

### 3.1 Introduction

<sup>1</sup> The overall noise of a sensor read-out circuit is typically dominated by the front-end amplifier. For a given amplifier topology, there exists a fundamental trade-off between noise and power. Thus, to suppress the noise below a certain target, it is necessary to consume a sufficiently large amount of power. In addition, the amplifier power does not decrease with technology scaling, as it is noise limited rather than technology limited. As a result, for low-noise sensor applications, the front-end amplifier usually takes up a significant portion of the overall system power budget [16–19, 26]. Therefore, it is highly desirable to develop design techniques that can relax this tight noise and power tradeoff. Reducing amplifier power while keeping the same noise level is crucial for a wide range of power and energy constrained applications. For example, in the Internet-of-Things (IoT) era, to ensure a long lifetime without battery replacement, the power of the amplifier in the sensor node needs to be ultralow [20, 21]. Similarly, biomedical implants have a stringent requirement on the amplifier power due to limited battery size as well as safety concerns regarding heat dissipation [22–24,27].

<sup>1</sup>This work was first presented at the VLSI Symposium 2017 [35]. This work was done by the first author Linxiao Shen, and all the technical discussions with the co-authors are highly appreciated.

There have been many excellent research works in the past that aim to mitigate the amplifier noise-power tradeoff [20, 21, 28-34]. Essentially, the goal is to decrease the product of power and noise for an amplifier. Thus, for the same noise, the amplifier power can be reduced; or for the same power, the amplifier noise can be minimized. These two scenarios are directly interchangeable. The central idea is to boost the overall amplifier transconductance  $g_m$  but without increasing the bias current  $I_D$ . The classic design technique is to bias the input transistors in weak inversion to maximize their  $g_m/I_D$  [34]. To further increase  $g_m$ , a PMOS input pair can be stacked on top of an NMOS input pair to form an inverter based input stage, so that the overall amplifier  $g_m$  is doubled but without requiring any extra bias current (it is shared by both NMOS and PMOS input pairs) [29, 33]. The challenge for the scheme of [29] is that it requires multiple power supplies and has to deal with the common-mode rejection ratio (CMRR) and power supply rejection ratio (PSRR) degradation due to its pseudo-differential input pair. The work of [33] not only achieves 2-time current reuse, but also operates the amplifier first-stage under a low voltage of 0.2 V, thereby further reducing the amplifier power; however, it also needs extra DC-DC converters for multiple power supplies, which increase the hardware complexity and incur additional area and power costs. The orthogonal current reuse technique of [31] boosts the level of current reuse, allowing N-time current reuse among N-channel inputs, but it has 2N number of output branches to combine, leading to increased complexity and power of the peripheral circuits, and thus, loss in the overall amplifier power efficiency. In addition, it can only be applied for applications with multi-channel inputs like neural recording. 
Recently, by using AC coupling and multi-chopper, the work of [32] realizes 6-time current reuse for a single-channel input. It also reduces the number of current summing branches to  $2^{N/2}$  by using both NMOS and PMOS pairs, but it still does not eliminate the exponential dependence. It obtained the previously best measured noise-power tradeoff, but this is achieved in open loop. When placing this amplifier in a practical closed-loop configuration to ensure an accurate gain and high linearity, its power efficiency would inevitably degrade due to intrinsically increased input referred noise, especially considering the parasitic capacitance at the virtual ground nodes. Moreover, it needs complicated demodulation and a 4th-order filter to attenuate the ripple, which increases the overall complexity and requires additional power and area.

This chapter presents a novel power efficient amplifier. By vertically stacking N inverters, it achieves 2N-time current reuse for a single-channel input. Unlike [31] and [32], it has only N output current branches to combine, thus, turning the prior exponential dependence into a mild linear dependence. As a result, it reduces the power of the peripheral circuits and boosts the overall amplifier power efficiency. The proposed amplifier fits well in a closed-loop capacitive feedback configuration. The required AC coupling to the multiple amplifier input nodes can be realized by splitting the input and feedback capacitors into multiple paths. As a result, it does not require any additional hardware which would incur extra cost in chip power and area. To minimize the requirement on the power supply voltage, the tail current sources between stacked inverters are eliminated but without sacrificing CMRR and PSRR. A replica circuit ensures that input pairs with tight coupling are robustly biased against process, voltage, and temperature (PVT) variations. Two prototype amplifiers are implemented in 180 nm CMOS process [35]. The stack-2 version achieves a measured noise efficiency factor (NEF) of 1.26 in closed-loop under the supply voltage of 0.9 V. The stack-3 version achieves an NEF of 1.07 under the supply voltage of 1 V. To the best of the authors' knowledge, this NEF is the best among all measured amplifiers so far. The second best NEF achieved by a closed-loop amplifier is 1.64, which translates to over 2.3-time more power consumption compared to the proposed amplifier assuming the same noise performance.

This chapter is organized as follows. Section II reviews classic low-power design techniques and the core concept of current reuse. Section III presents the proposed inverter stacking amplifier focusing on the stack-2 configuration. Section IV presents the stack-3 version. Section V shows the detailed circuit implementation. Measurement results are shown in Section VI. The conclusion is in Section VII.

### 3.2 Low-Noise Amplifier Design: Challenges

Fig. 3.1(a) shows the schematic of a basic fully-differential common-source amplifier. Its input referred thermal noise power spectral density (PSD) can be calculated as:

$$N_{PSD} = \frac{8kT\gamma}{g_{m1}} \left(1 + \frac{g_{m2}}{g_{m1}}\right) \tag{3.1}$$

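where k is the Boltzmann constant, T is the absolute temperature,  $\gamma$  is the noise model parameter, and  $g_{m1}$  and  $g_{m2}$  are the transconductances of the input and load transistors, respectively. The power consumption is given by: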

$$P = V_{DD} \cdot I_{tot} \tag{3.2}$$

where  $V_{DD}$  denotes the supply voltage,  $I_{tot}$  denotes the total current consumption. Thus, its power and noise product is given by:

$$P \cdot N_{PSD} = 8kT\gamma \cdot V_{DD} \cdot \frac{I_{tot}}{g_{m1}} \left(1 + \frac{g_{m2}}{g_{m1}}\right) \tag{3.3}$$



Figure 3.1: (a) Fully differential common source amplifier. (b) inverter-based amplifier.

To achieve a higher power efficiency and minimize this power-noise product, classic design techniques include: 1) bias the input transistors in weak inversion to maximize  $g_m/I_D$  [34]; 2) bias the load transistors in strong inversion to decrease its  $g_m/I_D$  and thus reduce  $g_{m2}/g_{m1}$  [36]. Sometimes if  $V_{DD}$  is tunable, people also try to lower it as much as possible to reduce power [33,37], but there is usually restriction due to signal swing requirement and system level consideration.

To characterize the power or noise efficiency of an amplifier, researchers have come up with a figure-of-merit called noise efficiency factor (NEF), which is given by [38]:

$$NEF = v_{ni,rms} \sqrt{\frac{2}{\pi} \cdot \frac{I_{tot}}{V_T \cdot 4kT \cdot BW}} \tag{3.4}$$

where  $v_{ni,rms}$  is the input referred rms noise of the amplifier in a given bandwidth BW, and  $V_T$  is the thermal voltage given by kT/q. Unlike the power-noise product of (3.3), the NEF of (3.4) is defined as a unitless ratio and is easy to compare. It essentially normalizes the power and noise product of a given amplifier against that of a single bipolar transistor. NEF is usually greater than 1 for a typical MOSFET amplifier because: 1)  $g_m/I_D$  of a MOSFET is smaller than that of a bipolar transistor; 2) a MOSFET produces much larger 1/f noise; 3) a practical amplifier always has other devices that contribute noise and consume power. Assuming the amplifier noise is dominated by thermal noise, the NEF of any differential amplifier can be simplified to:

$$NEF = \sqrt{4\gamma \cdot \alpha \cdot \eta \cdot \frac{q/kT}{m \cdot g_m/I_D}} \approx \sqrt{\frac{4\gamma \cdot \alpha \cdot \eta \cdot n}{m}} \tag{3.5}$$

where  $\alpha$  is the noise excess factor defined as the total amplifier noise normalized against the noise from the input transistors (if  $\alpha = 1$ , noise from all other devices is ignored),  $\eta$  is the current excess factor defined as the total amplifier current divided by the current of the input transistor (if  $\eta = 1$ , all bias current goes through the input pair), and m is the current reuse times (m = 1 for a fully differential common-source amplifier of Fig. 3.1(a)). In simplifying (3.5), we also assume input transistors are biased in the subthreshold region where  $\frac{q/kT}{g_m/I_D}$  is equal to the subthreshold slope factor n. As a result, the theoretical lower bound of the NEF for the amplifier of Fig. 3.1(a) is about 2 assuming  $\gamma = 0.7$ ,  $\alpha = \eta = 1$ , and n = 1.4. For a realistic amplifier assuming the input pair consumes 80% of the total current ( $\eta = 1.25$ ) and contributes 80% of the total noise ( $\alpha = 1.25$ ), the practical lower bound of NEF is about 2.5.
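As a numerical check of the NEF definition in (3.4), the following sketch evaluates it for the stack-3 prototype figures quoted in Section 1.3 (5.5 µV<sub>rms</sub> within 10 kHz, 0.25 µW from a 1-V supply). Two assumptions are made here and are not stated in the text: the temperature is taken as 300 K, and the total current is taken as the quoted power divided by the supply voltage.

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K
Q_ELECTRON = 1.602176634e-19    # C

def nef(v_ni_rms, i_tot, bandwidth, temp_k=300.0):
    """Noise efficiency factor following the definition in (3.4)."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON  # thermal voltage kT/q
    return v_ni_rms * math.sqrt(
        2.0 * i_tot / (math.pi * v_t * 4.0 * K_BOLTZMANN * temp_k * bandwidth)
    )

# Assumption: I_tot = P / V_DD = 0.25 uW / 1 V.
i_tot = 0.25e-6 / 1.0
nef_stack3 = nef(5.5e-6, i_tot, 10e3)
print(f"NEF = {nef_stack3:.2f}")
```

Under these assumptions the result is roughly 1.06, close to the reported NEF of 1.07. The small gap is plausibly due to rounding of the quoted noise and power values.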

To improve the amplifier power efficiency and minimize NEF, the key idea is to boost  $g_m$  but without increasing the amplifier current. An effective way is through current reuse. Fig. 3.1(b) shows an inverter-based amplifier that reuses its bias current for both NMOS and PMOS input pairs. Assuming both pairs have the same transconductance, the overall amplifier  $g_m$  is doubled for the same bias current, leading to 2-time reduction in noise power and 1.4-time reduction in NEF. One tradeoff of using an inverter-based amplifier is reduced output signal swing, but this can be alleviated by adding a second-stage amplifier that follows it [32]. Other tradeoffs include: 1) increased requirement on the power supply voltage; 2) increased input capacitance; and 3) reduced input common-mode range. However, for applications that care most about power efficiency, these costs are worth paying.

If we can achieve more times of current reuse (i.e., increasing m), then  $g_m$  can be further boosted and NEF can be further reduced. To calculate the practical limit, we assume  $\alpha = \eta = 1.25$ . 4-time and 6-time current reuse would reduce the practical NEF lower bound to 1.25 and 1, respectively, indicating significant power reduction.
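The bounds quoted above follow directly from the simplified form of (3.5). The sketch below evaluates it for several reuse factors, using the parameter values given in the text ( $\gamma = 0.7$ , n = 1.4,  $\alpha = \eta = 1.25$ ); the function name is illustrative.

```python
import math

def nef_bound(m, gamma=0.7, alpha=1.25, eta=1.25, n=1.4):
    """Thermal-noise-limited NEF from the simplified form of (3.5)."""
    return math.sqrt(4.0 * gamma * alpha * eta * n / m)

# m = 1: common-source, m = 2: single inverter, m = 4 and 6: stacked inverters
for m in (1, 2, 4, 6):
    print(f"m = {m}: NEF >= {nef_bound(m):.2f}")
```

Setting `alpha=1` and `eta=1` with `m=1` gives the theoretical bound of about 2 for the common-source amplifier.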

The direct way of achieving more times of current reuse is to vertically stack inverter-based amplifiers as shown in Fig. 3.2. This way, the bias current is reused 4 times, thus boosting  $g_m$  by 4 times. Nonetheless, directly stacking inverters brings several challenges. First, the required minimum power supply voltage, given by  $4|V_{gs}| + 4|V_{ds}|$, is larger than that of the single inverter-based amplifier of Fig. 3.1(b). Typically, the minimum required  $|V_{ds}|$  for a transistor to have a reasonably large output impedance is 100 mV. For a transistor with  $|V_{th}|$  of 400 mV, even if it is biased in the deep subthreshold region with an overdrive voltage of -100 mV, the corresponding  $|V_{gs}|$  is 300 mV, leading to a minimum power supply voltage of 1.6 V. One way to reduce the supply voltage is to use a native transistor with low  $|V_{th}|$  (e.g., 100 mV). However, this comes with a price. As shown in Fig. 3.2, it is easy to derive that  $|V_{gs1}| + |V_{gs2}| = |V_{ds1}| + |V_{ds2}| \ge 200$  mV. Thus, each  $|V_{gs}|$  is greater than 100 mV. This means the overdrive voltage is greater than 0 mV, leading to a limited current efficiency  $(g_m/I_D)$. Second, there is more than one input node, and thus, we need a method to couple the amplifier input to all input pairs. This method is preferred to be simple, low-noise, and low-cost. Third, there are multiple output nodes, which also calls for a low-cost way to aggregate all small-signal currents. In addition, when addressing these challenges, we cannot sacrifice CMRR, PSRR, or PVT robustness.
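The supply headroom estimate in the first challenge can be written out explicitly. The sketch below is only a restatement of the  $4|V_{gs}| + 4|V_{ds}|$  expression with the example values from the text; the function and parameter names are illustrative.

```python
def min_supply_natural_stack(v_th, v_ov, v_ds_min=0.1, n_gs=4, n_ds=4):
    """Minimum supply (V) of the natural stack: n_gs*|Vgs| + n_ds*|Vds|."""
    v_gs = v_th + v_ov  # gate-source voltage magnitude from threshold and overdrive
    return n_gs * v_gs + n_ds * v_ds_min

# |Vth| = 400 mV, overdrive = -100 mV, minimum |Vds| = 100 mV
vdd_min = min_supply_natural_stack(v_th=0.4, v_ov=-0.1)
print(f"Minimum supply: {vdd_min:.2f} V")
```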

#### 3.3 Proposed Inverter-Stacking Techniques

This chapter presents a novel power efficient amplifier. By vertically stacking N inverters, it achieves 2N-time current reuse for a single-channel input. Unlike [31] and [32], it has only N output current branches to combine, thus, turning the prior exponential dependence into a mild linear dependence. As a result, it reduces the power of the peripheral circuits and boosts the overall amplifier power efficiency. The proposed amplifier fits well in a closed-loop capacitive feedback configuration. The required AC coupling to the multiple amplifier input nodes can be realized by splitting the input and feedback capacitors into multiple paths. As a result, it does not require any additional hardware which would incur extra cost in chip power and area. To minimize the requirement on the power supply voltage, the tail current sources between stacked inverters are eliminated but without sacrificing CMRR and PSRR. A replica circuit ensures that input pairs with tight coupling are robustly biased against process, voltage, and temperature (PVT) variations. Two prototype amplifiers are implemented in 180 nm CMOS process [35], [39]. The stack-2 version achieves a measured noise efficiency factor (NEF) of 1.26 in closed-loop under the supply voltage of 0.9 V. The stack-3 version achieves an NEF of 1.07 under the supply voltage of 1 V. To the best of the authors' knowledge, this NEF is the best among all measured amplifiers so far. The second best NEF achieved by a closed-loop amplifier is 1.64, which translates to over 2.3-time more power consumption compared to the proposed amplifier assuming the same noise performance.

## 3.4 Proposed Inverter Stacking Amplifier: Stack-2 Version

#### 3.4.1 Core Schematic of Proposed Inverter Stacking Amplifier

The core schematic of the proposed fully-differential stack-2 inverter stacking amplifier is shown in Fig. 3.3. There are several changes compared with the natural stacking topology. First, the 4 input pairs are separated. As will be shown later, the input signal is AC coupled to all 4 input nodes. This way, the input pairs can be biased at different voltage levels. Second, the two current source transistors between the stacked inverters are removed. These modifications allow a significant reduction of the minimum required power supply voltage to  $6V_{ov}$. Third, four common-gate transistors  $(M_{5a}/M_{5b}$  and  $M_{6a}/M_{6b})$  are added to aggregate the small-signal currents from all input pairs.

#### 3.4.2 Small-Signal Gain, Input-Referred Noise, Offset, CMRR, and PSRR Analyses

Let us first analyze the small-signal behavior of the proposed amplifier. It is simple to derive that the total amplifier transconductance  $g_{mt}$  is:

$$g_{mt} = \frac{g_m}{I_D} \cdot \left(2I_D + 2I_D \cdot 0.9\right) \approx 4g_m \tag{3.6}$$

where  $I_D$  is the bias current for the input transistor. In deriving (3.6), we assume all input transistors have similar  $g_m/I_D$  and the intrinsic gain  $(g_m r_o)$  for all transistors is much greater than 1. The same assumption is made for all later derivations. For  $M_2/M_3$, their currents are 90% of the amplifier bias current. The remaining 10% current flows through the cascode transistors  $(M_5/M_6)$.

In differential mode (DM) operation, the node  $V_{mid}$  serves as the virtual ground, and thus the amplifier output impedance  $r_{ot}$  is given by:

$$r_{ot} = \left(g_{m5}r_{o5}(r_{o1}//r_{o2})\right) / / \left(g_{m6}r_{o6}(r_{o3}//r_{o4})\right) \tag{3.7}$$

where  $g_{mi}$  is the transconductance of transistor  $M_i$ , and  $r_{oi}$  is the small-signal output resistance of transistor  $M_i$ . Thus, we can derive the amplifier open-loop DM gain  $A_{DM}$ :

$$A_{DM} \equiv g_{mt} r_{ot} \approx g_m^2 r_o^2 \tag{3.8}$$

This shows its DM gain is the square of the transistor intrinsic gain, which is comparable to that of a telescopic or folded cascode amplifier. In fact, the proposed amplifier can be viewed as a hybridization of a telescopic amplifier and a folded cascode amplifier. If we only consider the lower NMOS input pair  $(M_{1a}/M_{1b})$  and assume all other input pairs are connected to DC biases, then its overall structure behaves the same as a telescopic amplifier. By contrast, if we only look at the lower PMOS input pair  $(M_{2a}/M_{2b})$, its input and output relationship is identical to that of a folded cascode amplifier. The same analogy applies to the upper NMOS  $(M_{3a}/M_{3b})$  and upper PMOS  $(M_{4a}/M_{4b})$  pairs.

Assuming the input transistors dominate the overall amplifier noise, the overall input referred thermal noise can be derived as:

$$N_{PSD,th} = \frac{8kT\gamma \cdot (g_{m1} + g_{m2} + g_{m3} + g_{m4})}{(g_{m1} + g_{m2} + g_{m3} + g_{m4})^2} \approx \frac{2kT\gamma}{g_m} \tag{3.9}$$

It is clear that the noise PSD is reduced by 4 times due to  $g_m$  increase.
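The transconductance sum of (3.6) and the thermal noise of (3.9) can be checked numerically. The sketch below compares a single input pair with the four stacked pairs, including the 90% current share of  $M_2/M_3$. The transconductance value is hypothetical and chosen only for illustration, and 300 K is assumed; the ratio does not depend on either choice.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_psd(gms, gamma=0.7, temp_k=300.0):
    """Input-referred thermal noise PSD (V^2/Hz) per (3.9): 8kT*gamma / sum(gm)."""
    return 8.0 * K_BOLTZMANN * temp_k * gamma / sum(gms)

gm = 10e-6  # S, hypothetical transconductance of a pair carrying the full bias current
single_pair = thermal_noise_psd([gm])
# M2/M3 carry 90% of the bias current, so at equal gm/ID their gm scales by 0.9.
stack2_gms = [gm, 0.9 * gm, 0.9 * gm, gm]
stack2 = thermal_noise_psd(stack2_gms)

print(f"g_mt / g_m = {sum(stack2_gms) / gm:.1f}")
print(f"PSD reduction = {single_pair / stack2:.1f}x")
```

With the 10% cascode current accounted for, the improvement is 3.8 times, which the text rounds to 4.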

The input referred 1/f noise PSD of the proposed amplifier can be derived as:

$$N_{PSD,1/f} = \frac{K_f}{C_{ox} \cdot 4WL} \cdot \frac{1}{f} \tag{3.10}$$

where  $K_f$  is a process-dependent parameter,  $C_{ox}$  is the gate oxide capacitance per unit area, and W and L are the transistor width and length, respectively. In the proposed amplifier, the 1/f noise is suppressed by increasing the input transistor size, so that the in-band noise is dominated by the thermal noise.

The overall input referred offset  $V_{os,in}$  can be derived as:

$$V_{os,in} = \frac{\sum_{i=1}^{4} g_{mi} V_{osi}}{g_{m1} + g_{m2} + g_{m3} + g_{m4}} \approx \frac{\sum_{i=1}^{4} V_{osi}}{4} \tag{3.11}$$

Assuming the offset voltages  $V_{osi}$  all have the same distribution with the standard deviation of  $\sigma_{os}$ , then the overall input referred offset standard deviation  $\sigma_{os,in}$  can be derived as:

$$\sigma_{os,in} = \frac{\sigma_{os}}{2} \tag{3.12}$$

This reduction in the input referred offset results from the increased total input transistor size.
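The factor of 2 in (3.12) is the usual result for averaging four uncorrelated quantities. A minimal Monte Carlo sketch is given below; it assumes the four pair offsets are independent zero-mean Gaussian variables with equal  $g_m$, which goes slightly beyond the "same distribution" assumption stated in the text.

```python
import math
import random

def input_referred_offset_sigma(sigma_os, n_pairs=4, trials=50_000, seed=0):
    """Monte Carlo estimate of sigma_os,in for equal-gm pairs, following (3.11)."""
    rng = random.Random(seed)
    # Each trial averages the offsets of n_pairs independent input pairs.
    samples = [
        sum(rng.gauss(0.0, sigma_os) for _ in range(n_pairs)) / n_pairs
        for _ in range(trials)
    ]
    mean = sum(samples) / trials
    variance = sum((x - mean) ** 2 for x in samples) / trials
    return math.sqrt(variance)

sigma_in = input_referred_offset_sigma(sigma_os=1.0)
print(f"sigma_os,in / sigma_os = {sigma_in:.3f}")
```

The estimate converges to  $\sigma_{os}/\sqrt{4} = \sigma_{os}/2$  as the number of trials grows.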

For CMRR calculation, we apply a common-mode (CM) input and derive the DM output in the presence of mismatch, as shown in Fig. 3.4. Based on definition [40], the CMRR can be calculated as:

$$CMRR \equiv \frac{A_{DM}}{A_{CM-DM}} \approx \frac{2g_m r_o}{\frac{\Delta V_{th}}{nkT/q}}$$
(3.13)

where $A_{CM-DM}$ denotes the CM-to-DM gain. To simplify (3.13), we have assumed that all transistors have the same $g_m$ and $r_o$, and that the threshold voltage mismatch $\Delta V_{th}$ between different input pairs has the same distribution. The result of (3.13) is comparable to that of a telescopic amplifier, indicating that stacking inverters without current-source isolation, as in the natural stacking topology, does not degrade CMRR. This result may seem counterintuitive. The node $V_{mid}$ is a low-impedance node with all source connections. At first glance, this would lead to large CM voltage gains for the upper NMOS ($M_{3a}/M_{3b}$) and lower PMOS ($M_{2a}/M_{2b}$) input pairs. However, a careful examination shows that $V_{mid}$ tracks the input CM voltage variation. Thus, from the CM analysis point of view, $V_{mid}$ is effectively AC-shorted to the CM input, creating a large output resistance (looking either up or down from $V_{mid}$) that degenerates both middle input pairs. PSRR can be derived in the same way as CMRR, and the result is given by:

$$PSRR \equiv \frac{A_{DM}}{A_{V_{DD}-DM}} \approx \frac{2g_m r_o}{\frac{\Delta V_{th}}{nkT/q}}$$
(3.14)

where  $A_{V_{DD}-DM}$  denotes the voltage gain from  $V_{DD}$  to the differential output. This result is also comparable to that of a telescopic amplifier.
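To get a feel for the magnitude predicted by (3.13) and (3.14), the following sketch evaluates the expression in dB. All numeric inputs (an intrinsic gain $g_m r_o$ of 100, a threshold mismatch of 1 mV, a subthreshold slope factor $n$ of 1.3, and 300 K) are hypothetical and chosen only for illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def rejection_ratio_db(gm_ro, delta_vth, n=1.3, temp_k=300.0):
    """Evaluate 2*gm*ro / (delta_Vth / (n*kT/q)) from (3.13)/(3.14), in dB."""
    n_thermal_voltage = n * K_B * temp_k / Q_E
    ratio = 2.0 * gm_ro / (delta_vth / n_thermal_voltage)
    return 20.0 * math.log10(ratio)


cmrr_1mv = rejection_ratio_db(gm_ro=100.0, delta_vth=1e-3)
cmrr_2mv = rejection_ratio_db(gm_ro=100.0, delta_vth=2e-3)
print(f"1 mV mismatch: {cmrr_1mv:.1f} dB, 2 mV mismatch: {cmrr_2mv:.1f} dB")
```

Because the mismatch appears in the denominator, doubling $\Delta V_{th}$ costs about 6 dB of rejection in this model.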

#### 3.4.3 Bias Voltage Generation

To minimize the supply voltage requirement, the DC bias voltages for all 4 input pairs in Fig. 3.3 are different. The lower NMOS $(M_{1a}/M_{1b})$ and upper PMOS $(M_{4a}/M_{4b})$ pairs are relatively simple to bias, as a small deviation from the ideal bias voltage has minimal influence on the overall amplifier operation. However, with the removal of the middle current sources $(M_6 \text{ and } M_7)$ in the natural stacking topology, the bias voltages for the lower PMOS $(M_{2a}/M_{2b})$ and upper NMOS $(M_{3a}/M_{3b})$ pairs need to be generated very carefully, because their difference directly sets $(|V_{gs2}| + |V_{gs3}|)$ and hence the bias current. To ensure PVT robustness, a replica-based bias circuit is developed as shown in Fig. 3.5. It ensures that $M_2$ and $M_3$ are biased at the target current level with the right gate voltages. A negative feedback loop also ensures that $V_{mid}$ stays at the intended voltage $V_{ref} = V_{DD}/2$. The replica-based bias branch and CMFB circuit are given in Fig. 3.6. $M_{10}$ is used to bias the PMOS pair $(M_{2a} \text{ and } M_{2b})$ in the bottom inverter, and $M_{11}$ is used to bias the NMOS pair $(M_{3a} \text{ and } M_{3b})$ in the top inverter. $M_5$ and $M_6$ are cascode transistors, which provide low-impedance nodes at the drain side of $M_1/M_2$ and $M_3/M_4$. Thus, the mismatch-induced current difference between the bias transistors $(M_{10} \text{ and } M_{11})$ and the input transistors $(M_2 \text{ and } M_3)$ flows into the common-gate transistors. 10% of the amplifier bias current is allocated to the common-gate (cascode) transistors $(M_{5a}/M_{5b} \text{ and } M_{6a}/M_{6b})$. Resistor dividers are used to generate the voltages that bias the cascode transistors. The bias voltages are copied to the main amplifier using pseudo-resistors, which achieve high resistance with a small chip area [24]. Common-mode feedback is also implemented with a pseudo-resistor-based voltage averager.

#### 3.4.4 Closed-loop Configuration with Split Capacitor Feedback

Fig. 3.7 shows the block diagram of the capacitive-feedback amplifier using the proposed inverter stacking amplifier of Fig. 3.3. The required AC coupling can be realized by splitting the input and feedback capacitors into 4 pieces [29]. Although there are multiple feedback paths, the overall behavior of this amplifier is the same as a classic capacitive-feedback amplifier whose closed-loop gain  $A_{cl}$  is set by the capacitor ratio:

$$A_{cl} \approx \frac{C_S}{C_F} \tag{3.15}$$

A feedback pseudo-resistor  $R_F$  connects the output with the input pair  $M_{1a}/M_{1b}$ . This DC feedback greatly reduces the output referred offset, which would saturate the amplifier if not addressed. Note that although this feedback is formed at only one input pair, it addresses the offsets from all input pairs. Using the model of Fig. 3.8, we can derive the overall output referred offset  $V_{os,out}$  in (3.16), and its standard deviation  $\sigma_{os,out}$  in (3.17):

$$V_{os,out} = \frac{\sum_{i=1}^{4} g_{mi} V_{osi}}{g_{m1}}$$
(3.16)

$$\sigma_{os,out} = 2 \cdot \sigma_{os} \tag{3.17}$$

Comparing with (3.12), (3.17) shows that the closed-loop output referred offset $\sigma_{os,out}$ is only four times the open-loop input referred offset $\sigma_{os,in}$. This shows that the amplifier output is not saturated by the offset. In such a multi-input closed-loop amplifier configuration, a single DC feedback path is sufficient to prevent the output from saturating. An additional benefit of this resistor feedback is that it removes the need to generate a separate DC bias for $M_{1a}/M_{1b}$.
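The intermediate step between (3.16) and (3.17) can be spelled out as follows, under the same simplifications used for (3.12): all four $g_{mi}$ are equal, and the offsets $V_{osi}$ are taken to be independent (the independence is an assumption made here for the sketch).

$$\sigma_{os,out}^2 = \frac{\sum_{i=1}^{4} g_{mi}^2 \, \sigma_{os}^2}{g_{m1}^2} \approx 4\sigma_{os}^2 \quad \Rightarrow \quad \sigma_{os,out} \approx 2\sigma_{os} = 4\sigma_{os,in}$$

The last equality uses $\sigma_{os,in} = \sigma_{os}/2$ from (3.12).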

To analyze the total noise, let us examine the block diagram shown in Fig. 3.9, where  $C_{PS}$ ,  $C_{PF}$ , and  $C_{POTA}$  represent the parasitic capacitance of  $C_S$ ,  $C_F$ , and the OTA input capacitance, respectively. We can derive the overall input referred noise PSD of the closed-loop amplifier:

$$N_{PSD} \approx \frac{2kT\gamma}{g_m} \cdot (1 + \frac{1}{|A_{cl}|})^2 (1 + \frac{C_P}{C_S + C_F})^2$$
(3.18)

where $C_P = C_{PS} + C_{PF} + C_{POTA}$. Comparing (3.18) and (3.9), it is clear that the input referred noise degrades when going from open loop to closed loop, which is a common phenomenon in any closed-loop amplifier [41]. To minimize the degradation, it is preferable to increase the closed-loop gain $A_{cl}$ and minimize the parasitic capacitance $C_P$. For example, for a closed-loop amplifier with a gain of 20 and 20% parasitic capacitance, the PSD is increased by 50%, which increases the NEF by 23%.
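The degradation factor in (3.18) is easy to explore numerically. The sketch below is a minimal illustration in which the helper name is hypothetical and "20% parasitic capacitance" is read as $C_P/(C_S + C_F) = 0.2$; that reading is an assumption, and other definitions of the parasitic ratio give somewhat different numbers. The NEF penalty is taken as the square root of the PSD factor, because NEF scales with the rms noise.

```python
import math


def closed_loop_noise_factor(a_cl, cp_ratio):
    """PSD multiplier of (3.18) relative to the open-loop value in (3.9).

    a_cl: closed-loop gain magnitude.
    cp_ratio: C_P / (C_S + C_F), an assumed definition of the parasitic ratio.
    """
    return (1.0 + 1.0 / abs(a_cl)) ** 2 * (1.0 + cp_ratio) ** 2


psd_factor = closed_loop_noise_factor(a_cl=20.0, cp_ratio=0.2)
nef_factor = math.sqrt(psd_factor)  # NEF is proportional to rms noise
print(f"PSD factor: {psd_factor:.3f}, NEF factor: {nef_factor:.3f}")
```

Sweeping `cp_ratio` in this sketch shows why the parasitic term dominates at a gain of 20: the gain term contributes only about 10% to the PSD.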

## 3.5 Proposed Inverter Stacking Amplifier: Stack-3 Version

The proposed amplifier can be generalized to the stack-3 version shown in Fig. 3.10. Three inverter-based input stages are stacked vertically and share the same bias current. Three common-gate branches are used to aggregate the small-signal current from all 6 input pairs. Therefore, the overall transconductance of this amplifier is $6g_m$. This topology can be further extended to a stack-N version. The general approach of embedding the proposed inverter stacking amplifier inside a capacitive feedback loop is shown in Fig. 3.11. As in Fig. 3.7, the capacitors $C_S$ and $C_F$ are split up and reused as AC coupling capacitors. The open-loop and closed-loop small-signal gain, noise, offset, CMRR, and PSRR can be derived following the same method as explained in Section 3.4. For brevity, we leave them out here.

Compared to the stack-2 version, the merit of the stack-3 version is the increased $g_m$, leading to a better NEF. However, the price is a higher required supply voltage (8|$V_{ds}$|) and a reduced output signal swing. The complexity of the bias and output current summation circuits also increases, leading to increased power consumption in the peripheral circuits. Thus, the benefit of stacking more layers of inverters diminishes as the number of layers increases. For practical applications, the optimum stacking number is likely to be either 2 or 3.
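The trade-off can be tabulated with a small sketch. It assumes that a stack-N amplifier has $2N$ input devices plus two tail current sources in series, so that the minimum supply is $(2N+2)|V_{ds}|$; this pattern is extrapolated from the $8|V_{ds}|$ figure quoted for stack-3 and is not stated in general form in the text. The per-device $|V_{ds}|$ of 125 mV is likewise hypothetical.

```python
def stack_tradeoff(n_layers, vds_min=0.125):
    """Return (gm multiplier, stacked devices, minimum supply in V) for stack-N.

    Assumes 2*N input devices plus two tail current sources in the stack,
    each needing vds_min volts of headroom (both are illustrative assumptions).
    """
    gm_multiplier = 2 * n_layers
    stacked_devices = 2 * n_layers + 2
    return gm_multiplier, stacked_devices, stacked_devices * vds_min


for n in (1, 2, 3, 4):
    gm_mult, devices, vdd_min = stack_tradeoff(n)
    print(f"stack-{n}: {gm_mult} gm, {devices} |Vds| = {vdd_min:.3f} V")
```

In this model each extra layer adds the same headroom cost while the relative gain in $g_m$ shrinks, which is consistent with the diminishing benefit described above.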

## 3.6 Circuit Implementation

Both the proposed stack-2 and stack-3 amplifiers are implemented in a 180 nm CMOS process. The intended application is action potential recording, with a signal bandwidth from 250 Hz to 10 kHz and a signal amplitude of up to 1 mV [30]. The dimensions of the transistors in the amplifier cores are summarized in Table 3.1 and Table 3.2. Large transistor widths and lengths are chosen to boost the transistor intrinsic gain, reduce the offset, and push the 1/f noise corner below 250 Hz. All transistors operate in the subthreshold region to boost the current efficiency $g_m/I_D$. The current of the amplifier core is 220 nA. The common-gate transistors used for current summation are biased at 20 nA. The output common-mode voltage is set to $V_{DD}/2$ using the classic resistor-averaging common-mode feedback circuit.

| Device          | $W/L(\mu m)$ | Device          | $W/L(\mu m)$ |
|-----------------|--------------|-----------------|--------------|
| $M_{1a}/M_{1b}$ | 23/5         | $M_{5a}/M_{5b}$ | 1.9/4        |
| $M_{2a}/M_{2b}$ | 11/4         | $M_{6a}/M_{6b}$ | 0.6/4        |
| $M_{3a}/M_{3b}$ | 40/4         | $M_7$           | 14/4         |
| $M_{4a}/M_{4b}$ | 10/4         | $M_8$           | 10/4         |

Table 3.1: Devices Geometry of Stack-2 Amplifier.

| Device          | $W/L(\mu m)$ | Device           | $W/L(\mu m)$ |
|-----------------|--------------|------------------|--------------|
| $M_{1a}/M_{1b}$ | 12/5         | $M_{6a}/M_{6b}$  | 10/4         |
| $M_{2a}/M_{2b}$ | 12/4         | $M_{7a}/M_{7b}$  | 6/0.8        |
| $M_{3a}/M_{3b}$ | 36/4         | $M_{8a}/M_{8b}$  | 1/4          |
| $M_{4a}/M_{4b}$ | 11/5         | $M_{9a}/M_{9b}$  | 2/4          |
| $M_{5a}/M_{5b}$ | 50/4         | $M_{10a}/M_{10b}$ | 1/4         |

Table 3.2: Devices Geometry of Stack-3 Amplifier.

The open-loop gain of the amplifier is designed to be 76 dB. $C_S$ and $C_F$ are chosen to be 8 pF and 400 fF, respectively, leading to a nominal closed-loop gain of 26 dB. The SPICE simulated closed-loop NEFs for the stack-2 and stack-3 versions are 1.26 and 1.07, respectively. Fig. 3.12 shows the simulated NEF across different process corners. This consistent result is enabled by the robust replica-based bias circuit of Fig. 3.5.
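The nominal closed-loop gain follows directly from (3.15) and the chosen capacitor values. The short check below uses only the 8 pF and 400 fF values quoted in the text; the variable names are illustrative.

```python
import math

c_s = 8e-12     # input capacitor C_S, F (from the text)
c_f = 400e-15   # feedback capacitor C_F, F (from the text)

a_cl = c_s / c_f                      # closed-loop gain per (3.15), about 20
a_cl_db = 20.0 * math.log10(a_cl)     # same gain expressed in dB
print(f"A_cl = {a_cl:.1f} V/V = {a_cl_db:.1f} dB")
```

The ideal capacitor ratio gives about 26.0 dB; a practical amplifier is expected to land slightly below this because of finite loop gain and parasitic capacitance.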

The capacitors  $C_S$  and  $C_F$  are implemented using MoM capacitors. As shown in Fig. 3.9 and analyzed in (3.18), their parasitic capacitances facing the amplifier virtual ground node degrade the noise performance. To minimize this parasitic capacitance, a poly-silicon layer is inserted below the MoM capacitor. It connects with the capacitor side that faces away from the virtual ground. Although the parasitic capacitance from this plate to the substrate increases, it significantly reduces the parasitic capacitance of the other virtual-ground connecting plate by isolating it from the substrate. Parasitic extraction results show that this layout technique reduces the parasitic capacitance by 50% and improves the NEF by 17%.

## 3.7 Measurement Results

The die photos of the prototype amplifiers are shown in Fig. 3.13. The amplifier core areas for the stack-2 and stack-3 versions are 0.01 and 0.02 $\text{mm}^2$, respectively. The total closed-loop amplifier areas for the stack-2 and stack-3 versions are 0.22 and 0.29 $\text{mm}^2$, respectively, dominated mainly by the capacitors.

The supply voltages for the stack-2 and stack-3 versions are 0.9 and 1.0 V, respectively. Their measured power consumptions are 226 and 246 nW, respectively. Their measured frequency responses are shown in Fig. 3.14. The flat-band gains are 25.4 and 25.6 dB, respectively, over the frequency range of 4 Hz to 10 kHz. Fig. 3.15 plots the measured input referred noise. The 1/f noise corner is about 300 Hz. The total integrated rms input referred noise over the signal bandwidth (250 Hz to 10 kHz) is 6.7 $\mu$V and 5.6 $\mu$V, respectively. These results translate to NEFs of 1.26 and 1.07 for the stack-2 and stack-3 versions. The measured closed-loop CMRR and PSRR are 82 dB and 81 dB for the stack-2 version, and 84 dB and 76 dB for the stack-3 version.
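For reference, the NEF can be estimated from the measured numbers with the standard definition $NEF = V_{rms,in}\sqrt{2 I_{tot} / (\pi \cdot U_T \cdot 4kT \cdot BW)}$. The sketch below is only an approximation: it takes the total current as measured power divided by supply voltage, uses a 10 kHz bandwidth, and assumes 300 K, none of which is specified in this form in the text. Under these assumptions it lands within a few percent of the reported values.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def nef(v_rms, i_total, bandwidth_hz, temp_k=300.0):
    """Noise efficiency factor from rms input noise (V), total current (A), BW (Hz)."""
    u_t = K_B * temp_k / Q_E  # thermal voltage kT/q
    return v_rms * math.sqrt(
        2.0 * i_total / (math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz)
    )


# Measured values quoted in the text; current is inferred as power / supply.
nef_stack2 = nef(v_rms=6.7e-6, i_total=226e-9 / 0.9, bandwidth_hz=10e3)
nef_stack3 = nef(v_rms=5.6e-6, i_total=246e-9 / 1.0, bandwidth_hz=10e3)
print(f"stack-2 NEF ~ {nef_stack2:.2f}, stack-3 NEF ~ {nef_stack3:.2f}")
```

The exact figure depends on the temperature and on the bandwidth convention used for the integration, so small differences from the reported NEF are expected.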

## 3.8 Comparison to Other LNA Works

Fig. 3.16 shows the measured NEF over the temperature range from 0 to 60°C for both the stack-2 and stack-3 prototypes. The NEF variations are within 15%. Figs. 3.17 and 3.18 show the measured NEF at different supply voltages for the stack-2 and stack-3 versions, respectively. Both achieve consistent NEF results. For the stack-2 version, the measured NEF remains at 1.26 for supply voltages above 0.9 V, while for the stack-3 version, the NEF remains at 1.07 for supply voltages above 1 V.

Table 3.3 summarizes the performance of the prototype amplifiers and compares them with recent closed-loop amplifier works with comparable specifications. To emphasize the power efficiency of the proposed amplifier, Fig. 3.19 plots the reported measured NEF results of recent amplifiers. The dotted lines indicate contours of constant NEF. It can be seen that the proposed work establishes a new trade-off between noise and power, and pushes the NEF boundary to a new level.

## 3.9 Summary

This chapter presented a novel power-efficient inverter-stacking amplifier. It achieves 6-time current reuse under a 1 V supply and, to the best of the authors' knowledge, obtains the best NEF among all reported amplifiers. By splitting the feedback capacitors, the required input AC coupling is realized without extra hardware cost. A simple replica-based biasing circuit is devised that ensures robust operation across PVT variations. The amplifier is well suited for use as the front-end amplifier in applications with stringent power or energy requirements, such as biomedical implants and wireless sensors.



Figure 3.2: Natural inverter stacking topology.



Figure 3.3: Stack-2 inverter stacking amplifier schematic.



Figure 3.4: Common-mode circuit analysis.



Figure 3.5: Replica based bias circuit.



Figure 3.6: Bias branch and the CMFB circuit.



Figure 3.7: Block diagram of the closed-loop amplifier with split capacitor feedback.



Figure 3.8: Offset averaging model.



Figure 3.9: Block diagram of the proposed closed-loop amplifier with parasitic capacitors.





Figure 3.11: Block diagram of stack-N inverter stacking amplifier.



Figure 3.12: Simulated NEF across corners.





Figure 3.13: (a) Die photo of stack-2 and (b) stack-3.



Figure 3.14: Measured AC transfer of amplifier.



Figure 3.15: Measured amplifier input referred noise PSD.

| Design              | Technology (nm) | Gain (dB) | CMRR (dB) | PSRR (dB) | VDD (V) | Power ($\mu$W) | BW (Hz) | IRN ($\mu V_{rms}$) | NEF  |
|---------------------|-----------------|-----------|-----------|-----------|---------|----------------|---------|---------------------|------|
| This work (Stack-2) | 180             | 25.4      | 82        | 81        | 0.9     | 0.23           | 10k     | 6.7                 | 1.26 |
| This work (Stack-3) | 180             | 25.6      | 84        | 76        | 1       | 0.25           | 10k     | 5.5                 | 1.07 |
| [3]                 | 130             | 40        | 75        | 80        | -       | 12.1           | 10k     | 2.2                 | 2.9  |
| [13]                | 180             | 33        | >70       | >70       | 0.6     | 1.17           | 182     | 3.7                 | 1.74 |
| [14]                | 180             | 57.8      | 85        | 80        | 0.2/0.8 | 0.79           | 670     | 0.34                | 2.1  |
| [15]                | 130             | 40        | 78        | 80        | 1.5     | 3.9            | 19.9k   | 0.94                | 1.67 |
| [21]                | 65              | 71        | 95        | 80        | 0.5     | [illegible]    | 10k     | [illegible]         | 6    |
| [27]                | 130             | 68        | 60        | 70        | 1.2     | 49             | 10k     | 6.36                | 3.8  |
| [28]                | 500             | 41        | 66        | 75        | 2.8     | 7.6            | 5.3k    | 3.06                | 2.67 |

Table 3.3: Performance Summary and Comparison with State-of-the-art LNAs



Figure 3.16: Measured NEF versus temperature.



Figure 3.17: Measured NEF of the stack-2 amplifier versus power supply.



Figure 3.18: Measured NEF of the stack-3 amplifier versus power supply.



Figure 3.19: Closed-loop amplifier NEF survey.

# Chapter 4

# Inverter Stacking Technique: Second Prototypes

## 4.1 Introduction

<sup>1</sup> Autonomous wireless sensor networks have been a prevailing research topic during the past few years. The large demand for low-voltage, high-power-efficiency portable electronic devices, such as biomedical sensor readouts and industrial monitoring devices, provides the impetus for more research into better system architectures and more power-efficient building blocks.

<sup>1</sup> This work was first presented at the VLSI Symposium 2019 [70]. This work was done by the first author, Linxiao Shen, and all the technical discussions with the co-authors are highly appreciated.

Conventionally, low-noise instrumentation amplifiers (LNIAs) consume large power to meet stringent noise requirements, and, unlike the power of digital circuits, this power does not scale down in advanced technology nodes. By contrast, the other two major power-hungry blocks, the digital processing cores and the RF transmission blocks, can be power-scaled with design techniques. The former can be duty-cycled or woken up only when necessary, while the latter can use edge computing to transmit only the processed data instead of the raw data, reducing the data rate and thus saving a great amount of power. However, in all these scenarios, the LNIAs still need to be always on, and they are increasingly becoming the performance bottleneck, especially in low-power applications. For any LNIA, fundamental trade-offs exist among power, noise, and bandwidth. Since the bandwidth naturally extends to tens of kHz, which covers most biomedical and audio signals, designers concentrate on minimizing the power consumption over the signal bandwidth of interest.

Figure 4.1: Closed-loop amplifier PEF survey.

There have been many excellent research works aimed at mitigating this trade-off. The core idea is to boost the input transconductance without increasing the power consumption. Current reuse is one of the most promising solutions. The work in [31] realized multi-time current reuse by utilizing the orthogonality of the currents in different stacking layers. This technique, however, requires exponentially growing hardware complexity as the number of current reuses increases. In addition, it can be used only in multi-channel applications. The work in [32] realized multi-time current reuse with frequency division. By using different chopping frequencies, the signals are first up-modulated to different frequency bands and then amplified with the prior orthogonal current-reuse ladder. Nevertheless, the technique still relies on expensive, exponentially growing peripheral circuitry. The inverter-stacking technique of [39], by contrast, streamlined the peripheral circuitry and thus exhibits a linear dependence on the number of current reuses (versus the exponential dependence of orthogonal current reuse), greatly enhancing power efficiency. It obtained the previously best measured noise-power trade-off, but it sacrifices the output swing. Recently, the squeezed-inverter-based amplifier of [42] presented a technique for operating an inverter under a 0.2-V supply voltage. That work opens a new angle on achieving better power efficiency through low-voltage operation. However, robust multi-supply operation requires additional power management circuitry, which increases the circuit complexity and power consumption. As shown in Fig. 4.1, the PEF frontier of these techniques is bounded at 1, which is the power efficiency of a single bipolar OTA. Still higher power efficiency is always desired.

This work presents a low-noise chopper instrumentation amplifier (LN-CIA) that achieves a power efficiency factor (PEF) of 0.96 with a tail-less inverter-stacking input stage, followed by a low-common-mode(CM)-gain second stage and a linear class-AB output stage. The majority of the current is allocated to the first stage to suppress the input referred noise. Four-time current reuse is achieved by vertically stacking two inverter-based input amplifiers, significantly reducing the required power consumption. To compensate for the common-mode rejection ratio (CMRR) degradation caused by the tail-less structure, CM pre-filtering and the inverter stacking technique are employed to suppress the CM-DM conversion from the input pairs: the CM pre-filtering loop improves the CMRR by reducing the amplitude of the CM interference at the two outside pairs, while the two middle pairs are source-degenerated by a high impedance owing to the nature of the inverter stacking structure. Globally, chopping is employed to up-modulate the CM-to-differential-mode (DM) converted interference. Moreover, the closed-loop configuration improves the CMRR through its high loop gain, which is enhanced by the cascaded three-stage topology. The class-AB output stage is designed to have high output swing and good drivability while keeping the static current consumption low. A dominant pole naturally exists at the output of the last stage, stabilizing the whole three-stage closed-loop amplifier. All three stages are connected via AC coupling to block any DC offset. Chopping ripple is suppressed by the filtering effect of the passive AC coupling capacitor before the de-chopper.

A prototype amplifier equipped with the proposed architecture was fabricated in a 180-nm process. The measured integrated input referred noise within an 8-kHz bandwidth (BW) is 1.38 $\mu V_{rms}$, with a total power consumption of 2.7 $\mu W$, leading to a PEF of 0.96. The peak CMRR and PSRR are measured to be 84 dB and 78.8 dB, respectively, which validates the performance-enhancing techniques.
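The quoted PEF can be cross-checked from the measured noise, power, and bandwidth using the standard definitions $NEF = V_{rms,in}\sqrt{2 I_{tot} / (\pi \cdot U_T \cdot 4kT \cdot BW)}$ and $PEF = NEF^2 \cdot V_{DD}$, which combine into an expression that depends only on the total power. The sketch below assumes a temperature of 300 K, which the text does not state; the function name is illustrative.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def pef(v_rms, power_w, bandwidth_hz, temp_k=300.0):
    """Power efficiency factor, NEF^2 * VDD, written in terms of total power.

    Since I_tot * VDD = power, PEF = v_rms^2 * 2 * P / (pi * U_T * 4kT * BW).
    """
    u_t = K_B * temp_k / Q_E  # thermal voltage kT/q
    return v_rms ** 2 * 2.0 * power_w / (
        math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz
    )


# Measured values quoted in the text: 1.38 uVrms, 2.7 uW, 8 kHz.
pef_measured = pef(v_rms=1.38e-6, power_w=2.7e-6, bandwidth_hz=8e3)
print(f"PEF ~ {pef_measured:.2f}")
```

Under the 300 K assumption the estimate agrees with the reported 0.96 to within rounding.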

Section 4.2 presents the structure of the proposed tail-less inverter stacking input stage and its trade-offs. Section 4.3 describes the global dominant-pole compensated three-stage architecture. In Section 4.4, some design details, including the CMRR enhancement and ripple reduction techniques, are discussed. Section 4.5 presents the measured results, and Section 4.7 concludes this chapter.

## 4.2 Proposed Chopping Tail-less Inverter Stacking Input Stage

#### 4.2.1 Concept

The inverter-stacking topology for boosting current efficiency was first introduced in [39]. By reusing the bias current through all the vertically stacked input differential pairs, the input transconductance is significantly boosted without much power overhead. However, the heavily stacked structure limits the minimum allowable supply voltage, making it difficult to operate at a low supply voltage. For four-time current reuse, a supply voltage of at least $6V_{ov}$ is required, which translates to 0.9 V with a reasonable overdrive voltage requirement.

To address this issue, we adopt the tail-less inverter-stacking topology conceptually depicted in Fig. 4.2, which circumvents the aforementioned limitations by realizing the inverter-stacking structure in pseudo-differential mode.



Figure 4.2: Tail-less inverter-stacking input stage signal path.

For simplicity, Fig. 4.2 shows the signal path of the single-ended half-circuit of the pseudo-differential inverter-stacking input stage. The input pair is split into four vertically stacked smaller pieces, each receiving the same input signal through an AC coupling capacitor $C_s$, which achieves four-time current reuse. Instead of using common-gate transistors to sum the signal currents, the outputs are also summed capacitively. With the outputs summed passively, the current efficiency is boosted. In the absence of the tail current sources, only N transistors are stacked for N-time current reuse, greatly improving the supply voltage usage. Comparison with prior works shows the advantage of the proposed tail-less inverter stacking input stage. In [42], squeezing the supply voltage of the first stage to 0.2 V requires many additional peripheral circuits, including a DC-DC converter to generate the separate supply and a negative voltage generator to support the complicated bias method. These requirements counteract the benefit of lowering the supply voltage. In contrast, with the tail-less inverter stacking input stage, the whole amplifier operates under a single global supply voltage, and the peripheral biasing circuits are streamlined. Thus, it can significantly reduce the circuit complexity and improve the overall power efficiency factor.

#### 4.2.2 Biasing network

An important function of the tail current source is to set the bias current. Without it, alternative methods are needed to provide a robust current bias. As shown in Fig. 4.3, the input differential pairs are categorized into two groups: the two middle pairs, whose gate voltages are strongly coupled, and the two outside pairs ($M_1$ and $M_4$). To minimize the supply-voltage requirement, the four input pairs are biased separately. The bias voltage for the two middle pairs needs special consideration, since it consists of $V_{gs2}$ and $V_{gs3}$ and directly sets the DC bias current. To ensure PVT robustness, a replica-based bias [39] is employed, as shown in Fig. 4.3(a). Two back-to-back current mirrors are used to set the desired bias current through the two middle differential pairs. The bias current in the main branch can be accurately set simply by the current mirror ratio. In this way, the PVT variation in the threshold voltage is well tracked through the replica. In addition, the negative feedback loop regulates the voltage of the internal node B to be around $V_{CM} = V_{DD}/2$.

It is also critical to balance the currents from the top PMOS $(M_4)$ and bottom NMOS $(M_1)$ against the current of the middle pairs $(M_2/M_3)$. The input stage can be viewed as two vertically stacked inverter-based amplifiers that share the same bias current. Both amplifiers need to be biased so that the PMOS current tracks the NMOS current, and thus the output voltage can be set at the desired level. Negative feedback loops (CM pre-filtering loops) are therefore implemented to regulate the output voltage level, as shown in Fig. 4.3(b) and Fig. 4.3(c). The output voltage is compared with the reference voltage, and the low-frequency part of the amplified voltage difference is fed back to adjust the DC bias voltage of the outside differential pairs $(M_1/M_4)$. In this way, the DC current in $M_1$ and $M_4$ tracks the current set by the middle pairs, and no further common-mode feedback (CMFB) circuitry is needed.

#### 4.2.3 Signal gain, input referred noise

The total amplifier transconductance  $g_{mt}$  can be derived as:

$$g_{mt} = 2g_{mn} + 2g_{mp} \approx 4g_m \tag{4.1}$$

In deriving (4.1), we assume that all input pairs have similar $g_m$, and that $g_m \cdot r_o \gg 1$. Unless otherwise noted, the same assumptions are made in all later derivations. In differential-mode (DM) operation, the internal node B serves as the virtual ground. Thus, the top and bottom inverter-based amplifiers can be decoupled and analyzed individually. The top amplifier can be simplified as shown in Fig. 4.4, where the NMOS $(M_N)$ is biased through the replica branch and the PMOS $(M_P)$ is biased through the feedback loop. The bottom amplifier is the same, but with the NMOS and PMOS roles mirrored. The signal gain can be derived by superposition, and the signal paths from the NMOS and PMOS are shown in Fig. 4.4(a) and (b), respectively. The transfer function from $M_N$ to its output can be derived as:

$$\frac{v_o}{v_{in}} = -\frac{1 + sC_sR_f}{(1 + sC_sR_f) \cdot \frac{1}{(r_{on}//r_{op}) \cdot g_{mn}} + \frac{g_{mp}}{g_{mn}} \cdot A_f}$$
(4.2)

The transfer function from  $M_P$  to its output can be derived as:

$$\frac{v_o}{v_{ip}} = -\frac{sC_sR_f \cdot g_{mp} \cdot (r_{on}//r_{op})}{1 + A_f \cdot g_{mp} \cdot (r_{on}//r_{op}) + sC_sR_f}$$
(4.3)

The transfer functions show a high-pass property, mainly due to the CM pre-filtering loops. With the pseudo-differential structure, the low-frequency portion of both the DM signal and the CM interference is filtered out. This property is utilized to enhance the CMRR when combined with the chopping technique, which differentiates the CM and DM signal paths. The details are given in Section 4.4.
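The high-pass behavior can be visualized with a minimal numerical sketch. The NMOS-path response is written here in the equivalent form $-g_{mn}R_o(1+sC_sR_f)/(1+sC_sR_f+A_f g_{mp}R_o)$ with $R_o = r_{on}//r_{op}$, and the PMOS-path response follows (4.3). All component values below (5 µS devices, $R_o$ of 10 MΩ, $A_f$ of 10, 1 pF coupling capacitor, 1 TΩ pseudo-resistor) are hypothetical and chosen only to make the shape visible.

```python
import math


def h_nmos(freq_hz, gmn, gmp, r_o, a_f, c_s, r_f):
    """NMOS-path gain v_o/v_in, with the feedback loop driving the PMOS gate."""
    s_tau = 2j * math.pi * freq_hz * c_s * r_f
    return -gmn * r_o * (1 + s_tau) / (1 + s_tau + a_f * gmp * r_o)


def h_pmos(freq_hz, gmp, r_o, a_f, c_s, r_f):
    """PMOS-path gain v_o/v_ip, following the form of (4.3)."""
    s_tau = 2j * math.pi * freq_hz * c_s * r_f
    return -s_tau * gmp * r_o / (1 + a_f * gmp * r_o + s_tau)


# Hypothetical values for illustration only.
params = dict(gmp=5e-6, r_o=10e6, a_f=10.0, c_s=1e-12, r_f=1e12)
low_n = abs(h_nmos(0.01, gmn=5e-6, **params))
high_n = abs(h_nmos(100e3, gmn=5e-6, **params))
low_p = abs(h_pmos(0.01, **params))
high_p = abs(h_pmos(100e3, **params))
print(f"NMOS path: {low_n:.3f} at 10 mHz, {high_n:.1f} at 100 kHz")
print(f"PMOS path: {low_p:.4f} at 10 mHz, {high_p:.1f} at 100 kHz")
```

With these values both paths approach $g_m R_o$ well above the corner, while the low-frequency gain is suppressed by roughly the loop gain $A_f g_{mp} R_o$.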

By combining all the signal gain from the four paths, the overall passband gain can be written as

$$\frac{v_o}{v_i} = -(g_{mn} + g_{mp}) \cdot (r_{on}//r_{op})/2 \approx -g_m \cdot r_o/2$$
(4.4)

This shows that its DM gain is on the order of a single transistor's intrinsic gain, while it requires the same supply voltage as a cascode topology.
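The approximation in (4.4) follows in one step from the assumptions stated for (4.1). Taking $g_{mn} \approx g_{mp} \approx g_m$ and, as an additional simplifying assumption made here, $r_{on} \approx r_{op} \approx r_o$, the gain magnitude becomes:

$$\left|\frac{v_o}{v_i}\right| = \frac{(g_{mn} + g_{mp}) \cdot (r_{on}//r_{op})}{2} \approx \frac{2g_m \cdot \frac{r_o}{2}}{2} = \frac{g_m r_o}{2}$$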

The input referred thermal noise from the input pairs can be derived as

$$N_{PSD,th} = \frac{8kT\gamma \cdot (g_{m1} + g_{m2} + g_{m3} + g_{m4})}{(g_{m1} + g_{m2} + g_{m3} + g_{m4})^2} \approx \frac{1}{4} \cdot \frac{8kT\gamma}{g_m}$$
(4.5)

The overall thermal noise is thus reduced to one quarter of that of a single input pair.

#### 4.2.4 Trade-off discussion

There are several potential limitations of the proposed tail-less inverter-stacking input stage. First, as it stacks four transistors, there is no more room for cascode devices in low-voltage applications. As analyzed above, the input stage gain is only about the transistor intrinsic gain. With the scaling down of device sizes and supply voltages, single-stage cascode or telescopic amplifiers are no longer suitable. A low-power, frequency-compensated multi-stage amplifier is a necessity. Second, the pseudo-differential topology of $M_1$ and $M_4$ does not provide any CM source degeneration, and therefore the input stage performs poorly in terms of CM-DM conversion if it directly faces the CM interference. Additional CMRR enhancement techniques are required.

## 4.3 Three-stage dominant-pole compensated amplifier

Although the inverter-stacking topology achieves high power efficiency, it worsens the trade-off between supply voltage and gain. The vertically stacked inverters occupy too much headroom, which is incompatible with a cascode structure in low-supply designs. Meanwhile, boosting the gain through cascoding also conflicts with the output swing specification of the output stage. To address this dilemma, we resort to a cascaded topology, in which a three-stage OTA is designed. With appropriate gain and bias distribution among the amplifier stages, each stage can be individually optimized, and consequently the power-noise trade-off is mitigated. By cascading three stages, the overall gain can also be boosted.

The overall block diagram of the proposed chopper amplifier is shown in Fig. 4.5. Globally, chopping is employed to reduce the offset and flicker noise in the first two stages. The offset and flicker noise from the last stage can be greatly attenuated when input referred.

As mentioned earlier, the core of the first stage  $(g_{m1})$  employs the tailless inverter stacking topology. Replica-based bias branch and CM pre-filtering loop  $(g_{mf})$  provide PVT robust biasing. Its outputs are capacitively summed up and AC-coupled to the second stage  $(g_{m2})$ .

The schematic of second stage and third stage are shown in Fig. 4.6 and Fig. 4.7, respectively. Positive feedback-based negative load is applied to the second stage to boost its signal gain. The positive load is intentionally made weaker than the negative load. Another benefit of the structure is to save additional CM DC bias branch, due to its CM low impedance property.

A class-AB output stage $(g_{m3})$ is used to ensure a large output swing. AC coupling is used to connect the two stages. The NMOS pair is current-mode biased to set the desired DC bias current. The CMFB is applied to bias the PMOS pair, so that the bias current in the PMOS tracks that in the NMOS. The schematic of the OTA in the CMFB is shown in Fig. 4.7(b), where the differential output is averaged by a pair of pseudo-resistors and capacitors.

Stability is another concern when the three-stage amplifier is connected in a closed-loop configuration. Dominant-pole compensation fits this low-noise application well because it exploits the power allocation among the stages. To suppress the thermal noise, the first stage consumes most of the power and thus exhibits a relatively low output impedance, leading to a high-frequency non-dominant pole. In contrast, the class-AB output stage can be designed for low power, so that its high output impedance stabilizes the overall closed-loop configuration. Meanwhile, the output swing is greatly extended while the high power efficiency of the inverter-stacking topology is retained.

# 4.4 Design details and discussions

#### 4.4.1 CMRR enhancement

Fig. 4.8 shows the block diagram of the single-ended half-circuit of the differential three-stage OTA. The input chopper up-modulates the signal while leaving the CM interference in its original frequency band. Because of the capacitive network, the signal and the fed-back output are both attenuated, and the corresponding transfer functions can be written as:

$$FF = \frac{C_S}{C_S + C_{gg1} + C_F + C_P} \tag{4.6}$$

$$FB = \frac{C_F}{C_S + C_{gg1} + C_F + C_P} \tag{4.7}$$

where $C_P$ is the sum of the parasitic capacitances of $C_S$ and $C_F$. After the DM virtual-ground node, the DM and CM paths diverge, and the gain is separated into a CM-DM gain ($A_{c-d}$) and a DM-DM gain ($A_d$). For simplicity, both are modelled as constant across the frequency band of interest. The CM loop gain is taken to be zero because of the fully differential second stage with its positive-feedback load. Therefore, only DM signal components are fed back.

The CM interference is converted into DM by $A_{c-d}$ and then passes through the feedback network to the DM virtual-ground node. Owing to the negative feedback, the CM-to-DM converted interference is partially cancelled and the CMRR improves. The overall closed-loop CM-DM conversion gain $A_{c-d,cl}$ can be derived as:

$$A_{c-d,cl} = \frac{A_{c-d}}{A_d \cdot \frac{C_F}{C_S}} \tag{4.8}$$
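Eq. (4.8) follows from a single loop equation. The sketch below assumes that the CM interference $V_{cm}$ reaches the virtual-ground node through the divider $FF$, that the differential output $V_{od}$ returns through $FB$ with negative polarity, and that $A_d \cdot FB \gg 1$; the symbols $V_{cm}$ and $V_{od}$ are introduced here only for illustration:

$$V_{od} = A_{c-d} \cdot FF \cdot V_{cm} - A_d \cdot FB \cdot V_{od} \;\Rightarrow\; A_{c-d,cl} = \frac{V_{od}}{V_{cm}} = \frac{A_{c-d} \cdot FF}{1 + A_d \cdot FB} \approx \frac{A_{c-d}}{A_d} \cdot \frac{FF}{FB} = \frac{A_{c-d}}{A_d \cdot \frac{C_F}{C_S}}$$

The common denominator of $FF$ and $FB$ in Eqs. (4.6) and (4.7) cancels in the ratio, which is why only $C_F/C_S$ remains.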

It should be noted that the closed-loop structure alone is not sufficient to achieve a reasonable CMRR with the tail-less inverter-stacking input stage. To improve the CMRR, the open-loop CM-DM conversion gain $A_{c-d}$ must also be reduced in the first place. It can be written as:

$$A_{c-d} = (A_{c-d,1} \times A_{d,2} + A_{c,1} \times A_{c-d,2}) \times A_{d,3} \tag{4.9}$$

To compensate for the degradation caused by the tail-less inverter-stacking structure, two techniques are used, each improving the CM-DM conversion that occurs in a different set of input differential pairs.

On the one hand, the two outer differential pairs $(M_1 \text{ and } M_4)$ clearly lose their source degeneration in the tail-less structure. The block diagram of these two pairs and the CM pre-filtering bias loop is shown in Fig. 4.9(a). The input transconductance and the feedback OTA are denoted $g_{m1}$ and $A_f$, respectively. The input is capacitively coupled to the transistor gate, and its DC bias level is set by the feedback. With chopping, the CM and DM paths are decoupled: only the CM of $V_{in}$ is attenuated by the high-pass loop, while the up-modulated DM signal lies in its pass-band. The corner frequency should therefore be neither so low that the CM filtering is weakened nor so high that the up-modulated signal is attenuated. With a conventional open-loop passive RC network, the high-pass corner is set as

$$f_o = \frac{1}{2\pi \cdot C_S \cdot R_F} \tag{4.10}$$

However, to achieve reasonable attenuation of the 50/60-Hz interference, a corner frequency of about 100 Hz is desired. This translates to an RC time constant of $\tau = 1/(2\pi f_o) \approx 1.59$ ms, which is too large to implement economically on chip with passive components. By contrast, the CM pre-filtering loop extends the corner frequency. The larger time constant that can then be tolerated makes it possible to use a pseudo-resistor, which is an economical and feasible way to realize a large on-chip resistance.
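To make the cost of the passive approach concrete, the following minimal sketch evaluates Eq. (4.10) for a 100-Hz corner. It assumes that the 4-pF input capacitor reported for the prototype plays the role of $C_S$; the function name `highpass_rc` is hypothetical.

```python
import math

def highpass_rc(f_corner_hz, c_farad):
    """Return (tau, R) that place the passive high-pass corner of Eq. (4.10) at f_corner_hz."""
    tau = 1.0 / (2.0 * math.pi * f_corner_hz)  # required RC time constant, in seconds
    return tau, tau / c_farad                  # resistance needed for the given capacitor

# Assumption: C_S = 4 pF (the input capacitor value reported for the prototype).
tau, r_f = highpass_rc(100.0, 4e-12)
print(f"tau = {tau * 1e3:.2f} ms, R_F = {r_f / 1e6:.0f} Mohm")
```

Under these assumptions the required resistance is roughly 400 MΩ, which illustrates why a linear on-chip resistor is impractical for this corner.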

On the other hand, the two middle differential pairs $(M_2 \text{ and } M_3)$ still have source degeneration, although this may seem counter-intuitive at first. The schematic of the CM half-circuit of the input stage is shown in Fig. 4.8(a). The inputs to $M_1$ and $M_4$ are grounded. The CM interference is capacitively coupled to the gates of $M_2$ and $M_3$. At node B, the current flowing out of $M_3$ must equal the current flowing into $M_2$, so the source-node voltage tracks the CM interference at the gates and keeps both $V_{gs}$ unchanged. This property creates a high impedance for both $M_2$ and $M_3$, regardless of whether the overall circuit is pseudo- or fully differential. The small-signal model is shown in Fig. 4.8(b). The transfer function from the CM interference to node B can be derived as

$$H_{CM-B} = \frac{g_m r_o}{1 + g_m r_o} \tag{4.11}$$

This high impedance degenerates both middle pairs and enhances the CMRR.

Another source of CM-DM interference is the down-conversion of interference close to the chopping frequency. Careful layout and the closed-loop configuration help reduce this conversion and preserve the CMRR.

#### 4.4.2 Offset and ripple reduction

The offset of the first two stages appears as ripple at the chopping frequency, whereas the offset of the last stage appears as DC offset. Reducing offset and chopping ripple is a recurring concern in closed-loop chopping structures. Unlike the conventional capacitive-feedback structure, where the DC feedback path is blocked, chopping decouples the closed loop. The offset of the last stage is up-modulated and therefore sees a closed loop. However, the first two stages still have no DC feedback.

Conventionally, a ripple-reduction loop [43] is required, whose basic idea is to form an additional DC loop that suppresses the DC offset. Instead, we chose to isolate the three stages and deal with the offset of each individually. The overall block diagram is shown in Fig. 4.10. All three stages are connected through AC-coupling networks. The offset of each of the first two stages is amplified only by its own stage and is then blocked by the AC-coupling capacitor. To avoid saturating their outputs, the first two stages are designed with relatively low gain (around 26 dB each), while the last stage has a gain of over 32 dB. With this configuration, the signal amplitudes at the outputs of the first two stages are small, leaving enough headroom for the amplified offset. In contrast, the offset of the last stage is suppressed by the negative feedback loop: any output offset is up-modulated, passes through the capacitive feedback network, and is therefore significantly attenuated by the high loop gain.
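The headroom argument can be checked with simple arithmetic. The sketch below uses the stage gains quoted above; the 2-mV stage offset is a hypothetical value chosen only for illustration and is not a measured number.

```python
def db_to_ratio(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)

# Stage gains quoted in the text: about 26 dB, 26 dB, and over 32 dB.
stage_gains_db = [26.0, 26.0, 32.0]
total_open_loop_db = sum(stage_gains_db)

# Hypothetical input-referred offset of one of the first two stages (illustrative only).
stage_offset_v = 2e-3
amplified_offset_v = stage_offset_v * db_to_ratio(stage_gains_db[0])

print(f"open-loop gain of at least {total_open_loop_db:.0f} dB")
print(f"offset at the stage output: {amplified_offset_v * 1e3:.1f} mV")
```

A 26-dB stage multiplies its own offset by about 20, so an offset of a few millivolts stays in the tens of millivolts before the AC-coupling capacitor blocks it.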

# 4.5 Measurement Results

The proposed three-stage chopper tail-less inverter-stacking amplifier is implemented in a 0.18-$\mu$m CMOS process. The tail-less inverter-stacking input stage, the second-stage amplifier, the current-mode-biased class-AB output stage, the choppers, and the capacitive networks are marked on the die photo (Fig. 4.11). The chip core occupies an area of 0.14 $mm^2$, dominated by the capacitors. The input and feedback capacitors are 4 pF and 20 fF, respectively.

The measured AC frequency response, CM-DM conversion, and supply-DM conversion are shown as the blue, red, and yellow lines in Fig. 4.12, respectively. The transfer functions are plotted by sweeping the frequency of the test tone. The small-signal flat-band gain is measured to be 46 dB, with a -3-dB bandwidth extending from DC to 8 kHz. To measure the CM-DM and supply-DM conversion gains, the differential inputs of the LNIA are shorted together. The CM-DM conversion gain is obtained by dividing the converted DM output by the amplitude of the CM interference at the input, as shown in red. The supply-DM conversion is obtained by measuring the transfer function from the supply to the output terminals. The measured CMRR and PSRR are over 84 dB and 78 dB, respectively, in the low-frequency band that includes the 50/60-Hz interference.
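As a quick consistency check, the measured flat-band gain can be compared with the ratio of the reported input and feedback capacitors. The sketch assumes the usual high-loop-gain approximation for a capacitive-feedback amplifier, in which the closed-loop gain is $C_S/C_F$; the function name is hypothetical.

```python
import math

def closed_loop_gain_db(c_s, c_f):
    """Ideal capacitive-feedback gain C_S / C_F, expressed in dB (assumes high loop gain)."""
    return 20.0 * math.log10(c_s / c_f)

# Capacitor values reported for the prototype: 4 pF input, 20 fF feedback.
gain_db = closed_loop_gain_db(4e-12, 20e-15)
print(f"ideal closed-loop gain = {gain_db:.2f} dB")
```

The capacitor ratio of 200 corresponds to about 46 dB, in line with the measured flat-band gain.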

A single global supply voltage of 0.6 V is used, and the overall measured power consumption is 2.7 $\mu W$. Fig. 4.13 plots the measured input-referred noise, which is mainly determined by thermal noise. Chopping effectively reduces the flicker-noise corner to below 10 Hz. The total integrated rms input-referred noise over the signal band (DC to 8 kHz) is 1.38 $\mu V$. These results translate to a PEF of 0.96.
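The quoted PEF can be reproduced from the measured numbers. The sketch below assumes the commonly used definitions $NEF = V_{ni,rms}\sqrt{2 I_{tot} / (\pi \cdot U_T \cdot 4kT \cdot BW)}$ and $PEF = NEF^2 \cdot V_{DD}$, a temperature of 300 K, and the 8-kHz signal band as the noise bandwidth; these choices are assumptions made here, not statements from the measurement setup.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def nef(v_rms, i_total, bandwidth, temp_k=300.0):
    """Noise efficiency factor for an rms input-referred noise and total supply current."""
    u_t = K_B * temp_k / Q_E  # thermal voltage kT/q
    return v_rms * math.sqrt(
        2.0 * i_total / (math.pi * u_t * 4.0 * K_B * temp_k * bandwidth)
    )

def pef(v_rms, power, vdd, bandwidth, temp_k=300.0):
    """Power efficiency factor, NEF^2 * VDD, with the current taken as power / VDD."""
    return nef(v_rms, power / vdd, bandwidth, temp_k) ** 2 * vdd

# Measured values from the text: 1.38 uVrms, 2.7 uW, 0.6 V, DC to 8 kHz.
nef_value = nef(1.38e-6, 2.7e-6 / 0.6, 8e3)
pef_value = pef(1.38e-6, 2.7e-6, 0.6, 8e3)
print(f"NEF = {nef_value:.2f}, PEF = {pef_value:.2f}")
```

Under these assumptions the NEF is about 1.26 and the PEF about 0.96, consistent with the value stated above.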

Fig. 4.14 shows the measured total harmonic distortion (THD) at different amplitudes. The amplifier exhibits a 0.7% THD at 400 Hz and 50% full-scale output amplitude.

# 4.6 Comparison to Other LNA works

To verify the robustness against temperature and supply-voltage variations, the test chips are also measured over temperature and supply-voltage sweeps. The IRN, total current consumption, and PEF are the three specifications used to characterize the performance. Fig. 4.15 shows the temperature sweep from -20 to 80°C for the prototype. Above 0°C, the current consumption and IRN are nearly constant owing to the robust current-mode biasing scheme. As expected, the IRN and PEF begin to degrade when the prototype chips are pushed below 0°C. The main reason is that the replica-based biasing branch requires a minimum supply of $4V_{ov} + 2V_{th}$. When the threshold voltage increases at low temperature, the top PMOS and the bottom NMOS are pushed into the linear region, and the bias current therefore decreases, as the measurement results show. Nevertheless, the overall PEF variation remains within 20% over this wide temperature range. Fig. 4.16 shows the supply-voltage sweep. As in the temperature sweep, the performance is constant for supply voltages above 0.55 V, and the degradation below that again originates from the replica-based biasing branch. Overall, the prototype achieves reasonably consistent PEF performance.

Table 4.1 summarizes the performance of the prototype amplifier and compares it with state-of-the-art closed-loop amplifiers with comparable specifications. With the proposed tail-less inverter-stacking technique, this work demonstrates a sub-1 PEF in silicon and sets a new frontier in the noise-power trade-off.

# 4.7 Conclusion

This chapter presents a novel low-noise instrumentation amplifier with high power efficiency. The measurement results demonstrate the feasibility of operating OTAs with high power efficiency under a globally low supply voltage. The design trade-offs among supply voltage, power, noise, gain, linearity, and output swing are carefully addressed with chopping, a tail-less inverter-stacking input stage, a dominant-pole-compensated three-stage structure, and a high-swing class-AB output stage. Moreover, the CM pre-filtering biasing loop and the high-impedance source degeneration inherent to the inverter-stacking structure enhance the CMRR. Overall, the proposed LNCIA shows consistent, better-than-bipolar PEF performance across a wide range of temperature and supply voltage. It is well suited to applications where a low supply voltage and high power efficiency are desired.



Figure 4.3: (a) Replica based biasing scheme for the tail-less inverter stacking input stage. (b) CM pre-filtering biasing loop for top amplifier. (c) CM pre-filtering biasing loop for bottom amplifier.



Figure 4.4: (a) Transfer function from NMOS outside the loop. (b) Transfer function from the PMOS in the loop.



Figure 4.5: Block diagram of the three-stage dominant-pole compensated amplifier.



Figure 4.6: Schematic of the second stage.



Figure 4.7: (a) Schematic of output stage.



Figure 4.8: Block diagram of the single-ended half-circuit of the fully differential closed-loop OTA for CMRR analysis.



Figure 4.9: (a) Block diagram of the single-ended half-circuit of the CM prefiltering loop. (b) Transfer function of the CM pre-filtering loop.



Figure 4.10: Offset and ripple reduction analysis.



Figure 4.11: Chip Microphotograph.



Figure 4.12: Measured OTA transfer functions: gain transfer function, CM-DM conversion transfer function, and supply-DM transfer function.



Figure 4.13: Measured OTA input referred noise.



Figure 4.14: Measured OTA THD at different input amplitudes.

| PEF  | IRN (µVrms) | BW (Hz) | Power (µW) | VDD (V) | THD test amplitude | THD (%) | PSRR (dB) | CMRR (dB) | Gain (dB) | Technology (nm) | Reference |
|------|-------------|---------|------------|---------|--------------------|---------|-----------|-----------|-----------|-----------------|-----------|
| 1.26 | 6.7         | 10k     | 0.23       | 0.9     | 3.6 mVpp           | 0.6     | 79        | 84        | 46        | 180             | This work |
| 4    | 0.9         | 19.9k   | 3.9        | 1.5     | 16.5 mVpp          | 1       | 80        | 78        | 40        | 130             | [13]      |
| 1.9  | 1.5         | 804     | 0.27       | -       | 1 mVpp             | 0.54    | 92        | 89        | 59        | 180             | [14]      |
| 1.6  | 0.3         | 670     | 0.79       | 0.2/0.8 | 1.5 Vpp            | 0.3     | 80        | 85        | 58        | 180             | [12]      |
| 1.14 | 5.5         | 10k     | 0.25       | -       | N/A                | N/A     | 76        | 84        | 26        | 180             | [15]      |
| 4.4  | 2.1         | 5k      | 3.24       | 1.8     | 5 mVpp             | 0.09    | N/A       | N/A       | 40        | 180             | [10]      |
| 11   | 6.7         | 100     | 1.8        | -       | N/A                | N/A     | 120       | 134       | 40        | 65              | [16]      |

Table 4.1: Performance Summary and Comparison with State-of-the-art LNAs



Figure 4.15: Measured performances versus temperature.



Figure 4.16: Measured performances versus supply voltage.

# Chapter 5

# CT-SAR-assisted kT/C noise-free Nyquist ADC Design: Third Prototype

# 5.1 Introduction

<sup>1</sup> A discrete-time (DT) ADC has a front-end S/H circuit. Fig. 5.1(a) shows an example of a classic two-step SAR ADC with the front-end sampler. The benefit of having a front-end S/H circuit is that it converts a continuous-time (CT) input into a DT signal that stays unchanged between samples, which simplifies the subsequent analog signal processing operations (e.g., quantization, subtraction, and amplification). Nevertheless, the front-end S/H introduces unwanted sampling noise, which poses a fundamental SNR limit for the ADC. This sampling noise is typically suppressed passively by increasing the capacitor size [3–15]. For example, the total differential sampling capacitance needs to be greater than 2.1 pF to achieve a sampling-noise-limited SNR of 80 dB with a 2.5-V peak-to-peak differential signal swing, and it has to be quadrupled for every 1-bit increase in resolution. Such a large input capacitor makes it difficult to design the S/H circuit and leads to increased ADC power and area. Moreover, as indicated in Fig. 5.1(a), a large input capacitor poses critical challenges for both the ADC buffer and the reference buffer. To meet the stringent linearity requirement of high-resolution ADCs, these buffers consume a significant amount of power, which can be comparable to or even higher than that of the ADC itself. Thus, it is highly desirable to break this tight trade-off between the sampling noise and the capacitor size, so that a small input capacitor can be used without incurring a significant noise penalty.

<sup>1</sup>This work was first presented at ISSCC 2019 [52]. The work was carried out by the first author, Linxiao Shen, who thanks all the co-authors for their technical discussions.

Figure 5.1: (a) Conventional DT two-step SAR ADC; (b) Proposed CT-SAR-assisted two-step SAR ADC.
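The 2.1-pF figure quoted in the introduction can be reproduced with a short kT/C calculation. The sketch assumes a full-scale sine input, a temperature of 300 K, a differential sampling-noise power of $2kT/C$ per side, and that the total capacitance counts both sides; this accounting convention is an assumption made here for illustration.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K

def total_diff_sampling_cap(snr_db, vpp_diff, temp_k=300.0):
    """Total differential sampling capacitance (both sides summed) for a kT/C-limited SNR."""
    signal_power = (vpp_diff / 2.0) ** 2 / 2.0           # full-scale sine, V^2
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)  # allowed sampling-noise power, V^2
    c_side = 2.0 * K_B * temp_k / noise_power             # differential noise is 2kT / C_side
    return 2.0 * c_side

c_80db = total_diff_sampling_cap(80.0, 2.5)
c_one_more_bit = total_diff_sampling_cap(80.0 + 6.02, 2.5)
print(f"C_total for 80 dB: {c_80db * 1e12:.2f} pF")
print(f"ratio for one extra bit: {c_one_more_bit / c_80db:.2f}")
```

Each additional bit (about 6.02 dB) multiplies the required capacitance by four, which is the quadrupling mentioned above.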

# 5.2 Low-Noise Amplifier Design: Challenges

The core idea explored by this work is to remove the front-end S/H circuit and operate the first stage of the two-step ADC in the CT domain [44, 45]. Fig. 5.1(b) shows the basic architecture of the proposed ADC. In the 1st-stage CT SAR, the in-band thermal noise due to the switch resistance is much smaller than kT/C, as its sampling-free CT operation prevents wide-band noise folding, which otherwise would be the dominant contributor to the sampling noise. The inter-stage amplifier also processes the 1st-stage conversion residue in CT and acts as a low-pass filter that suppresses the out-of-band thermal noise from the switch resistors. Consequently, the 1st-stage capacitor size is not bounded by the kT/C limit and can be significantly reduced. Unlike the classic two-step ADC, the sampling operation is moved to the 2nd stage, as shown in Fig. 5.1(b). Although the 2nd stage still suffers from its own sampling noise, this noise is substantially suppressed by the inter-stage gain when input referred, thus permitting the use of a small capacitor in the 2nd stage too. Overall, the use of the S/H-free CT 1st stage breaks the link between the sampling noise and the capacitor size, making it possible to design high-resolution Nyquist-rate ADCs with small capacitors.

Although removing the S/H circuit brings the benefit of suppressed sampling noise, it can cause the 1st-stage conversion residue to go out of bounds. Fig. 5.2(a) shows a simplified model of the two-step ADC. $D_i$ represents the digital output code of the $i^{th}$ stage. $e_{qi}$ represents the quantization noise added by the $i^{th}$ stage. $e_{amp}$ represents the noise of the inter-stage amplifier. The input signal $V_{in}$ goes through two different paths in the 1st stage. In the upper fast path, the signal goes directly to the inter-stage amplifier. In the lower slow path, the signal goes through the sub-ADC, the sub-DAC, and the analog subtractor. All of these analog signal processing steps introduce extra delays, which are lumped into a single block $\tau$. In a classic two-step ADC with the S/H circuit, this path delay mismatch is not an issue because the sampled input does not change. However, for the S/H-free CT 1st stage, this delay mismatch can cause the output signals of the two paths to be misaligned, resulting in a much larger conversion residue and potentially saturating the inter-stage amplifier and the 2nd-stage ADC.

# 5.3 Low-Noise Amplifier Design: Architectures

#### 5.3.1 Inverter-Based Architectures

One way to address this delay mismatch problem is to insert a negative delay $(-\tau)$ block in the slow path to cancel the positive delay [44], as shown in Fig. 5.2(b). Even though a pure negative delay is non-causal and impractical, a negative delay within the signal band can be realized by using an analog prediction filter. However, it requires power-hungry wide-band op-amps to provide full-signal-band prediction. Moreover, to satisfy causality, the analog prediction filter introduces positive delay for out-of-band high-frequency signals. This increases the delay mismatch between the two paths, causing the 1st-stage conversion residue to increase for out-of-band signals. Hence, to keep the 1st-stage conversion residue within the allowable range, the anti-aliasing filter needs to provide stronger attenuation for out-of-band signals, which makes its design more challenging. Another approach to tackle the delay mismatch issue is to insert a positive delay block in the fast path [45], as shown in Fig. 5.2(c). This positive delay can be realized by using an LC lattice filter. This approach is fully passive and power efficient. Nevertheless, since it relies on delay matching, it requires careful tuning to compensate for process and temperature variations. Furthermore, while it works well for high-speed GHz operation, it is not well suited for low-to-medium-speed applications (e.g., sensors), as the required long delay could result in large LC values and excessive chip area. In addition, it is non-trivial to design the filter to avoid amplitude attenuation or phase modulation.

#### 5.3.2 Orthogonal Current-reuse Architectures

This work seeks to address the delay mismatch problem in a different way. In a conventional two-step ADC, the slow quantization path consists of a multi-bit flash, a multi-bit DAC, and an analog subtractor, whose total aggregated delay tends to be relatively large. By contrast, the proposed ADC uses a CT-SAR-based 1st stage, as shown in Fig. 5.2(d). Since each SAR cycle contains only a single-bit comparison, a single-bit DAC, and a built-in subtraction operation, its delay is much shorter. $e_{q1,i}$ represents the quantization noise added during the $i^{th}$ iteration. Even though a multi-bit quantization (e.g., 7-bit) requires multiple SAR cycles, their delays do not accumulate. This is because the CT-SAR operates on the CT input: every SAR cycle sees its new instantaneous input. To further reduce the delay, the CT SAR adopts asynchronous clocking [46] and dynamic logic [47, 48]. As the input moves during the CT SAR operation, the prior SAR comparator decisions may no longer be correct. To tackle this problem, sufficient redundancy is provided in each bit so that the CT SAR can still tightly track the input. Once the CT-SAR finishes, the conversion residue is readily available. Thus, the dynamic amplifier can be triggered immediately, minimizing the delay. With these techniques applied, the overall delay mismatch can be made very short. Hence, no prediction filter or LC lattice filter is needed in this work, leading to lower design complexity as well as reduced chip area and power. In addition, unlike prior CT pipelined ADCs [44, 45] that use an input resistor and a current-source DAC, this work uses an input capacitor and a capacitive DAC. Capacitors are noise-free and do not consume static current. As a result, the proposed ADC is more efficient from both noise and power perspectives.

Figure 5.2: Block diagrams of CT pipeline ADC with (a) delay mismatch; (b) negative delay added to the slow path; (c) positive delay added to the fast path; and (d) proposed CT-SAR with shortened delay and built-in redundancy.

To verify the proposed techniques, a 13-bit prototype ADC is built in a 40-nm CMOS process. Its input capacitor is only 120 fF, which is over 20 times smaller than what would be needed in a classic DT two-step ADC. The inter-stage amplifier in this work adopts a floating-inverter-based (FIB) dynamic amplifier (DA) topology. Compared with classic static amplifiers, the proposed FIB DA is low-power and low-noise. Compared with the integrator-based DA [9, 49–51], it provides higher gain and stronger rejection of input common-mode variations. Operating at 2 MS/s, the ADC achieves 72 dB SNDR across the Nyquist band while consuming only 25 $\mu$W of power and 0.01 mm<sup>2</sup> of area.

This chapter is an extension of [52] and is organized as follows. Section 5.4 presents the operation principle of the proposed ADC. Section 5.5 presents the implementation details of the prototype ADC. The measurement results are given in Section 5.7. Finally, Section 5.9 draws the conclusion.



Figure 5.3: Architectural block diagram of (a) the proposed CT-SAR-assisted two-step SAR ADC; (b) 1-st stage CT-SAR ADC; and (c) example waveforms for key nodes.



Figure 5.4: Input and DAC output example waveform for (a) conventional DT-SAR ADC; (b) large  $E_{slope}$  for CT-SAR ADC by simply removing S/H circuits; (c) recovered residue voltage with accelerated SAR conversion; (d) recovered residue voltage with built-in redundancy.

# 5.4 Proposed Two-Step ADC with 1st-Stage CT-SAR

#### 5.4.1 Topology overview

Fig. 5.3(a) shows the block diagram of the proposed CT-SAR-assisted two-step ADC. Its 1st stage works in the CT domain while the 2nd stage works in the DT domain. The 1st-stage CT SAR performs a CT approximation of the input and produces a CT residue. The inter-stage amplifier amplifies the residue and also filters out the wide-band thermal noise owing to its low-pass response. The sampler comes after the amplifier and converts the signal into the DT domain for the 2nd-stage DT SAR. As mentioned earlier, even though this sampler produces sampling noise, this noise is greatly attenuated when input referred because the sampler follows the inter-stage gain block. The final ADC output is the weighted sum of the digital outputs from the 1st-stage CT SAR and the 2nd-stage DT SAR.

In a classic DT SAR, a single capacitor array is used for both input sampling and the SAR DAC in a time-multiplexed manner. The benefits are fewer capacitors and no signal attenuation. However, this scheme only works for a DT SAR with an S/H circuit. To realize the CT SAR and ensure simultaneous input tracking and SAR conversion, the input capacitor and the SAR DAC have to be separated, as shown in Fig. 5.3(b). The cost is more capacitors and an attenuation factor of $\beta \equiv C_{IN}/C_T$ from $V_{in}$ to $V_{res}$, where $C_T$ represents the total capacitance at $V_{res}$, including $C_{IN}$, $C_{DAC}$, and the parasitic capacitance (which includes the input capacitance of the comparator and the amplifier). The signal attenuation increases the noise contribution of the inter-stage amplifier. However, because the 1st-stage sampling noise is reduced, the inter-stage amplifier is allowed to contribute more noise while the total ADC input-referred noise stays unchanged. As a result, the increase in the amplifier power is not significant. Overall, this cost is worthwhile: the sampling noise is greatly suppressed and the capacitor size is substantially reduced, which leads not only to significant area savings for the ADC core but also to potential power savings for the input and reference buffers. The detailed noise analysis is given in Appendix 5.6.

As shown in Fig. 5.3(c), the CT SAR performs a CT approximation for the input. The residue  $V_{res}$ , which is the difference between the input and the SAR DAC output, is updated during every SAR cycle. Since the comparator always sees the CT input, the delays from different SAR cycles do not accumulate, as long as the CT SAR can accurately track  $V_{in}$ . The final CT SAR conversion result  $D_1$  corresponds to the instantaneous ADC input at the time of the LSB comparison, rather than the MSB comparison. Once the CT SAR finishes, the residue  $V_{res}$  is readily available at the comparator input, and thus, the inter-stage amplifier can be triggered immediately. This helps ensure the close match between  $V_{in}$  and  $D_1$ . Any extra delay can cause the increase of  $V_{res}$  due to the variation in  $V_{in}(t)$ .

Unlike prior CT pipeline works that use input resistive coupling and a current-source DAC (IDAC) [44, 45], this work uses input capacitive coupling and a capacitor DAC (CDAC). It eliminates the noise from the input resistor and the IDAC. It also removes the static power of the IDAC. Yet, the trade-off is that capacitive coupling blocks DC signals. A pseudo-resistor $R_B$ is used to provide the DC bias for the comparator input [53, 54]. $R_B$ and the capacitors form a high-pass filter that blocks the low-frequency input. By setting the proper pass-band frequencies, the proposed capacitively-coupled two-step ADC can be used for a wide range of applications where the information does not reside at DC, such as audio, biological, and communication signals.

#### 5.4.2 CT SAR conversion error and mitigation

Unlike in a DT SAR, the input varies with time during one CT SAR conversion, which can cause extra conversion error. To illustrate this key difference, Fig. 5.4(a) shows the case of a conventional DT SAR. The DAC output is compared to a sampled input, which does not change during the SAR conversion. As a result, the sampled input can be precisely converted regardless of how far the real-time input deviates from the sampled one. The conversion error is limited only by the quantization step, if we ignore other circuit non-idealities (e.g., capacitor mismatch, comparator noise, DAC settling error, etc.). The S/H circuit essentially isolates the CT input from the SAR conversion process.

By contrast, without the S/H, the CT SAR is fully exposed to the CT input, whose variation can cause a large error because the SAR conversion no longer has a consistent convergence target. Fig. 5.4(b) shows an example in which a rising input signal happens to be slightly below the decision threshold at the MSB comparison. The comparator outputs a '0' and directs the following binary searches to stay below the MSB decision threshold. However, the input actually rises above the MSB threshold as time goes by. As a result, the DAC output fails to track the time-varying input, leading to a large conversion error. Qualitatively speaking, the conversion residue $V_{res}$ (i.e., the conversion error in the 1st stage) can be considered as having two components:

$$V_{res} = E_q + E_{slope} \tag{5.1}$$

where $E_q$ represents the quantization error, which is the same as in the DT SAR. $E_{slope}$ represents the additional error caused by the input variation, which is unique to the CT SAR. $E_{slope}$ arises when the CT input crosses a particular decision threshold and then moves in the direction opposite to the subsequent binary search. Assuming the DAC has no redundancy, and such a threshold crossing happens at the (N - k + 1)-th comparison (N is the total number of SAR comparisons), $E_{slope}$ can be approximated by the input signal variation from the (N - k + 1)-th comparison to the end of the LSB comparison:

$$E_{slope} \approx |\Delta V_{in}| \le k \cdot T_{SAR} \cdot A \cdot 2\pi \cdot f_{in} \tag{5.2}$$

where $T_{SAR}$ represents a single SAR cycle time, and $A$ and $f_{in}$ are the input signal amplitude and frequency, respectively. The worst-case $E_{slope}$ occurs at the MSB decision with k = N, as $V_{in}$ has the longest time to drift away from the critical decision threshold. To minimize $E_{slope}$, it is preferable to reduce N, but this would reduce the CT SAR resolution, leading to increased $E_q$. An effective way to reduce $E_{slope}$ is to reduce $T_{SAR}$, as shown in Fig. 5.4(c). To this end, dynamic logic [4,7] and asynchronous clocking [46] can be used. CMOS scaling also helps as it naturally decreases $T_{SAR}$.

Figure 5.5: SNR versus $T_{SAR}$ and number of conversion cycles (N).
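To give a feel for the magnitude of the bound in Eq. (5.2), the sketch below evaluates it in units of the CT SAR LSB. It assumes a full-scale sine (so that one LSB equals $2A/2^B$ and the amplitude cancels) and borrows $T_{SAR} = 500$ ps, a 1-MHz input, and B = 7 from the values this chapter uses for its redundancy comparison; taking N = 7 cycles and the function names are assumptions made here.

```python
import math

def e_slope_bound(k, t_sar, amplitude, f_in):
    """Upper bound of Eq. (5.2): input drift over k SAR cycles for a sine input."""
    return k * t_sar * amplitude * 2.0 * math.pi * f_in

def e_slope_in_lsb(k, t_sar, f_in, bits):
    """The same bound in LSBs, assuming a full-scale sine so that LSB = 2A / 2**bits."""
    lsb_per_unit_amplitude = 2.0 / 2 ** bits
    return e_slope_bound(k, t_sar, 1.0, f_in) / lsb_per_unit_amplitude

# Assumed example: N = 7 cycles, T_SAR = 500 ps, 1-MHz input, 7-bit first stage.
worst_case = e_slope_in_lsb(k=7, t_sar=500e-12, f_in=1e6, bits=7)
last_cycle = e_slope_in_lsb(k=1, t_sar=500e-12, f_in=1e6, bits=7)
print(f"MSB decision (k = 7): {worst_case:.2f} LSB")
print(f"LSB decision (k = 1): {last_cycle:.2f} LSB")
```

Under these assumptions the worst-case drift after the MSB decision exceeds one LSB, while the drift after the last decision is only a fraction of an LSB.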

Fig. 5.5 plots the simulated SNR versus N for the CT SAR, assuming a full-swing 1-MHz input. Here the SNR is defined as the signal power divided by the power of the conversion residue  $V_{res}$  at the end of the LSB comparison. When N is small, SNR is limited by  $E_q$ . As N increases,  $E_q$  decreases exponentially, and the SNR starts to be limited by  $E_{slope}$ . When N is very large, the SNR decreases because  $E_{slope}$  increases with N as indicated in Eq. (5.2). Shortening  $T_{SAR}$  can reduce  $E_{slope}$  and increase SNR, but it increases the design complexity and the power consumption. Finally, the technology eventually places a lower bound for  $T_{SAR}$ . For example, in the 40nm CMOS process,  $T_{SAR}$  is limited to about 200 ps. A more advanced process is required to further reduce  $T_{SAR}$ .

To further suppress  $E_{slope}$ , redundancy can be added in the SAR DAC [55–57]. The intuition is that the input signal variation can be considered as incorrect conversion results in the prior MSB decisions. Hence, as long as the added redundancy is larger than  $E_{slope}$  specified in Eq. (5.2), the DAC output can still catch up and track the time-varying input, as shown in Fig. 5.4(d).

As shown in Eq. (5.2),  $E_{slope}$  is larger for the MSB bits and smaller for the LSB bits. Thus, more redundancy should be allocated for the MSB bit (k = N), while less redundancy is needed for the LSB bit (k = 1). The optimal way to arrange the redundancy should follow Eq. (5.2). Thus, we can derive that the CDAC bit weight  $\{W_k\}$  should be assigned in the following way to support a Nyquist-rate input:

$$W_{k} = \begin{cases} 1, & \text{for } k = 1\\ \sum_{i=1}^{k-1} W_{i} - 2^{B-1} \cdot k \cdot T_{SAR} \cdot 2\pi \cdot f_{in,Nyq} & \text{for } k > 1 \end{cases}$$
(5.3)

where B represents the effective number of bits of the CDAC. To simplify the implementation, the actual weight values can be rounded to the nearest integer.
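As a minimal sketch, the recursion of Eq. (5.3) can be evaluated literally as follows. The function name `cdac_weights` and its arguments are hypothetical, the parameter values are the ones quoted in this chapter ($T_{SAR} = 500$ ps, B = 7, N = 10, a 1-MHz Nyquist input), and rounding is applied only after the recursion, which is one possible reading of the rounding step. It is not meant to reproduce the capacitor array of the prototype.

```python
import math

def cdac_weights(n_cycles, b_eff, t_sar, f_in_nyq):
    """Literal evaluation of Eq. (5.3); weights are in units of W_1 = 1."""
    # Per-cycle slope term: 2^(B-1) * T_SAR * 2*pi * f_in,Nyq
    slope = 2 ** (b_eff - 1) * t_sar * 2 * math.pi * f_in_nyq
    weights = [1.0]  # W_1 = 1
    for k in range(2, n_cycles + 1):
        # W_k = sum of all lower weights minus the slope allowance for bit k
        weights.append(sum(weights) - slope * k)
    return weights

# Values quoted in the text (assumed here to be the relevant ones).
weights = cdac_weights(n_cycles=10, b_eff=7, t_sar=500e-12, f_in_nyq=1e6)

# Redundancy margin of bit k: sum of the lower weights minus W_k.
margins = [sum(weights[:i]) - weights[i] for i in range(1, len(weights))]

# Rounded to the nearest integer after the recursion (an assumption).
rounded = [round(w) for w in weights]

for k, (w, r) in enumerate(zip(weights, rounded), start=1):
    print(f"k={k:2d}  W_k={w:8.3f}  rounded={r}")
```

The margin grows linearly with k, which mirrors the statement that more redundancy is allocated to the MSB decisions than to the LSB decisions.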

To visualize the benefit of redundancy, Fig. 5.6 plots the SNR degradation versus input signal frequency assuming $T_{SAR} = 500$ ps, an ADC sampling rate $f_s = 2$ MHz, and B = 7. The SNR degradation refers to the SNR difference between the CT SAR and its corresponding DT SAR. Without redundancy, there is appreciable SNR degradation as the input signal frequency increases. By contrast, with redundancy embedded, almost no SNR degradation is observed over the entire Nyquist band, clearly demonstrating the benefit of adding redundancy.

Figure 5.6: Comparison of SNR degradation with and without redundancy.

This work uses both  $T_{SAR}$  minimization techniques and redundancy to reduce the CT SAR error, so that  $V_{res}$  can be made comparable to that of a DT SAR. As a result, this work obviates the need for the analog prediction filter or LC lattice filter, reducing chip area, power, and design complexity.

## 5.4.3 Inter-stage amplifier operation with a time-varying $V_{res}$

As shown in Fig. 5.3, another key difference between the proposed ADC and the conventional DT two-step ADC is that the inter-stage amplifier in this work observes a continuously varying residue rather than a fixed DC-like residue. Thus, one may ask whether the inter-stage amplifier can work properly with a time-varying $V_{res}$, especially when a dynamic amplifier (DA) is adopted to reduce power and noise.

To analyze the operation of a DA in the presence of a time-varying input, let us consider the simplified circuit model shown in Fig. 5.7(a). During the reset phase, the output capacitor $C_L$ is shorted to a common-mode voltage. During the amplification phase, the DA works as a transconductor $g_m$ that integrates the input onto $C_L$ over a fixed time window $T_{int}$. Mathematically, the DA output $V_{out}$ is given by:

$$\begin{aligned} V_{out}[n] &= \int_{nT_0}^{nT_0 + T_{int}} \frac{g_m \cdot V_{in}(t)}{C_L} \cdot dt \\ &= \frac{g_m}{C_L} \cdot V_{in}(t) \circledast h(t)\Big|_{t=nT_0 + T_{int}} \end{aligned}$$
(5.4)

where $V_{out}[n]$ is the output after the *n*-th integration, and $T_0$ is the total time period consisting of both the reset and the integration phases [see Fig. 5.7(b)]. As shown on the right-hand side of Eq. (5.4), this integration process is equivalent to a convolution with a window function h(t) [see Fig. 5.7(c)], followed by sampling at $t = nT_0 + T_{int}$. Thus, the overall transfer function of the DA is a sinc function:

$$H(\omega) = \frac{g_m \cdot T_{int}}{C_L} \cdot \operatorname{sinc}(\omega \cdot \frac{T_{int}}{2}) \cdot e^{j\omega \cdot \frac{T_{int}}{2}}$$
(5.5)

Fig. 5.7(d) plots an example magnitude response with $T_{int} = 2.5$ ns and a nominal DC gain of 30 dB. Within the signal bandwidth of 1 MHz, the DA works like a normal amplifier with a fixed gain. At out-of-band high frequencies, the DA has a low-pass response due to its integration behavior. As a result, it provides an inherent, mild 1st-order anti-aliasing capability. This low-pass response also filters out the wide-band thermal noise before the DA, leading to significantly reduced sampling kT/C noise from the 1st stage. Fig. 5.7(e) also shows a simplified time-domain view of the DA with a ramp-like input. The DA effectively amplifies a sampled midpoint (or the time-averaged point) of the input. From this viewpoint, the DA operation with a time-varying input can be mapped to a sampled-input case.
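The magnitude of Eq. (5.5) can be checked numerically with a short sketch. The helper name `da_gain_db` is hypothetical; the values $T_{int} = 2.5$ ns and a 30-dB DC gain are those quoted for Fig. 5.7(d), and the sinc is taken as $\sin(x)/x$ with $x = \omega T_{int}/2$.

```python
import math

def da_gain_db(freq_hz, t_int, dc_gain_db):
    """Magnitude of Eq. (5.5) in dB: DC gain times |sinc(omega * T_int / 2)|."""
    x = math.pi * freq_hz * t_int  # omega * T_int / 2 with omega = 2*pi*f
    mag = 1.0 if x == 0 else abs(math.sin(x) / x)
    if mag == 0.0:
        return float("-inf")  # exact notch
    return dc_gain_db + 20 * math.log10(mag)

T_INT = 2.5e-9       # integration window from the text
DC_GAIN_DB = 30.0    # nominal DC gain from the text

in_band_droop_db = da_gain_db(1e6, T_INT, DC_GAIN_DB) - DC_GAIN_DB
first_notch_hz = 1.0 / T_INT  # first zero of the sinc, at x = pi
out_of_band_db = da_gain_db(600e6, T_INT, DC_GAIN_DB) - DC_GAIN_DB

print(f"droop at 1 MHz: {in_band_droop_db:.6f} dB")
print(f"first notch:    {first_notch_hz / 1e6:.0f} MHz")
print(f"gain change at 600 MHz: {out_of_band_db:.2f} dB")
```

Under these assumptions the gain is essentially flat across the 1-MHz signal band, and the first notch falls far outside it, consistent with the description of a fixed in-band gain and a low-pass out-of-band response.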

## 5.5 Prototype ADC Implementation

Fig. 5.8(a) shows the schematic of the prototype 2-MS/s ADC designed in a 40-nm CMOS process. For simplicity, only the single-ended configuration is shown, but the real implementation is fully differential. In the first stage, the target effective number of bits B is chosen to be 7 to minimize the amplitude of $V_{res}$ and relax the linearity requirement of the inter-stage amplifier. Its total number of comparisons N is set to 10 to allocate sufficient redundancy to tolerate input signal variation and capacitor mismatch. The unit capacitor C is 1 fF. A bridge capacitor is used to implement the lower LSB capacitors without using even smaller unit capacitors. The input capacitor and the total CDAC capacitor are set to 60 fF and 68 fF, respectively, to balance the tradeoff among the ADC input signal swing, the attenuation factor, the chip area, and the ADC input and reference buffer requirements. The raw matching of these small capacitors cannot meet the requirement of the target 13-bit resolution. Thus, a one-time foreground capacitor mismatch calibration is applied as in [46,58]. The size of the pseudo-resistor $R_B$ is chosen such that the high-pass corner frequency is around 50 Hz.

Considering the 40-nm process, design complexity, and power consumption, $T_{SAR}$ is set to 500 ps. The dynamic logic used to shorten $T_{SAR}$ is shown in Fig. 5.8(b) [48]. Fig. 5.8(c) shows the ADC timing diagram. The 1st-stage CT-SAR is triggered at the rising edge of the system clock $CLK_{SYS}$ and runs asynchronously for 10 cycles. In total, the 1st-stage SAR takes about 5 ns to finish. Even though this is much faster than the Nyquist rate, its power increase is rather mild. In the 40-nm CMOS process, the transistor sizes for the logic gates and the comparator can still be kept small while meeting this speed requirement. As a result, the increases in the comparator power and the SAR logic power are insignificant. The DA is triggered by the falling edge of $CLK_{SYS}$ and takes 2.5 ns to achieve a gain of 32. Note that $T_{SAR}$ may change due to process, voltage, and temperature (PVT) variation. A large time difference between the end of the 1st-stage SAR cycles and the DA start time (i.e., the falling edge of $CLK_{SYS}$) would result in an increased 1st-stage conversion residue amplitude. To address this issue, $T_{SAR}$ is adjusted via tunable delay cells in the foreground to minimize this time difference.

With the aforementioned ADC configuration, Fig. 5.9 shows the simulated amplitude of the equivalent $V_{res}$ amplified by the DA as a function of both the input signal frequency and amplitude. The prototype ADC can support an input signal frequency covering the whole Nyquist bandwidth with a peak-to-peak signal swing of 2.5 V without an appreciable increase in $V_{res}$. For a smaller input amplitude of 20% of full swing, the proposed CT-SAR can tolerate input signal frequencies up to $2.5f_s$. $V_{res}$ increases when both the input amplitude and frequency are large, but this situation can be prevented by a low-pass anti-aliasing filter before the ADC. Ensuring a small $V_{res}$ of around 20 mV peak-to-peak significantly relaxes the linearity requirement of the inter-stage amplifier, which permits the use of an open-loop DA in place of the conventional closed-loop static amplifier to reduce power and noise.

The schematic of the proposed floating inverter-based (FIB) DA is shown in Fig. 5.10(a). It uses an inverter-based CMOS input stage to double the transconductance compared to a conventional DA with a single NMOS or PMOS input pair [9, 49–51]. A cross-coupled inverter is inserted at the output to provide positive feedback that boosts the DA voltage gain [59,60]. The amplifier is powered by a 3.2-pF battery capacitor $C_B$, which is recharged to $V_{\rm DD}$ and GND during the DA reset phase. The battery capacitor $C_B$ isolates the DA operation from the power supplies $V_{\rm DD}$ and GND during the amplification phase, and thus provides stronger rejection of input common-mode (CM) voltage variations. Moreover, unlike a conventional DA whose output CM voltage is typically not well controlled and is sensitive to PVT variations, the proposed FIB DA ensures a constant output CM voltage, owing to the use of the floating battery capacitor. It inherently guarantees that the NMOS pair current matches the PMOS pair current, and thus the output CM current has to be zero. This obviates the need for an explicit output CM feedback loop [61]. Having a stable output CM voltage allows a wide output signal swing and a large amplifier gain. $C_N$ of 30 fF serves as the internal integration capacitor to reduce the DA bandwidth and its input-referred noise. In a conventional DA, each NMOS or PMOS pair would have its own integration capacitor. By contrast, this work shares a single $C_N$ between the NMOS and PMOS pairs. This cuts down the capacitor size by 4 times (i.e., a single $C_N$ rather than two $2C_N$s). In the reset phase, $C_N$ is connected to $V_{\rm DD}$ and GND.

The timing diagram and waveforms for key circuit nodes are plotted in Fig. 5.10(b). At the falling edge of $CLK_{SYS}$, $\phi_1$ goes high and the DA is turned on. The DA output voltages $V_{o+}$ and $V_{o-}$ start to depart from each other due to DA input integration. After a certain time, the voltage across $C_N$ becomes high enough to enable the cross-coupled inverter, which then leads to the exponential growth of $V_{o+}$ and $V_{o-}$ due to the positive feedback. Note that although an output differential-mode voltage is developed, the output CM voltage remains nearly unchanged, which is ensured by using the battery capacitor $C_B$ to power the DA, as explained earlier. As time goes by, $C_B$ loses charge and its top-plate voltage $V_{C_B}$ starts to drop below $V_{DD}$. As a result, the transistor $M_T$ starts to turn on and charge up $C_T$. Once the $C_T$ voltage $V_{C_T}$ reaches the logic threshold of the NOR gate, $\phi_1$ goes low, which ends the amplification phase. Note that the DA amplification time depends on how fast $M_T$ charges $C_T$, and thus can change with PVT variations. In this work, to keep the DA gain constant, the DA time is calibrated off-chip in the foreground by tuning the back-gate voltage of $M_T$ using $V_{\text{TIMER}}$ as in [59]. Background calibration can also be done as in [60].

Unlike the simple DA in Section 5.4.3 that always works in the linear integration phase, the proposed FIB DA has two operation phases: a linear integration phase and a positive-feedback regeneration phase. The duration of the integration phase and the DA bandwidth are primarily set by the value of the integration capacitor $C_N$. The total time that the DA is on is set by $C_B$, $M_T$, and $C_T$. Adjusting them mainly changes the time that the DA spends in the regeneration phase. More detailed discussions can also be found in [60, 62].

Also different from the simple DA in Section 5.4.3, the corresponding h(t) of the proposed FIB DA is not constant but time-varying, as shown in Fig. 5.11(a). Nevertheless, the slight time dependence of h(t) only mildly changes the equivalent transfer function of the proposed DA. Comparing Fig. 5.11(b) with Fig. 5.7(d), the only difference is the removal of the deep notches. The in-band flatness and the out-of-band -20 dB/dec low-pass behavior are maintained.

## 5.6 Noise Analysis of the Proposed ADC

The noise models for the conventional DT-SAR ADC and the proposed CT-SAR ADC are shown in Fig. 5.12(a) and (b), respectively. For simplicity, we only consider the 1st-stage sampling noise $e_{sam1}$, the inter-stage amplifier noise $e_{amp}$, and the 2nd-stage noise $e_2$ (including the 2nd-stage sampling noise, the quantization noise, and the comparator noise).

For the DT-SAR ADC of Fig. 5.12(a), the total input referred noise  $e_{tot,DT}$  can be derived as:

$$\begin{aligned} e_{tot,DT} &= e_{sam1,DT} + e_{amp} + \frac{e_2}{G_{DT}} \\ &= \frac{kT}{C_{DAC}} + e_{amp} + \frac{e_2}{G_{DT}} \end{aligned}$$
(5.6)

where $G_{DT}$ represents the inter-stage gain in the DT case, and $C_{DAC}$ represents the total CDAC capacitance.

For the CT-SAR ADC of Fig. 5.12(b), the total input-referred noise  $e_{tot,CT}$  can be derived as:

$$\begin{aligned} e_{tot,CT} &= e_{sam1,CT} + \frac{C_T}{C_{DAC}} \cdot \left(e_{amp} + \frac{e_2}{G_{CT}}\right) \\ &= 4kTR_{eq} \cdot BW_{DA} + \frac{C_T}{C_{DAC}} \cdot \left(e_{amp} + \frac{e_2}{G_{CT}}\right) \end{aligned}$$
(5.7)

where  $C_T$  represents the total capacitance at the 1st-stage comparator input node,  $G_{CT}$  represents the inter-stage gain in the CT case,  $BW_{DA}$  represents the DA bandwidth, and  $R_{eq}$  represents the equivalent resistance of the 1st-stage CT input sampling network.

For a fair comparison, we assume that both the CT and DT ADCs have the same nominal resolution (i.e., the quantization noise). Because the input of the CT stage is attenuated by $C_{DAC}/C_T$, the inter-stage gains must satisfy $G_{DT} = G_{CT} \cdot \frac{C_{DAC}}{C_T}$. Substituting this into (5.7) and rearranging, we have:

$$e_{tot,CT} = \frac{BW_{DA}}{1/(R_{eq}C_{DAC})} \cdot \frac{kT}{C_{DAC}} + \frac{C_T}{C_{DAC}} \cdot e_{amp} + \frac{e_2}{G_{DT}}$$
(5.8)
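To get a rough numerical feel for the first term of Eq. (5.8), the sketch below evaluates the DT sampling noise $kT/C_{DAC}$ and the reduction ratio $BW_{DA}/[1/(R_{eq}C_{DAC})]$. Only the 60-fF capacitance is taken from this chapter; the temperature, $R_{eq}$, and $BW_{DA}$ values are hypothetical placeholders, with $BW_{DA}$ expressed in the same units as $1/(R_{eq}C_{DAC})$.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def dt_sampling_noise_vrms(c_dac, temp_k=300.0):
    """RMS sampling noise of a DT sampler: sqrt(kT / C_DAC)."""
    return math.sqrt(K_B * temp_k / c_dac)

def ct_noise_reduction_ratio(bw_da, r_eq, c_dac):
    """Power ratio multiplying kT/C_DAC in Eq. (5.8)."""
    return bw_da / (1.0 / (r_eq * c_dac))

C_DAC = 60e-15   # capacitance example quoted in the text
R_EQ = 1e3       # hypothetical equivalent resistance (assumption)
BW_DA = 1e9      # hypothetical DA bandwidth, same units as 1/(R*C) (assumption)

v_dt = dt_sampling_noise_vrms(C_DAC)
ratio = ct_noise_reduction_ratio(BW_DA, R_EQ, C_DAC)
v_ct = v_dt * math.sqrt(ratio)  # the ratio applies to noise power, so take the root

print(f"DT sampling noise: {v_dt * 1e6:.1f} uVrms")
print(f"reduction ratio:   {ratio:.3f}")
print(f"CT 1st-stage term: {v_ct * 1e6:.1f} uVrms")
```

With these placeholder values the sampling-network bandwidth is much larger than the DA bandwidth, so the ratio is well below one and the 1st-stage noise term drops accordingly.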

Comparing (5.6) and (5.8), there are two differences. First, the 1st-stage sampling noise is greatly attenuated in the proposed CT-SAR topology. In the conventional DT-SAR ADC, with the noise power spectral density (PSD) being $4kTR_{eq}$ and the noise BW being $1/(R_{eq}C_{DAC})$, the 1st-stage sampling noise is $kT/C_{DAC}$. By contrast, in the CT-SAR ADC, the sampling noise PSD and BW are decoupled. The effective noise BW is not $1/(R_{eq}C_{DAC})$, but $BW_{DA}$. The sampling noise reduction ratio is $BW_{DA}/[1/(R_{eq}C_{DAC})]$, which is the ratio of the DA BW to the 1st-stage sampling network BW. By having a small $C_{DAC}$ (e.g., 60 fF) and a small $R_{eq}$, the 1st-stage sampling network BW can be made much larger than the DA BW, making the 1st-stage sampling noise much smaller than $kT/C_{DAC}$. Second, the CT operation with the capacitive input network comes with a penalty of input signal attenuation, which increases the inter-stage amplifier noise by a factor of $C_T/C_{DAC}$. To avoid high power consumption in the inter-stage amplifier, the overall ADC noise budgeting is optimized in this design. Since the 1st-stage sampling noise is reduced, a larger portion of the noise budget can be assigned to the inter-stage amplifier, which lowers its power consumption. Overall, the proposed CT-SAR technique can be used to reduce the 1st-stage capacitor sizes and the core ADC area without causing a large sampling noise penalty. Moreover, with the significantly reduced 1st-stage capacitor sizes, the performance requirements of the ADC driver and the reference buffer can be relaxed, which can lead to power saving on the system level.

|                    | Simulation results    | Measurement results   |
|--------------------|-----------------------|-----------------------|
| Quantization noise | $76.8~\mu\mathrm{V}$  | N/A                   |
| DA noise           | $171.1~\mu\mathrm{V}$ | N/A                   |
| 1st-stage noise    | $82.8~\mu\mathrm{V}$  | N/A                   |
| 2nd-stage noise    | $42.5~\mu\mathrm{V}$  | N/A                   |
| Total noise        | $209.4~\mu\mathrm{V}$ | $215.7~\mu\mathrm{V}$ |

Table 5.1: Noise Budgeting of the Proposed CT-SAR ADC

Table 5.1 shows the input referred noise breakdown of the prototype ADC. The measurement result matches well with the simulation result.
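The simulated total in Table 5.1 can be cross-checked with a short sketch, assuming the four contributions are uncorrelated so that they add as a root-sum-square. The dictionary name and keys are illustrative only; the numbers are the simulated entries of the table.

```python
import math

# Simulated input-referred noise contributions from Table 5.1, in microvolts.
sim_noise_uv = {
    "quantization": 76.8,
    "DA": 171.1,
    "1st-stage": 82.8,
    "2nd-stage": 42.5,
}

# Uncorrelated sources add in power, i.e. as a root-sum-square of the voltages.
total_uv = math.sqrt(sum(v ** 2 for v in sim_noise_uv.values()))

# Share of the total noise power contributed by each source.
power_share = {name: v ** 2 / total_uv ** 2 for name, v in sim_noise_uv.items()}

print(f"RSS total: {total_uv:.1f} uV")
for name, share in power_share.items():
    print(f"{name:>12s}: {100 * share:.1f}% of noise power")
```

Under this assumption the root-sum-square reproduces the 209.4 µV simulated total, and the DA accounts for the largest share of the noise power, in line with the budgeting strategy described above.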

## 5.7 Measurement Results

Fig. 5.13 shows the die photo of the prototype ADC in a 40-nm LP-CMOS process. The ADC core area is 0.01 mm<sup>2</sup>. The supply voltage used by the 1st-stage CT SAR and the DA is set to 1.1 V to enhance the operation speed and support a wide input signal swing. The 2nd-stage DT SAR uses a supply voltage of 0.7 V to reduce the power consumption. At the sampling rate of 2 MS/s, the ADC consumes a total of 25.2 $\mu$W, of which the 1st-stage CT SAR, the DA, and the 2nd-stage DT SAR consume 7.1 $\mu$W, 12.2 $\mu$W, and 5.9 $\mu$W, respectively. Inside the 1st-stage CT SAR, the comparator, the digital circuits, and the DAC consume 2.4 $\mu$W, 3 $\mu$W, and 1.7 $\mu$W, respectively. Inside the 2nd-stage DT SAR, the comparator, the digital circuits, and the DAC consume 1.9 $\mu$W, 1.9 $\mu$W, and 2.1 $\mu$W, respectively. Fig. 5.14(a) and (b) show the measured spectra with a low-frequency input signal and a near-Nyquist-rate input signal, respectively. With a full-swing input at 100 kHz, the measured SNDR and SFDR are 73.5 and 87.8 dB, respectively. With a 0-dBFS near-Nyquist-rate input of 950 kHz, the measured SNDR and SFDR are 71.7 and 80.1 dB, respectively. As mentioned in Section 5.5, foreground calibration is performed for both the 1st- and the 2nd-stage capacitor mismatches, without which the SNDR and SFDR would be limited to 60 dB and 65.4 dB, respectively. The gain and the offset of the DA, and the offset of the 1st-stage comparator, are also calibrated. In addition, the 1st-stage SAR logic delay is calibrated to minimize the time difference between the end of the 1st-stage SAR cycles and the DA starting edge.
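The power figures quoted above can be tallied with a small sketch to confirm that the sub-block numbers add up to the stage totals and that the stage totals add up to 25.2 µW. The variable names are illustrative, and the sub-block lists are kept unlabeled since only their sums are checked.

```python
# Stage-level power in microwatts, as quoted in the text.
stage_power_uw = {
    "1st-stage CT SAR": 7.1,
    "DA": 12.2,
    "2nd-stage DT SAR": 5.9,
}

# Sub-block power inside each SAR stage, in microwatts, in the quoted order.
first_stage_blocks_uw = [2.4, 3.0, 1.7]
second_stage_blocks_uw = [1.9, 1.9, 2.1]

total_uw = sum(stage_power_uw.values())
first_stage_sum_uw = sum(first_stage_blocks_uw)
second_stage_sum_uw = sum(second_stage_blocks_uw)

# Fraction of the total power spent in each stage.
stage_share = {name: p / total_uw for name, p in stage_power_uw.items()}

print(f"total: {total_uw:.1f} uW")
for name, share in stage_share.items():
    print(f"{name:>18s}: {100 * share:.1f}%")
```

The DA takes close to half of the total power in this tally, which fits the noise budget in which the DA is allowed to dominate.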

Fig. 5.15 shows the measured SNDR and SFDR versus the input frequency. Fig. 5.16 shows the input amplitude sweep. The measured dynamic range is 73.6 dB. These measurement results show that the prototype works properly as a Nyquist-rate ADC over various input amplitude and frequency settings.

## 5.8 Comparison to Other ADC Works

Table I provides the performance summary and compares this work with other state-of-the-art designs. The input capacitor of this work is orders of magnitude smaller than those of other designs with similar SNDR, which is made possible by the CT front-end with sampling noise suppression. Its chip area of 0.01 mm<sup>2</sup> is also significantly smaller than the others, due to the reduced capacitor size. The Walden and Schreier figures of merit (FoM) with the Nyquist-frequency input are 3.9 fJ/conversion-step and 177.8 dB, respectively, and are in line with the state of the art.
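The figures of merit can be approximately reproduced from the measured numbers with the sketch below. It assumes the commonly used definitions, $FoM_W = P / (2^{ENOB} \cdot f_s)$ and $FoM_S = SNDR + 10\log_{10}(BW/P)$ with $BW = f_s/2$ and $ENOB = (SNDR - 1.76)/6.02$; the text does not spell out the exact definitions, so small differences in the last digit are expected from rounding of the inputs.

```python
import math

def enob(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02

def walden_fom_fj(power_w, fs_hz, sndr_db):
    """Walden FoM in fJ per conversion step (assumed definition)."""
    return power_w / (2 ** enob(sndr_db) * fs_hz) * 1e15

def schreier_fom_db(power_w, fs_hz, sndr_db):
    """Schreier FoM in dB with bandwidth fs/2 (assumed definition)."""
    return sndr_db + 10 * math.log10((fs_hz / 2) / power_w)

# Measured values quoted in this chapter.
POWER_W = 25.2e-6
FS_HZ = 2e6
SNDR_NYQ_DB = 71.7

enob_nyq = enob(SNDR_NYQ_DB)
fom_w = walden_fom_fj(POWER_W, FS_HZ, SNDR_NYQ_DB)
fom_s = schreier_fom_db(POWER_W, FS_HZ, SNDR_NYQ_DB)

print(f"ENOB:  {enob_nyq:.2f} bit")
print(f"FoM_W: {fom_w:.2f} fJ/conversion-step")
print(f"FoM_S: {fom_s:.2f} dB")
```

Both results land within a few percent of the reported 3.9 fJ/conversion-step and 177.8 dB.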

## 5.9 Summary

This chapter presented a two-step ADC architecture with a 1st-stage CT SAR. By removing the S/H circuit, the proposed ADC breaks the seemingly fundamental tradeoff between the input capacitor size and the sampling noise. The CT SAR conversion error is minimized by adding redundancy and accelerating the SAR speed. With the substantial reduction in the input capacitor size, it is envisioned that the power, area, and design complexity of the ADC driver and the reference buffer can be relaxed, leading to significant benefits on the system level.



Figure 5.7: (a) DA schematic; (b) timing diagram; (c) window function h(t); (d) DA response  $H(\omega)$ ; (e) equivalent time-domain amplified signal point.



Figure 5.8: Proposed CT-SAR-assisted two-step ADC: (a) top-level schematic; (b) dynamic SAR logic; (c) timing diagram.



Figure 5.9: Maximum residue voltage  $\max(V_{res})$  as a function of input signal frequency and amplitude.



Figure 5.10: Proposed floating inverter based (FIB) dynamic amplifier (DA): (a) schematic; (b) timing diagram and waveforms for key circuit nodes.



Figure 5.11: Proposed FIB-DA: (a) simulated  $g_m(t)$ ; (b) frequency response.



Figure 5.12: Noise linear model for (a) conventional DT-SAR ADC; (b) proposed CT-SAR ADC.



Figure 5.13: Chip photo.



Figure 5.14: Measured spectrum with (a) 100-kHz and (b) 950-kHz input.



Figure 5.15: Measured SNDR and SFDR versus input frequency.



Figure 5.16: Measured SNDR and SFDR versus input amplitude.

| Architecture | SAR | SAR | Pipeline | Pipe-SAR | CT pipeline | CT pipeline | CT two-step SAR (this work) |
|---|---|---|---|---|---|---|---|
| kT/C noise attenuated | × | × | × | × | ✓ | ✓ | ✓ |
| 1<sup>st</sup>-stage input capacitance (pF) | 9 | 16 | 3.2 | 4 | Resistive | Resistive | 0.12 |
| Area (mm<sup>2</sup>) | 0.18 | 0.32 | 1 | 0.054 | 1.9 | 5.1 | 0.01 |
| Technology | 40nm | 40nm | 180nm | 65nm | 180nm | 28nm | 40nm LP |
| Resolution (bits) | 14 | 15 | 15 | 13 | 11 | N/A | 13 |
| f<sub>s</sub> (MS/s) | 0.032 | 0.02 | 10 | 50 | 26 | 9000 | 2 |
| Input Swing (V<sub>pp,diff</sub>) | 1.6 | 1.8 | 2.5 | 2.4 | 3.6 | N/A | 2.5 |
| SFDR<sub>NYQ</sub> (dB) | 78.5 | 95.1 | 95.4 | 84.6 | 67 | 73 | 79.4 |
| SNDR<sub>NYQ</sub> (dB) | 69.7 | 74.1 | 76.8 | 70.9 | 61.1 | 68 | 71.7 |
| ENOB<sub>NYQ</sub> (bit) | 11.3 | 12 | 12.5 | 11.5 | 9.9 | 11 | 11.6 |
| Total Power (µW) | 0.352 | 1.17 | 5100 | 1000 | 26700 | 2330000 | 25.2 |
| FoM<sub>w</sub> (fJ/step) | 4.4 | 14.1 | 45 | 6.9 | 1113 | 495 | 3.9 |
| FoM<sub>s</sub> (dB) | 176.3 | 173.4 | 170 | 174.9 | 148 | 160 | 177.8 |
| [4] [5] [6] [14] [15] <b>This work</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | This work       | [15]        | [14]        | [6]      | [5]      | [4]   | [1]   | Specifications                                   |

Table 5.2: Performance Summary and Comparison with State-of-the-art ADCs

# Chapter 6

# **Conclusion and Future Directions**

## 6.1 Conclusion

This thesis mainly presents two techniques to improve the performance of front-end low-noise amplifiers (LNAs) and SAR ADCs. One is the inverter-stacking technique, which utilizes the unused voltage headroom to improve the current-use efficiency and the overall power efficiency of LNAs. The other is the kT/C noise attenuation technique, which effectively decouples the sampling noise from the sampling capacitor size and thus significantly improves both power and area efficiency. To validate the effectiveness of these two techniques, three chips have been fabricated.

Chapter 3 discusses the inverter-stacking technique for LNA design. It presents a novel power-efficient inverter-stacking amplifier that achieves 6-times current reuse under a 1 V supply and, to the best of the author's knowledge, obtains the best NEF among all reported amplifiers. By splitting the feedback capacitors, the required input AC coupling is realized without extra hardware cost. A simple replica-based biasing circuit is devised to ensure robust operation across PVT variations. The amplifier is well suited as a front-end amplifier for applications with stringent power or energy requirements, such as biomedical implants and wireless sensors.

Chapter 4 discusses an improved implementation of the inverter-stacking amplifier. It presents a novel low-noise instrumentation amplifier with high power efficiency. The measurement results demonstrate the feasibility of operating OTAs with high power efficiency under a globally low supply voltage. The design trade-offs among supply voltage, power, noise, gain, linearity, and output swing are carefully addressed with chopping, a tail-less inverter-stacking input stage, a dominant-pole-compensated three-stage structure, and a high-swing class-AB output stage. Moreover, the CM-pre-filtering biasing loop and the high-impedance source degeneration that naturally comes with the inverter-stacking structure enhance the CMRR. Overall, the proposed LNCIA shows consistent, better-than-bipolar PEF performance across a wide temperature and supply voltage range. It is a good fit for applications where a low supply voltage and high power efficiency are desired.
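The NEF and PEF figures quoted for the two amplifiers are tied together by the supply voltage. As a minimal sketch, assuming the commonly used definitions NEF = V_ni,rms · sqrt(2·I_tot / (π·U_T·4kT·BW)) and PEF = NEF² · V_DD, the conversion can be checked numerically. The function names and the 300 K default temperature are illustrative assumptions, not values taken from the measured designs, and the 0.96 PEF is assumed to be quoted at the 0.6 V supply.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage U_T = kT/q in volts (300 K is an assumed default)."""
    return K_B * temp_k / Q_E


def nef(v_ni_rms, i_total, bandwidth_hz, temp_k=300.0):
    """Noise efficiency factor from input-referred rms noise (V),
    total supply current (A), and amplifier bandwidth (Hz)."""
    u_t = thermal_voltage(temp_k)
    denom = math.pi * u_t * 4.0 * K_B * temp_k * bandwidth_hz
    return v_ni_rms * math.sqrt(2.0 * i_total / denom)


def pef(nef_value, vdd):
    """Power efficiency factor: PEF = NEF^2 * VDD."""
    return nef_value ** 2 * vdd


def nef_from_pef(pef_value, vdd):
    """Invert PEF = NEF^2 * VDD to recover the NEF at a given supply."""
    return math.sqrt(pef_value / vdd)


if __name__ == "__main__":
    # Chapter 3 amplifier: NEF of 1.07 at a 1 V supply.
    print(f"PEF at 1 V, NEF 1.07: {pef(1.07, 1.0):.3f}")
    # Chapter 4 amplifier: PEF of 0.96, assumed to be quoted at 0.6 V.
    print(f"NEF at 0.6 V, PEF 0.96: {nef_from_pef(0.96, 0.6):.3f}")
```

Under these definitions, lowering the supply reduces PEF even when NEF is slightly higher, which is why the two amplifiers are better compared by PEF than by NEF alone.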

Chapter 5 introduces a novel kT/C noise reduction technique for SAR ADCs. It presents a two-step ADC architecture with a 1st-stage CT SAR. By removing the S/H circuit, the proposed ADC breaks the seemingly fundamental tradeoff between the input capacitor size and the sampling noise. The CT SAR conversion error is minimized by adding redundancy and accelerating the SAR conversion. With the substantial reduction in the input capacitor size, it is envisioned that the power, area, and design complexity of the ADC driver and the reference buffer can be relaxed, leading to significant benefits at the system level.
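To put the tradeoff in numbers: in a conventional sampled front end the rms sampling noise is sqrt(kT/C), so the target noise floor fixes the minimum input capacitor. The sketch below is illustrative only. It assumes a temperature of 300 K and a hypothetical 1 V peak-to-peak single-ended full scale, neither of which is taken from the prototype. It evaluates the noise a 120 fF capacitor would contribute if it were sampled directly, and the capacitance a conventional S/H would need for a given thermal-noise-limited SNR.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise_vrms(cap_farads, temp_k=300.0):
    """RMS sampling noise sqrt(kT/C) of a capacitor, in volts."""
    return math.sqrt(K_B * temp_k / cap_farads)


def sine_power(v_fs_pp):
    """Power of a full-scale sine with peak-to-peak amplitude v_fs_pp."""
    return (v_fs_pp / 2.0) ** 2 / 2.0


def cap_for_snr(snr_db, v_fs_pp, temp_k=300.0):
    """Smallest capacitor whose kT/C noise alone still meets snr_db
    for a full-scale sine input (single-ended, other noise ignored)."""
    noise_power = sine_power(v_fs_pp) / (10.0 ** (snr_db / 10.0))
    return K_B * temp_k / noise_power


if __name__ == "__main__":
    # 120 fF is the input capacitance reported for the prototype.
    v_n = ktc_noise_vrms(120e-15)
    print(f"kT/C noise of 120 fF: {v_n * 1e6:.1f} uV rms")  # about 186 uV rms

    # Hypothetical target: 74 dB SNR with an assumed 1 V peak-to-peak full scale.
    c_min = cap_for_snr(74.0, 1.0)
    print(f"Capacitance for 74 dB: {c_min * 1e12:.2f} pF")
```

Because the required capacitance grows by a factor of four for every 6 dB of additional SNR, removing the front-end sampling operation matters most for high-resolution designs.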

## 6.2 Future Directions

There are three main future directions following this thesis: refining the techniques themselves, applying the core ideas to other circuits and systems, and expanding the design scope.

The first direction is to refine the techniques to further improve the block-level power efficiency. The proposed inverter-stacking technique improves the power and current efficiency of LNAs. Nevertheless, several other properties (e.g., the high output impedance) should also be taken into consideration when the technique is applied in real products or real-life measurements. The proposed CT-SAR-assisted kT/C noise reduction technique effectively decouples the sampling noise from the sampling capacitor size, which allows the use of a small sampling capacitance in high-resolution Nyquist designs. The prototype has demonstrated the effectiveness of the proposed technique. However, there is still considerable room to improve the design. For instance, the use of a high-speed 1st-stage CT SAR ultimately limits the bandwidth. It is desirable to look for an optimal parameter design, or even a better ADC architecture, to extend the bandwidth.

The second direction is to apply the core ideas to other circuits and systems. Both the inverter-stacking technique and the CT-SAR ADC have several other attractive properties. For the inverter-stacking technique, the core idea can be summarized as more current reuse. Currently the reuse happens only within the block; one possible extension is current reuse among LNAs in different channels or among different blocks. The potential benefits are higher area efficiency and higher system-level current efficiency. For the CT-SAR ADC, one benefit that has not been fully explored is that it delays the sampling instant until the LSB cycle. This means that the signal tracking is almost real-time, which leads to a much smaller delay. This can be useful when the ADC is used as the quantizer in a CT DSM, where it could potentially avoid the need for ELD compensation, leading to a simpler circuit architecture and less design effort.

The last direction is to expand the design scope. It is always exciting to design a system for real-life measurement. Therefore, we should not focus only on the core building blocks. Instead, co-designing the core blocks and their peripheral circuits can lead to higher system-level power and area efficiency. For instance, the ADC input buffers and reference buffers need to be developed together with the ADC, since every ADC can impose different requirements on the buffers, including bandwidth, duty cycle, etc.

Appendices

# Appendix A

# List of Publications

## A.1 Patent

- Linxiao Shen and Nan Sun, "Inverter stacking amplifier," US application number 62/514,684 (pending).

## A.2 Conference Papers

- Linxiao Shen, Yi Shen, Xiyuan Tang, Chen-Kai Hsu, Wei Shi, Shaolan Li, Wenda Zhao, and Nan Sun, A 0.01mm<sup>2</sup> 25μW 2MS/s 74dB-SNDR Continuous-Time Pipelined-SAR ADC with 120fF Input Capacitor, IEEE International Solid-State Circuits Conference (ISSCC), 2019.
- Linxiao Shen, Abhishek Mukherjee, Shaolan Li, Xiyuan Tang, Nanshu Lu, and Nan Sun, A 0.6-V Tail-Less Inverter Stacking Amplifier with 0.96 PEF, IEEE Symposium on VLSI Circuits (VLSI), June 2019 (accepted).
- Linxiao Shen, Nanshu Lu, and Nan Sun, A 1V 0.25μW Inverter-Stacking Amplifier with 1.07 Noise Efficiency Factor, IEEE Symposium on VLSI Circuits (VLSI), pp. C140-C141, 2017.
- Xiyuan Tang, Shaolan Li, Linxiao Shen, Wenda Zhao, Xiangxing Yang, Randy Williams, Jiaxin Liu, Zhichao Tan, Neal Hall, and Nan Sun, A 16fJ/conversion-step Time-Domain Incremental Zoom Capacitance-to-Digital Converter, IEEE International Solid-State Circuits Conference (ISSCC), 2019.
- Xiyuan Tang, Begum Kasap, Linxiao Shen, Xiangxing Yang, Wei Shi, and Nan Sun, An Energy-Efficient Comparator with Dynamic Floating Inverter Pre-Amplifier, IEEE Symposium on VLSI Circuits (VLSI), June 2019 (accepted).
- Biying Xu, Yibo Lin, Xiyuan Tang, Shaolan Li, Linxiao Shen, Nan Sun, and David Z. Pan, WellGAN: Generative-Adversarial-Network-Guided Well Generation for Analog/Mixed-Signal Circuit Layout, ACM/IEEE Design Automation Conference (DAC), Las Vegas, NV, Jun. 2-6, 2019 (accepted).
- Biying Xu, Shaolan Li, Chak-Wa Pui, Derong Liu, Linxiao Shen, Yibo Lin, Nan Sun, and David Z. Pan, Device Layer-Aware Analytical Placement for Analog Circuits, IEEE International Symposium on Physical Design (ISPD), 2019 (accepted).
- Shaolan Li, Wenda Zhao, Biying Xu, Xiangxing Yang, Xiyuan Tang, Linxiao Shen, Nanshu Lu, David Pan, and Nan Sun, A 0.025-mm<sup>2</sup> 0.8-V 78.5dB-SNDR VCO-based Sensor Readout Circuit in a Hybrid PLL-ΔΣM Structure, IEEE Custom Integrated Circuits Conference (CICC), April 2019 (accepted).
- Xiyuan Tang, Yi Shen, Linxiao Shen, Wenda Zhao, Zhangming Zhu, Visvesh Sathe, and Nan Sun, A 10b 120MS/s SAR ADC with Reference Ripple Cancellation Technique, IEEE Custom Integrated Circuits Conference (CICC), April 2019 (accepted).
- Yi Zhong, Shaolan Li, Arindam Sanyal, Xiyuan Tang, Linxiao Shen, Siliang Wu, and Nan Sun, A Second-Order Purely VCO-Based CT ΔΣ ADC using a Modified DPLL in 40nm CMOS, IEEE Asian Solid-State Circuits Conference (A-SSCC), Tainan, pp. 93-94, 2018.
- Hyoyoung Jeong, Taewoo Ha, Irene Kuang, Linxiao Shen, Zhaohe Dai, Nan Sun, and Nanshu Lu, NFC-Enabled, Tattoo-Like Stretchable Biosensor Manufactured by Cut-and-Paste Method, 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2017.

## A.3 Journal Papers

- Linxiao Shen, Yi Shen, Zhelu Li, Wei Shi, Xiyuan Tang, Shaolan Li, Wenda Zhao, Mantian Zhang, Zhangming Zhu, and Nan Sun, A Two-Step ADC with a Continuous-Time SAR Based First Stage, IEEE Journal of Solid-State Circuits (ISSCC invited), accepted.
- Linxiao Shen, Nanshu Lu, and Nan Sun, A 1-V 0.25-μW Inverter Stacking Amplifier with 1.07 Noise Efficiency Factor, IEEE Journal of Solid-State Circuits, vol. 53, no. 3, pp. 896-905, Mar. 2018.
- Hyoyoung Jeong, Liu Wang, Taewoo Ha, Ruchika Mitbander, Xiangxing Yang, Zhaohe Dai, Shutao Qiao, Linxiao Shen, Nan Sun, and Nanshu Lu, Modular and Reconfigurable Wireless E-Tattoos for Personalized Sensing, Advanced Materials Technologies, accepted.
- Abhishek Mukherjee, Miguel Gandara, Biying Xu, Shaolan Li, Linxiao Shen, Xiyuan Tang, David Pan, and Nan Sun, A 1 GS/s 20 MHz-BW Capacitive-Input Continuous Time ADC Using a Novel Parasitic Pole-Mitigated Fully Differential VCO, IEEE Solid-State Circuits Letters, accepted.

## A.4 Journal Papers Under Review

- Xiyuan Tang, Linxiao Shen, Begum Kasap, Xiangxing Yang, Wei Shi, Abhishek Mukherjee, David Pan, and Nan Sun, "An Energy-Efficient Comparator with Dynamic Floating Inverter Pre-Amplifier", submitted to IEEE Journal of Solid-State Circuits.
- Yi Shen, Xiyuan Tang, Linxiao Shen, Wenda Zhao, Xin Xin, Shubin Liu, Zhangming Zhu, Visvesh S. Sathe, and Nan Sun, "A 10-bit 120-MS/s SAR ADC with Reference Ripple Cancellation Technique", submitted to IEEE Journal of Solid-State Circuits.
- Yi Zhong, Shaolan Li, Xiyuan Tang, Linxiao Shen, Wenda Zhao, Siliang Wu, and Nan Sun, "A Second-Order Purely VCO-Based CT ΔΣ ADC Using a Modified DPLL Structure in 40-nm CMOS", submitted to IEEE Journal of Solid-State Circuits.
- Wenda Zhao, Shaolan Li, Biying Xu, Xiangxing Yang, Xiyuan Tang, Linxiao Shen, Nanshu Lu, David Z. Pan, and Nan Sun, "A 0.025-mm<sup>2</sup> 0.8-V 78.5-dB SNDR VCO-based Sensor Readout Circuit in a Hybrid PLL-ΔΣM Structure", submitted to IEEE Journal of Solid-State Circuits.

## A.5 Journal Papers in Preparation

- Linxiao Shen, Abhishek Mukherjee, Nanshu Lu, and Nan Sun, "A 0.6-V Tail-Less Inverter Stacking Amplifier with 0.96 PEF and Chopping," to be submitted to IEEE Journal of Solid-State Circuits (JSSC).
- Linxiao Shen, Xiyuan Tang, Wei Shi, and Nan Sun, "A Reset-Free Dynamic Comparator," to be submitted to IEEE Solid-State Circuits Letters (SSC-L).

# Appendix B

# My Appendix #2

## B.1 The First Section

| Course | Semester | Institution | Instructor | Grade |
|--------|----------|-------------|------------|-------|
| VLSI I | Fall 2014 | UT-Austin | Dr. David Pan | A |
| Analog Integrated Circuit Design | Fall 2014 | UT-Austin | Dr. Nan Sun | A |
| Radio Freq Integrated Circuit Design | Spring 2015 | UT-Austin | Dr. Ranjit Gharpurey | A |
| Data Converter | Spring 2015 | UT-Austin | Dr. Nan Sun | -A- |
| Biomedical Instrumentation I | Fall 2015 | UT-Austin | Dr. John Pearse | -A- |
| VLSI II | Spring 2016 | UT-Austin | Dr. Mark McDermott | A |
| High-speed Comp Arithmetic | Spring 2017 | UT-Austin | Dr. Earl Swartzlander | A |
| Analog Filters/Oversampling Converters | Spring 2019 | UT-Austin | Dr. Ramin Zanbaghi | A |
| Supporting Work | | | | |
| Nanoscale devices and technology | Fall 2015 | UT-Austin | Dr. Jack Lee | -A- |
| Power Electronics Devices and System | Fall 2016 | UT-Austin | Dr. Mark Flynn | A |


## Bibliography

- [1] Youngcheol Chae. Energy-Efficient Inverter-Based Amplifiers, pages 297–314. Springer International Publishing, Cham, 2019.
- [2] Wikipedia contributors. Internet of things. Wikipedia, the free encyclopedia, 2019. [Online; accessed 27-August-2019].
- [3] P. Harpe, E. Cantatore, and A. van Roermund. 11.1 An oversampled 12/14b SAR ADC with noise reduction and linearity enhancements achieving up to 79.1dB SNDR. In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pages 194–195, Feb 2014.
- [4] B. Verbruggen, J. Tsouhlarakis, T. Yamamoto, M. Iriguchi, E. Martens, and J. Craninckx. A 60 dB SNDR 35 MS/s SAR ADC With Comparator-Noise-Based Stochastic Residue Estimation. *IEEE Journal of Solid-State Circuits*, 50(9):2002–2011, Sep. 2015.
- [5] L. Chen, X. Tang, A. Sanyal, Y. Yoon, J. Cong, and N. Sun. A 0.7-V 0.6μW 100-kS/s Low-Power SAR ADC With Statistical Estimation-Based Noise Reduction. *IEEE Journal of Solid-State Circuits*, 52(5):1388–1398, May 2017.

- [6] M. Shim, S. Jeong, P. D. Myers, S. Bang, J. Shen, C. Kim, D. Sylvester, D. Blaauw, and W. Jung. Edge-Pursuit Comparator: An Energy-Scalable Oscillator Collapse-Based Comparator With Application in a 74.1 dB SNDR and 20 kS/s 15 b SAR ADC. *IEEE Journal of Solid-State Circuits*, 52(4):1077–1090, April 2017.
- [7] B. Hershberg, S. Weaver, K. Sobue, S. Takeuchi, K. Hamashita, and U. Moon. Ring amplifiers for switched-capacitor circuits. In 2012 IEEE International Solid-State Circuits Conference, pages 460–462, Feb 2012.
- [8] Y. Lim and M. P. Flynn. A 1 mW 71.5 dB SNDR 50 MS/s 13 bit Fully Differential Ring Amplifier Based SAR-Assisted Pipeline ADC. *IEEE Journal of Solid-State Circuits*, 50(12):2901–2911, Dec 2015.
- [9] H. Huang, H. Xu, B. Elies, and Y. Chiu. A Non-Interleaved 12-b 330-MS/s Pipelined-SAR ADC With PVT-Stabilized Dynamic Amplifier Achieving Sub-1-dB SNDR Variation. *IEEE Journal of Solid-State Circuits*, 52(12):3235–3247, Dec 2017.
- [10] H. Li, M. Maddox, M. C. W. Coin, W. Buckley, D. Hummerston, and N. Naeem. A signal-independent background-calibrating 20b 1MS/S SAR ADC with 0.3ppm INL. In 2018 IEEE International Solid-State Circuits Conference - (ISSCC), pages 242–244, Feb 2018.
- [11] B. Verbruggen, K. Deguchi, B. Malki, and J. Craninckx. A 70 dB SNDR 200 MS/s 2.3 mW dynamic pipelined SAR ADC in 28nm digital CMOS. In 2014 Symposium on VLSI Circuits Digest of Technical Papers, pages 1–2, June 2014.

- [12] M. Ding, P. Harpe, Y. Liu, B. Busze, K. Philips, and H. de Groot. 26.2 A 5.5fJ/conv-step 6.4MS/S 13b SAR ADC utilizing a redundancy-facilitated background error-detection-and-correction scheme. In 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, pages 1–3, Feb 2015.
- [13] Y. Shu and B. Song. A 15-bit Linear 20-MS/s Pipelined ADC Digitally Calibrated With Signal-Dependent Dithering. *IEEE Journal of Solid-State Circuits*, 43(2):342–350, Feb 2008.
- [14] Y. Zhou, B. Xu, and Y. Chiu. A 12 bit 160 MS/s Two-Step SAR ADC With Background Bit-Weight Calibration Using a Time-Domain Proximity Detector. *IEEE Journal of Solid-State Circuits*, 50(4):920–931, April 2015.
- [15] A. Bannon, C. P. Hurrell, D. Hummerston, and C. Lyden. An 18 b 5 MS/s SAR ADC with 100.2 dB dynamic range. In 2014 Symposium on VLSI Circuits Digest of Technical Papers, pages 1–2, June 2014.
- [16] R. F. Yazicioglu, P. Merken, R. Puers, and C. Van Hoof. A 60 μW 60 nV/√Hz Readout Front-End for Portable Biopotential Acquisition Systems. In 2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers, pages 109–118, Feb 2006.

- [17] N. Verma, A. Shoeb, J. Bohorquez, J. Dawson, J. Guttag, and A. P. Chandrakasan. A micro-power eeg acquisition soc with integrated feature extraction processor for a chronic seizure detection system. *IEEE Journal of Solid-State Circuits*, 45(4):804–816, April 2010.
- [18] W. Chen, H. Chiueh, T. Chen, C. Ho, C. Jeng, M. Ker, C. Lin, Y. Huang, C. Chou, T. Fan, M. Cheng, Y. Hsin, S. Liang, Y. Wang, F. Shaw, Y. Huang, C. Yang, and C. Wu. A fully integrated 8-channel closed-loop neural-prosthetic cmos soc for real-time epileptic seizure control. *IEEE Journal of Solid-State Circuits*, 49(1):232–247, Jan 2014.
- [19] T. Wang, M. Lai, C. M. Twigg, and S. Peng. A fully reconfigurable low-noise biopotential sensing amplifier with 1.96 noise efficiency factor. *IEEE Transactions on Biomedical Circuits and Systems*, 8(3):411–422, June 2014.
- [20] R. F. Yazicioglu, S. Kim, T. Torfs, H. Kim, and C. Van Hoof. A 30μW Analog Signal Processor ASIC for Portable Biopotential Signal Monitoring. *IEEE Journal of Solid-State Circuits*, 46(1):209–223, Jan 2011.
- [21] S. Chang, K. AlAshmouny, M. McCormick, Y. Chen, and E. Yoon. Biobolt: A minimally-invasive neural interface for wireless epidural recording by intra-skin communication. In 2011 Symposium on VLSI Circuits - Digest of Technical Papers, pages 146–147, June 2011.
- [22] H. Bhamra, Y. Kim, J. Joseph, J. Lynch, O. Z. Gall, H. Mei, C. Meng, J. Tsai, and P. Irazoqui. A 24 μW, batteryless, crystal-free, multinode synchronized soc bionode for wireless prosthesis control. *IEEE Journal of Solid-State Circuits*, 50(11):2714–2727, Nov 2015.

- [23] H. Jeong, T. Ha, I. Kuang, L. Shen, Z. Dai, N. Sun, and N. Lu. NFC-enabled, tattoo-like stretchable biosensor manufactured by cut-and-paste method. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 4094–4097, July 2017.
- [24] R. R. Harrison, P. T. Watkins, R. J. Kier, R. O. Lovejoy, D. J. Black, B. Greger, and F. Solzbacher. A low-power integrated circuit for a wireless 100-electrode neural recording system. *IEEE Journal of Solid-State Circuits*, 42(1):123–133, Jan 2007.
- [25] Hyoyoung Jeong, Liu Wang, Taewoo Ha, Ruchika Mitbander, Xiangxing Yang, Zhaohe Dai, Shutao Qiao, Linxiao Shen, Nan Sun, and Nanshu Lu. Modular and reconfigurable wireless e-tattoos for personalized sensing. Advanced Materials Technologies, 4(8):1900117, 2019.
- [26] F. Zhang, J. Holleman, and B. P. Otis. Design of ultra-low power biopotential amplifiers for biosignal acquisition applications. *IEEE Transactions on Biomedical Circuits and Systems*, 6(4):344–355, Aug 2012.
- [27] D. B. Shire, S. K. Kelly, J. Chen, P. Doyle, M. D. Gingerich, S. F. Cogan, W. A. Drohan, O. Mendoza, L. Theogarajan, J. L. Wyatt, and J. F. Rizzo. Development and implantation of a minimally invasive wireless subretinal neurostimulator. *IEEE Transactions on Biomedical Engineering*, 56(10):2502–2511, Oct 2009.

- [28] T. Denison, K. Consoer, W. Santa, A. Avestruz, J. Cooley, and A. Kelly. A 2 μW 100 nV/rtHz Chopper-Stabilized Instrumentation Amplifier for Chronic Measurement of Neural Field Potentials. *IEEE Journal of Solid-State Circuits*, 42(12):2934–2945, Dec 2007.
- [29] S. Song, M. Rooijakkers, P. Harpe, C. Rabotti, M. Mischi, A. H. M. van Roermund, and E. Cantatore. A Low-Voltage Chopper-Stabilized Amplifier for Fetal ECG Monitoring With a 1.41 Power Efficiency Factor. *IEEE Transactions on Biomedical Circuits and Systems*, 9(2):237–247, April 2015.
- [30] R. Muller, H. Le, W. Li, P. Ledochowitsch, S. Gambini, T. Bjorninen, A. Koralek, J. M. Carmena, M. M. Maharbiz, E. Alon, and J. M. Rabaey. A minimally invasive 64-channel wireless ecog implant. *IEEE Journal of Solid-State Circuits*, 50(1):344–359, Jan 2015.
- [31] B. Johnson and A. Molnar. An Orthogonal Current-Reuse Amplifier for Multi-Channel Sensing. *IEEE Journal of Solid-State Circuits*, 48(6):1487–1496, June 2013.
- [32] Yen-Po Chen, D. Blaauw, and D. Sylvester. A 266nW multi-chopper amplifier with 1.38 noise efficiency factor for neural signal recording. In 2014 Symposium on VLSI Circuits Digest of Technical Papers, pages 1–2, June 2014.

- [33] F. M. Yaul and A. P. Chandrakasan. A sub-μW 36nV/√Hz chopper amplifier for sensors using a noise-efficient inverter-based 0.2V-supply input stage. In 2016 IEEE International Solid-State Circuits Conference (ISSCC), pages 94–95, Jan 2016.
- [34] M. Maruyama, S. Taguchi, M. Yamanoue, and K. Iizuka. An Analog Front-End for a Multifunction Sensor Employing a Weak-Inversion Biasing Technique With 26 nVrms, 25 aCrms, and 19 fArms Input-Referred Noise. *IEEE Journal of Solid-State Circuits*, 51(10):2252–2261, Oct 2016.
- [35] L. Shen, N. Lu, and N. Sun. A 1v 0.25uw inverter-stacking amplifier with 1.07 noise efficiency factor. In 2017 Symposium on VLSI Circuits, pages C140–C141, June 2017.
- [36] R. Muller, S. Gambini, and J. M. Rabaey. A 0.013mm<sup>2</sup>, 5μW, dc-coupled neural signal acquisition ic with 0.5 v supply. *IEEE Journal of Solid-State Circuits*, 47(1):232–243, Jan 2012.
- [37] R. Mohan, S. Zaliasl, G. G. E. Gielen, C. Van Hoof, R. F. Yazicioglu, and N. Van Helleputte. A 0.6-v, 0.015-mm2, time-based ecg readout for ambulatory applications in 40-nm cmos. *IEEE Journal of Solid-State Circuits*, 52(1):298–308, Jan 2017.
- [38] M. S. J. Steyaert and W. M. C. Sansen. A micropower low-noise monolithic instrumentation amplifier for medical purposes. *IEEE Journal of Solid-State Circuits*, 22(6):1163–1168, Dec 1987.

- [39] L. Shen, N. Lu, and N. Sun. A 1-V 0.25-μW Inverter Stacking Amplifier With 1.07 Noise Efficiency Factor. *IEEE Journal of Solid-State Circuits*, 53(3):896–905, March 2018.
- [40] Willy M Sansen. Analog design essentials, volume 859. Springer Science & Business Media, 2007.
- [41] Qinwen Fan, Kofi A. A. Makinwa, and Johan H. Huijsing. Capacitively Coupled Chopper Amplifiers, pages 29–36. Springer International Publishing, Cham, 2017.
- [42] F. M. Yaul and A. P. Chandrakasan. A Noise-Efficient 36 nV/√Hz Chopper Amplifier Using an Inverter-Based 0.2-V Supply Input Stage. *IEEE Journal of Solid-State Circuits*, 52(11):3032–3042, Nov 2017.
- [43] Q. Fan, F. Sebastiano, J. H. Huijsing, and K. A. A. Makinwa. A 1.8μW 60 nV/√Hz Capacitively-Coupled Chopper Instrumentation Amplifier in 65 nm CMOS for Wireless Sensor Nodes. *IEEE Journal of Solid-State Circuits*, 46(7):1534–1543, July 2011.
- [44] D. Gubbins, B. Lee, P. K. Hanumolu, and U. Moon. Continuous-Time Input Pipeline ADCs. *IEEE Journal of Solid-State Circuits*, 45(8):1456–1468, Aug 2010.
- [45] H. Shibata, V. Kozlov, Z. Ji, A. Ganesan, H. Zhu, D. Paterson, J. Zhao, S. Patil, and S. Pavan. A 9-GS/s 1.125-GHz BW Oversampling Continuous-Time Pipeline ADC Achieving -164-dBFS/Hz NSD. *IEEE Journal of Solid-State Circuits*, 52(12):3219–3234, Dec 2017.

- [46] S. M. Chen and R. W. Brodersen. A 6-bit 600-MS/s 5.3-mW Asynchronous ADC in 0.13-μm CMOS. *IEEE Journal of Solid-State Circuits*, 41(12):2669–2680, Dec 2006.
- [47] V. Tripathi and B. Murmann. An 8-bit 450-MS/s single-bit/cycle SAR ADC in 65-nm CMOS. In 2013 Proceedings of the ESSCIRC (ESSCIRC), pages 117–120, Sep. 2013.
- [48] P. Harpe, C. Zhou, X. Wang, G. Dolmans, and H. de Groot. A 30fJ/conversion-step 8b 0-to-10MS/s asynchronous SAR ADC in 90nm CMOS. In 2010 IEEE International Solid-State Circuits Conference - (ISSCC), pages 388–389, Feb 2010.
- [49] M. Zhang, K. Noh, X. Fan, and E. Sánchez-Sinencio. A 0.8–1.2 V 10–50 MS/s 13-bit Subranging Pipelined-SAR ADC Using a Temperature-Insensitive Time-Based Amplifier. *IEEE Journal of Solid-State Circuits*, 52(11):2991–3005, Nov 2017.
- [50] J. Lin, D. Paik, S. Lee, M. Miyahara, and A. Matsuzawa. An Ultra-Low-Voltage 160 MS/s 7 Bit Interpolated Pipeline ADC Using Dynamic Amplifiers. *IEEE Journal of Solid-State Circuits*, 50(6):1399–1411, June 2015.

- [51] C. Liu and M. Huang. 28.1 A 0.46mW 5MHz-BW 79.7dB-SNDR noise-shaping SAR ADC with dynamic-amplifier-based FIR-IIR filter. In 2017 IEEE International Solid-State Circuits Conference (ISSCC), pages 466–467, Feb 2017.
- [52] L. Shen, Y. Shen, X. Tang, C. Hsu, W. Shi, S. Li, W. Zhao, A. Mukherjee, and N. Sun. 3.4 A 0.01mm2 25μW 2MS/s 74dB-SNDR Continuous-Time Pipelined-SAR ADC with 120fF Input Capacitor. In 2019 IEEE International Solid-State Circuits Conference - (ISSCC), pages 64–66, Feb 2019.
- [53] R. R. Harrison, P. T. Watkins, R. J. Kier, R. O. Lovejoy, D. J. Black, B. Greger, and F. Solzbacher. A Low-Power Integrated Circuit for a Wireless 100-Electrode Neural Recording System. *IEEE Journal of Solid-State Circuits*, 42(1):123–133, Jan 2007.
- [54] L. Shen, N. Lu, and N. Sun. A 1-V 0.25-μW Inverter Stacking Amplifier With 1.07 Noise Efficiency Factor. *IEEE Journal of Solid-State Circuits*, 53(3):896–905, March 2018.
- [55] W. Kim, H. Hong, Y. Roh, H. Kang, S. Hwang, D. Jo, D. Chang, M. Seo, and S. Ryu. A 0.6 V 12 b 10 MS/s Low-Noise Asynchronous SAR-Assisted Time-Interleaved SAR (SATI-SAR) ADC. *IEEE Journal of Solid-State Circuits*, 51(8):1826–1839, Aug 2016.
- [56] J. Guerber, H. Venkatram, M. Gande, A. Waters, and U. Moon. A 10b Ternary SAR ADC With Quantization Time Information Utilization. *IEEE Journal of Solid-State Circuits*, 47(11):2604–2613, Nov 2012.
- [57] C. C. Lee and M. P. Flynn. A SAR-Assisted Two-Stage Pipeline ADC. *IEEE Journal of Solid-State Circuits*, 46(4):859–869, April 2011.
- [58] H. Garvik, C. Wulff, and T. Ytterdal. An 11.0 bit ENOB, 9.8 fJ/conv.step noise-shaping SAR ADC calibrated by least squares estimation. In 2017 IEEE Custom Integrated Circuits Conference (CICC), pages 1–4, April 2017.
- [59] M. Gandara, P. Gulati, and N. Sun. A 172dB-FoM pipelined SAR ADC using a regenerative amplifier with self-timed gain control and mixed-signal background calibration. In 2017 IEEE Asian Solid-State Circuits Conference (A-SSCC), pages 297–300, Nov 2017.
- [60] S. Li, B. Qiao, M. Gandara, D. Z. Pan, and N. Sun. A 13-ENOB Second-Order Noise-Shaping SAR ADC Realizing Optimized NTF Zeros Using the Error-Feedback Structure. *IEEE Journal of Solid-State Circuits*, 53(12):3484–3496, Dec 2018.
- [61] M. S. Akter, K. A. A. Makinwa, and K. Bult. A Capacitively Degenerated 100-dB Linear 20-150 MS/s Dynamic Amplifier. *IEEE Journal of Solid-State Circuits*, 53(4):1115–1126, April 2018.
- [62] M. Gandara, W. Guo, X. Tang, L. Chen, Y. Yoon, and N. Sun. A pipelined SAR ADC reusing the comparator as residue amplifier. In 2017 IEEE Custom Integrated Circuits Conference (CICC), pages 1–4, April 2017.

- [63] J. Yoo, L. Yan, D. El-Damak, M. B. Altaf, A. Shoeb, H. Yoo, and A. Chandrakasan. An 8-channel scalable EEG acquisition SoC with fully integrated patient-specific seizure classification and recording processor. In 2012 IEEE International Solid-State Circuits Conference, pages 292–294, Feb 2012.
- [64] K. A. Ng and Y. P. Xu. A Low-Power, High CMRR Neural Amplifier System Employing CMOS Inverter-Based OTAs With CMFB Through Supply Rails. *IEEE Journal of Solid-State Circuits*, 51(3):724–737, March 2016.
- [65] D. Han, Y. Zheng, R. Rajkumar, G. S. Dawe, and M. Je. A 0.45 V 100-Channel Neural-Recording IC With Sub-μW/Channel Consumption in 0.18 μm CMOS. *IEEE Transactions on Biomedical Circuits and Systems*, 7(6):735–746, Dec 2013.
- [66] P. Harpe, H. Gao, R. v. Dommele, E. Cantatore, and A. H. M. van Roermund. A 0.20 mm<sup>2</sup> 3 nW Signal Acquisition IC for Miniature Sensor Nodes in 65 nm CMOS. *IEEE Journal of Solid-State Circuits*, 51(1):240–248, Jan 2016.
- [67] D. Luo, M. Zhang, and Z. Wang. Design of a 3.24μW, 39nV/√Hz chopper amplifier with 5.5Hz noise corner frequency for invasive neural signal acquisition. In 2018 IEEE Custom Integrated Circuits Conference (CICC), pages 1–4, April 2018.

- [68] X. Zou, X. Xu, L. Yao, and Y. Lian. A 1-V 450-nW Fully Integrated Programmable Biomedical Sensor Interface Chip. *IEEE Journal of Solid-State Circuits*, 44(4):1067–1077, April 2009.
- [69] J. Xu, B. Büsze, C. Van Hoof, K. A. A. Makinwa, and R. F. Yazicioglu. A 15-Channel Digital Active Electrode System for Multi-Parameter Biopotential Measurement. *IEEE Journal of Solid-State Circuits*, 50(9):2090–2100, Sep 2015.
- [70] L. Shen, A. Mukherjee, S. Li, X. Tang, N. Lu, and N. Sun. A 0.6-V Tail-Less Inverter Stacking Amplifier With 0.96 PEF. In 2019 Symposium on VLSI Circuits, pages C144–C145, June 2019.
- [71] R. R. Harrison and C. Charles. A Low-Power Low-Noise CMOS Amplifier for Neural Recording Applications. *IEEE Journal of Solid-State Circuits*, 38(6):958–965, June 2003.
- [72] C. M. Lopez, S. Mitra, J. Putzeys, B. Raducanu, M. Ballini, A. Andrei, S. Severi, M. Welkenhuysen, C. Van Hoof, S. Musa, and R. F. Yazicioglu. 22.7 A 966-Electrode Neural Probe With 384 Configurable Channels in 0.13μm SOI CMOS. In 2016 IEEE International Solid-State Circuits Conference (ISSCC), pages 392–393, Jan 2016.

- [73] W. Wattanapanitch, M. Fee, and R. Sarpeshkar. An Energy-Efficient Micropower Neural Recording Amplifier. *IEEE Transactions on Biomedical Circuits and Systems*, 1(2):136–147, June 2007.

### Vita

Linxiao Shen received the B.S. degree from Fudan University, Shanghai, China, in 2014. He is currently working towards his Ph.D. degree in Electrical and Computer Engineering at The University of Texas at Austin. His doctoral work involves the design of energy-efficient sensor readout circuits, mainly for biomedical applications. He was an intern at Silicon Laboratories Inc. in the summer of 2018, working on low-power RC oscillator design.

Mr. Shen was a recipient of the IEEE Solid-State Circuits Society Predoctoral Achievement Award in 2019, the Graduate Continuing Fellowship from UT Austin in 2019, the Samsung Fellowship in 2011, and the National Scholarship in 2012.

Permanent address: 2501 Speedway, Austin, Texas 78712

This dissertation was typeset with LaTeX<sup>†</sup> by the author.

<sup>†</sup>LaTeX is a document preparation system developed by Leslie Lamport as a special version of Donald Knuth's TeX Program.