Low power circuit design techniques for edge computing

In the era of the Internet of Things (IoT), concerns about latency, bandwidth, and privacy are pushing inference from the cloud to the edge, creating demand for energy-efficient edge computing devices. These devices have become critical building blocks of modern electronic systems, supporting applications such as neural network inference, mobile healthcare monitoring, and human-machine interfaces. To improve the energy efficiency of edge devices, this dissertation pursues three directions: 1) a ternary neural network accelerator that achieves higher energy efficiency than state-of-the-art binary neural networks (BNNs); 2) a 4-bit neural network accelerator with one-shot ADC conversion for the entire MAC array; and 3) a long-term, real-time muscle fatigue detection device with ultrathin, ultrasoft, and long-term stable dry epidermal electrodes.

In the first part, we propose a mixed-signal ternary CNN-based processor with higher energy efficiency than a BNN. It offers several key improvements: 1) the proposed ternary network provides 1.5-b resolution (0/+1/-1), yielding a 3.9x reduction in operations (OPs) per inference compared with a BNN at the same MNIST accuracy; 2) the 1.5-b multiply-and-accumulate (MAC) is implemented with a VCM-based capacitor switching scheme, which inherently benefits from the reduced signal swing on the capacitive DAC (CDAC); 3) the VCM-based MAC introduces sparsity during training, resulting in a lower switching rate. With a complete neural network on chip, the proposed design achieves 97.1% MNIST accuracy with only 0.18 μJ per classification, presenting the highest power efficiency for comparable MNIST accuracy.

The second part of this dissertation focuses on a 4-bit MAC macro. This work proposes a mixed-signal MAC macro that requires only one ADC operation for the entire 512 4b×4b MAC array. This is achieved by mapping 9 partial products onto 5 wires according to their relative weights, dynamically buffering the 5 wire voltages, and sampling them on properly sized SAR ADC capacitors. As a result, all MAC operations are completed in the charge domain by the end of the ADC sampling phase, so that only one A/D conversion is needed per multi-bit MAC. To further increase power efficiency, window-based comparison skipping and ReLU are embedded inside the SAR ADC, so that unnecessary comparison cycles are skipped for small or negative MAC outputs. Overall, despite using a 65 nm process, the prototype chip achieves an energy efficiency of 164 TOPS/W for a 4-b MAC.

Finally, this dissertation presents a long-term, real-time muscle fatigue monitoring system consisting of 1) a hair-thin, skin-soft, and mechanically robust e-tattoo electrode that is less susceptible to motion artifacts and capable of multi-day monitoring, and 2) a battery-powered, edge computing flexible printed circuit (FPC) that extracts the instantaneous median frequency (IMDF) of surface electromyography (sEMG) bursts and wirelessly streams it to a mobile application. The system draws an average current of 33 mA, supporting 25 hours of continuous operation, which could be extended to multiple days if the system is activated only intermittently.
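The mapping of 9 partial products onto 5 wires described for the 4-bit MAC macro can be illustrated with a small digital sketch. It rests on an assumption the abstract does not state: that each 4-bit operand is split into a sign and a 3-bit magnitude, so that a magnitude multiply yields 3 × 3 = 9 one-bit partial products whose binary weights 2^(i+j) take only 5 distinct values (1, 2, 4, 8, 16). The names `MAG_BITS`, `partial_products_by_weight`, and `combine` are hypothetical, and the sketch models a single multiply in software rather than the charge-domain accumulation performed on chip.

```python
from itertools import product

# Assumption (not stated in the abstract): 4-bit sign-magnitude operands,
# so only the 3 magnitude bits take part in the partial-product array.
MAG_BITS = 3


def partial_products_by_weight(a, b, bits=MAG_BITS):
    """Group the one-bit partial products of a * b by binary weight.

    Entry k of the returned list counts the partial products a_i & b_j
    with i + j == k, i.e. those that share the weight 2**k and could
    therefore share one "wire".
    """
    wires = [0] * (2 * bits - 1)  # 5 weight classes for 3-bit magnitudes
    for i, j in product(range(bits), repeat=2):  # 9 (i, j) pairs
        a_i = (a >> i) & 1
        b_j = (b >> j) & 1
        wires[i + j] += a_i & b_j
    return wires


def combine(wires):
    """Weighted sum of the wire values, with wire k weighted by 2**k."""
    return sum(count << k for k, count in enumerate(wires))


if __name__ == "__main__":
    wires = partial_products_by_weight(5, 6)
    print(wires, combine(wires))
```

In this model the binary weighting applied by `combine` plays the role that the sized sampling capacitors play in the description above: once the five wire values are weighted and summed, the full product is available in a single step.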

