Browsing by Subject "Circuit design"
Now showing 1 - 7 of 7
Item Compute-in-memory designs for deep neural network and combinatorial optimization problems accelerators (2023-04-23)
Xie, Shanshan, Ph. D.; Kulkarni, Jaydeep P.; Pan, David Z.; Orshansky, Michael; Jia, Yaoyao; Hamzaoglu, Fatih
The unprecedented growth in Deep Neural Network (DNN) model size has resulted in a massive amount of data movement from off-chip memory to on-chip processing cores in modern Machine Learning (ML) accelerators. Compute-In-Memory (CIM) designs, which perform analog DNN computations within a memory array along with peripheral data converter circuits, are being explored to mitigate this ‘Memory Wall’ bottleneck of latency and energy overheads. Embedded non-volatile magnetic [Wei et al. [2019]; Chih et al. [2020]; Dong et al. [2018]; Shih et al. [2019]] and resistive [Jain et al. [2019]; Chou et al. [2020]; Chang et al. [2014]; Lee et al. [2017]] memories, as well as standalone Flash memories, suffer from lower write speeds and poor write endurance and can’t be used for programmable accelerators requiring fast and frequent model updates. Similarly, cost-sensitive commodity DRAM (Dynamic Random Access Memory) can’t be leveraged for high-speed, custom CIM designs due to limited metal layers and dense floorplan constraints, which often lead to compute-near-memory designs that limit its throughput benefits [Aga et al. [2019]]. Among the prevalent semiconductor memories, eDRAM (embedded DRAM), which integrates the DRAM bitcell monolithically along with high-performance logic transistors and interconnects, can enable custom CIM designs by offering the densest embedded bitcell, low pJ/bit access energy, high endurance, high performance, and high bandwidth, all desired attributes for ML accelerators [Fredeman et al. [2015]; Berry et al. [2020]]. Yet, eDRAM has been used only in niche applications due to its high cost/bit, low retention time, and high noise sensitivity.
On the DNN algorithms front, the landscape is rapidly changing with the adoption of 8-bit integer arithmetic for both DNN inference and training algorithms [Jouppi et al. [2017]; Yang et al. [2020]]. These reduced bit-width computations are well suited to CIM designs, which have shown promising results for integer arithmetic [Biswas and Chandrakasan [2018]; Gonugondla et al. [2018a]; Zhang et al. [2017]; Si et al. [2019]; Yang et al. [2019]; Khwa et al. [2018]; Chen et al. [2019]; Dong et al. [2020]; Valavi et al. [2019]; Dong et al. [2017]; Jiang et al. [2019]; Yin et al. [2020]]. Thus, the high cost/bit of eDRAM can now be amortized by repurposing existing eDRAM in high-end processors to enable CIM circuits. Despite the potential of eDRAM technology and the progress in DNN integer arithmetic, no hardware demonstration of an eDRAM-based CIM design has been reported so far. Therefore, in this dissertation, the first project explores the compute-in-memory concept with dense 1T1C eDRAM bitcells used as charge-domain circuits for convolutional neural network (CNN) multiply-accumulation-averaging (MAV) computation. This method minimizes area overhead by leveraging existing 1T1C eDRAM columns to construct an adaptive data converter, dot-product, averaging, pooling, and ReLU activation on the memory array. The second project presents a leakage and read bitline (RBL) swing-aware CIM design that leverages a promising high-density gain-cell embedded DRAM bitcell and the intrinsic RBL capacitors to perform CIM computations within the limited RBL swing available in a 2T1C eDRAM. The CIM D/A converters (DAC) are realized intrinsically with variable RBL precharge voltage levels. A/D converters (ADC) are realized using Schmitt Triggers (ST) as compact and reconfigurable Flash comparators.
Similar to machine learning applications, combinatorial optimization problems (COPs) also require data-intensive computations, which makes them naturally suitable for the compute-in-memory concept as well. COPs arise in many real-world social and industrial data-intensive computing applications. Examples include optimization of mRNA sequences for COVID-19 vaccines [Leppek et al. [2021]; Pardi et al. [2018]], semiconductor supply chains [Crama [1997]; Kempf [2004]], and financial index tracking [Benidis et al. [2018]], to name a few. Such COPs are predominantly NP-hard [Yuqi Su and Kim [2020]], and performing an exhaustive brute-force search becomes untenable as the COP size increases. An efficient way to solve COPs is to let nature perform the exhaustive search in the physical world using the Ising model, which can map many types of COPs [Lucas [2014]]. The Ising model describes spin dynamics in a ferromagnet [Peierls [1936]], wherein spins naturally orient to achieve the lowest ensemble energy state of the Ising model, representing the optimal COP solution [Yoshimura et al. [2015]]. Therefore, in order to accelerate the COP computations, the third project focuses on implementing analog compute-in-memory techniques for Ising computation to eliminate unnecessary data movement and to reduce energy costs. The COPs can be mapped into a generic Ising model framework, and the computations are performed directly on the bitlines. Spin updates are performed locally using the existing sense amplifier in the peripheral circuits and the write-after-read mechanism in the memory array controller. Beyond that, the fourth project explores CIM designs for solving Boolean Satisfiability (SAT) problems, which are non-deterministic polynomial time (NP)-complete problems with many practical and industrial data-intensive applications.
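As a software illustration of the Ising formulation described in this paragraph, the following minimal sketch evaluates the Ising energy and lowers it with local spin updates. It is not the dissertation's analog bitline circuit: the function names, the symmetric coupling matrix `J`, the field vector `h`, and the greedy update rule are all assumptions made for illustration, and a greedy descent like this can stop in a local minimum on harder instances.

```python
def ising_energy(J, h, spins):
    """Ising energy H = -sum_{i<j} J[i][j]*s_i*s_j - sum_i h[i]*s_i, with s_i in {-1, +1}."""
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field


def local_field(J, h, spins, i):
    """Effective field on spin i produced by its neighbors and the external field."""
    return h[i] + sum(J[i][j] * spins[j] for j in range(len(spins)) if j != i)


def greedy_descent(J, h, spins, sweeps=10):
    """Align each spin with its local field; for symmetric J this never raises the energy."""
    spins = list(spins)
    for _ in range(sweeps):
        changed = False
        for i in range(len(spins)):
            f = local_field(J, h, spins, i)
            new = 1 if f > 0 else -1 if f < 0 else spins[i]
            if new != spins[i]:
                spins[i] = new
                changed = True
        if not changed:  # no spin flipped in a full sweep: a (local) minimum
            break
    return spins


# Hypothetical example: max-cut on a 4-node ring, encoded with
# antiferromagnetic couplings (J = -1 on each edge) and no external field.
J = [[0, -1, 0, -1],
     [-1, 0, -1, 0],
     [0, -1, 0, -1],
     [-1, 0, -1, 0]]
h = [0, 0, 0, 0]
start = [1, 1, 1, 1]
final = greedy_descent(J, h, start)
print(ising_energy(J, h, start), final, ising_energy(J, h, final))
```

In this small example the final spin pattern alternates around the ring, which corresponds to cutting every edge.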
An all-digital SAT solver, called Snap-SAT, is presented to accelerate the iterative computations using the static random-access memory (SRAM) array to reduce the frequent memory access and minimize the hardware implementation cost. This design demonstrates a promising, fast, reliable, reconfigurable, and scalable compute-in-memory design for solving and accelerating large-scale hard SAT problems, suggesting its potential for solving time-critical SAT problems in real-life applications (e.g., defense and vaccine development).

Item Coupled passive resonant circuits as battery-free wireless sensors (2010-05)
Pasupathy, Praveenkumar; Neikirk, Dean P., 1957-; Wood, Sharon L.; Arapostathis, Aristotle; Dodabalapur, Ananth; Hassibi, Arjang
Detection and monitoring of the damage created by the corrosion of the steel reinforcement in concrete structures is a challenging and multidisciplinary problem. An economical monitoring strategy that is long-term and nondestructive requires low-cost, battery-free, wireless sensors. Our Electronic Structural Surveillance (ESS) platform uses a battery-free passive resonant circuit (tag) as a sensor. The tag is magnetically coupled to an external reader coil. It is interrogated/read remotely in a non-contact (wireless) manner, and the state of the sensor is determined from a swept-frequency impedance measurement. When paired with the correct sensing element (transducer), the tag can be used for a variety of sensing applications, for example, chemical and biochemical sensing. A circuit model of the reader and tag for such a universal battery-free wireless sensor platform is developed. The interaction between design and detection limit is examined. The dependence of the measured signal strength and read range on the various reader and tag circuit parameters is analyzed.
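To make the swept-frequency readout concrete, here is a minimal sketch based on the textbook model of two inductively coupled loops, where the tag adds a reflected impedance (ωM)²/Z_tag to the reader coil. The thesis's own circuit model may differ: the series-RLC tag, the function names, and all component values below are assumptions chosen for illustration.

```python
import math


def tag_resonant_frequency(L2, C2):
    """Resonant frequency (Hz) of a series LC tag."""
    return 1.0 / (2 * math.pi * math.sqrt(L2 * C2))


def reader_input_impedance(f, R1, L1, R2, L2, C2, k):
    """Impedance seen at the reader coil terminals at frequency f (Hz).

    R1, L1: reader coil resistance and inductance
    R2, L2, C2: tag loop resistance, inductance, and capacitance (series RLC)
    k: coupling coefficient between the coils, 0 <= k <= 1
    """
    w = 2 * math.pi * f
    M = k * math.sqrt(L1 * L2)                      # mutual inductance
    Z1 = complex(R1, w * L1)                        # reader coil alone
    Z2 = complex(R2, w * L2 - 1.0 / (w * C2))       # tag loop impedance
    return Z1 + (w * M) ** 2 / Z2                   # reader plus reflected tag


# Hypothetical component values, for illustration only.
R1, L1 = 0.5, 2e-6
R2, L2, C2 = 1.0, 1e-6, 1e-9
k = 0.1

f0 = tag_resonant_frequency(L2, C2)
freqs = [f0 * (0.5 + i / 2000) for i in range(2001)]   # sweep 0.5*f0 .. 1.5*f0
peak = max(freqs, key=lambda f: reader_input_impedance(f, R1, L1, R2, L2, C2, k).real)
print(f0, peak)
```

In this model the real part of the measured impedance peaks close to the tag's resonant frequency, so a transducer that shifts `C2` or `R2` moves or reshapes that peak, and a smaller `k` (longer read range) shrinks it.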
Since the circuit element values of the coils depend on their geometries, the effect of specific coil geometry is evaluated and design recommendations are made.

Item Design of circuits for sub-threshold voltages : implementation of adders (2016-05)
Giliyar Shanthir, Ankith; Swartzlander, Earl E., Jr., 1945-; Touba, Nur A.
The demand for low-power circuits is ever increasing, particularly due to the added overhead of designing efficient cooling systems or more sophisticated and expensive packaging techniques. In most emerging applications that demand low power consumption, such as biomedical implants, wearable devices, micro-sensor nodes, and countless others, the emphasis on energy efficiency far supersedes the traditional focus on improving speed. Such energy-constrained systems can be operated at considerably reduced performance levels in order to save power and extend their battery lifetimes. Sub-threshold design has proven useful for ultra-low-power and low-energy applications since the dynamic power is reduced quadratically with supply voltage; the least-energy operation usually takes place in the sub-threshold region. This work provides a comprehensive analysis of CMOS standard cell characterization in the sub-threshold region, layout, logical library extraction, optimization, and top-level implementation of two parallel prefix adders of different word sizes in 45 nm technology, with a comparison between the sub-threshold and strong-inversion regions of operation. The analysis is done on PPA: power (energy), performance, and area, the common metrics for any chip design. The switching activities of the circuits were captured using dynamic gate-level simulation to perform the time-based peak power analysis. Static timing analysis was performed to estimate the delay of the critical path for each circuit.
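The quadratic dependence mentioned above follows from the standard first-order model of CMOS switching power. As a brief worked illustration (the symbols and the two example supply voltages are assumptions chosen for illustration, not values taken from the report):

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f ,
\qquad
E_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \quad \text{(per clock cycle)}

\frac{E_{\mathrm{dyn}}(V_{DD} = 0.3\,\mathrm{V})}{E_{\mathrm{dyn}}(V_{DD} = 1.0\,\mathrm{V})}
  = \left( \frac{0.3}{1.0} \right)^{2} = 0.09
```

Here α is the switching activity factor, C_L the switched capacitance, and f the clock frequency. This expression covers switching energy only and does not account for leakage.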
The analysis and results presented in this report will be helpful in choosing a specific adder configuration for an integrated circuit based on the constraints related to its application.

Item Device modeling and circuit design for ZTO based amorphous metal oxide TFTs (2011-05)
Joshi, Tanvi Dhananjay; Viswanathan, T. R., doctor of electrical engineering; Dodabalapur, Ananth, 1963-
Amorphous oxide semiconductors have attracted great interest in the display industry owing to their high carrier mobilities and low fabrication costs. In this thesis, n-channel solution-based zinc-tin oxide (ZTO) thin-film transistors (TFTs) are studied from a circuit design perspective. The study follows an iterative process of circuit design, layout, and testing of the fabricated devices in the lab. The device models used in circuit simulations are refined using the data fed back from each of these iterations, which has enabled more accurate design of complex circuits using ZTO devices. The need for, and the development of, a physical compact model for performing accurate and predictive circuit simulations are presented. The use of ZTO devices in low-cost, transparent, and flexible electronic applications has been investigated through the study of basic circuit blocks such as amplifiers, ring oscillators, inverters, and a four-stage operational amplifier.

Item Dynamic power reduction using data gating (2006-05)
Kumar, Amit, 1978-; Ambler, Tony
There has been a constant need for low-power techniques to achieve high performance at the lowest possible power dissipation. Much work has been done toward this target, focusing on different aspects of power reduction. One of these aspects is dynamic power reduction. This thesis focuses on this aspect by reducing unnecessary transitions in the circuit.
To achieve this, a new method called data gating is proposed, which stops unnecessary toggling in the circuit using different forms of gating mechanisms. This thesis is organized as follows. The first chapter covers the low-power design of CMOS circuits, including the sources of power dissipation in ICs as well as the techniques that have been used to minimize power consumption. The second chapter discusses dynamic power consumption in more detail, reviews techniques for reducing it through reduced switching activity, and proposes the new technique, data gating. The third chapter describes the simulation setup and the tools used for simulation, and presents the results obtained from the different simulations. The fourth chapter analyzes the simulation results and outlines some possible limitations of the proposed method, as well as certain points that need to be considered before applying the new technique. The fifth and final chapter summarizes the conclusions and possible future work to enhance the proposed technique.

Item Physical design automation of structured high-performance integrated circuits (2013-12)
Ward, Samuel Isaac; Pan, David Z.
During the last forty years, advancements have pushed state-of-the-art placers to impressive performance, placing modern multimillion-gate designs in under an hour. Wide industry adoption of the analytical framework indicates the quality of these approaches. However, modern designs present significant challenges in addressing the multi-objective requirements of multi-GHz designs. As devices continue to scale, wires become more resistive and power constraints significantly dampen performance gains, so continued improvement in placement quality is necessary.
Additionally, placement has become more challenging with the integration of multi-objective constraints such as routability, timing, and reliability. These constraints intensify the challenge of producing quality placement solutions and must be handled carefully. Exacerbating the issue, shrinking schedules and budgets are requiring increased automation by blurring the boundary between manual and automated placement. An example of this new hybrid design style is the integration of structured placement constraints within traditional ASIC-style circuit structures. Structure-aware placement is a significant challenge for modern high-performance physical design flows. The goal of this dissertation is to develop enhancements to state-of-the-art placement flows that overcome their inadequacies for structured circuits. A key observation is that specific structures exist where modern analytical placement frameworks significantly underperform. Accurately measuring the suboptimality of a particular placement solution, however, is very challenging. As such, this work begins by designing a series of structured placement benchmarks. Generating placements for the benchmarks manually offers the opportunity to accurately quantify placer performance. Then, the latest generation of academic placers is compared to evaluate how the placers perform for these design styles. Results of this work lead to discoveries in three key aspects of modern physical design flows. Datapath placement is the first aspect to be examined. This work narrows the focus to specifically target datapath-style circuits that contain high-fanout nets. As the datapath benchmarks showed, these high-fanout nets misdirect analytical placement flows. To effectively handle these circuit styles, this work proposes a new unified placement flow that simultaneously places random-logic and datapath cells.
The flow is built on top of a leading academic force-directed placer and significantly improves the quality of datapath placement while leveraging the speed and flexibility of existing algorithms. Effectively placing these circuits is not enough, because in modern high-performance designs, datapath circuits are often embedded within a larger ASIC-style circuit and thus are not identified in advance. As such, the next aspect of structured placement applies novel data learning techniques to train, predict, and evaluate potential structured circuits. Extracted circuits are mapped to groups that are aligned and simultaneously placed with random logic. The third aspect that can be enhanced with improved structured placement is local clock tree synthesis. Performance and power requirements for multi-GHz microprocessors necessitate the use of a grid-based clock network methodology, wherein a global clock grid is overlaid on the entire die area, followed by local buffered clock trees. This clock mesh methodology is driven by three key reasons. First, full trees do not offer enough performance for modern microprocessors. Second, clock trees offer significant power savings over full clock meshes. Third, local clock trees reduce the local clock wiring demands compared to full meshes at lower-level metal layers. To meet these demands, a shift in latch placement methodology is proposed by using structured placement templates. Placement configurations with significantly lower capacitance are identified a priori, and the solutions are developed into placement templates.
Careful experiments demonstrate the effectiveness of these approaches and their potential impact on modern high-speed designs.

Item Power supply noise management : techniques for estimation, detection, and reduction (2010-12)
Wu, Tung-Yeh; Abraham, Jacob A.; Gerosa, Gianfranco; Orshansky, Michael E.; Pan, David Z.; Yu, Shu-Yi
Power supply noise has become a critical issue for low-power and high-performance circuit design in recent years. The rapid scaling of the CMOS process has pushed the limit further and further in building low-cost and increasingly complex digital VLSI systems. Continued technology scaling has contributed to significant improvements in performance, increases in transistor density, and reductions in power consumption. However, smaller feature sizes, higher operating frequencies, and supply voltage reduction make current and future VLSI systems more vulnerable to power supply noise. Therefore, there is a strong demand for strategies to prevent problems caused by power supply noise. Challenges in reducing power supply noise exist in different design phases. In terms of physical design, careful power distribution design is required, since it directly determines power stability and timing integrity. In addition, power management, such as the switching mode of the power gating technique, is another major challenge during the circuit design phase. A bad power gating switching strategy may draw an excessive rush current and slow down other active circuitry. After the circuit is implemented, another critical design challenge is to estimate power supply noise. Designers need to be aware of the voltage drop in order to enhance the power distribution network without wasting design resources. However, estimating power supply noise is usually difficult, especially finding the circuit activity which induces the maximum supply noise. A blind search may be very time-consuming and ineffective.
At post-silicon test, detecting power supply noise within a chip is also challenging. The visibility of supply noise is low since there is no trivial method to measure it. However, the supply noise measurement result on silicon is critical for debugging and characterizing the chip. This dissertation focuses on novel circuit designs and design methodologies to prevent problems resulting from power supply noise in different design phases. First, a supply noise estimation methodology is developed. This methodology systematically searches for the circuit activity inducing the maximum voltage drop. Once the circuit activity is found, it is validated through instruction execution. Therefore, the estimated voltage drop is a realistic estimate close to the real phenomenon. Simulation results show that this technique is able to find the circuit activity more efficiently and effectively than random simulation. Second, two on-chip power supply noise detectors are designed to improve the visibility of voltage drop during the test phase. The first detector facilitates insertion of numerous detectors when there is a need for additional test points, such as in a fine-grained power gating design or a circuit with multiple power domains. It focuses on minimizing the area consumption of the existing detector. This detector significantly reduces the area consumption compared to the conventional approach without losing accuracy due to the area minimization. The major goal of designing the second on-chip detector is to achieve self-calibration under process and temperature variations. Simulation and silicon measurement results demonstrate the capability of self-calibration regardless of these variations. Lastly, a robust power gating reactivation technique is designed. This reactivation scheme utilizes the on-chip detector presented in this dissertation to monitor power supply noise in real time.
It takes a dynamic approach to control the wakeup sequence according to the ambient voltage level. Simulation results demonstrate the ability to prevent the excessive voltage drop while the ambient active circuitry induces a high voltage drop during the wakeup phase. As a result, the fixed design resource, which is used to prevent the voltage emergency, can potentially be reduced by utilizing the dynamic reactivation scheme.