Nanometer VLSI design-manufacturing interface for large scale integration
As nanometer Very Large Scale Integration (VLSI) demands higher transistor density to fit multiple cores and memory blocks within a limited die size, much research has been devoted to sustaining Moore's Law in two different ways: 2D geometric shrinking and 3D vertical wafer stacking. For geometric shrinking, nanoscale patterning with 193nm lithography equipment is one of the most fundamental challenges beyond 22nm, while next-generation lithography, such as Extreme Ultra-Violet (EUV) lithography, still faces tremendous challenges for volume production in the near future. As a practical solution, Double Patterning Lithography (DPL) has become a leading candidate for the sub-20nm lithography process. Another approach to multi-core integration is 3D wafer stacking with Through Silicon Vias (TSVs). Computer-Aided Design (CAD) approaches to enable robust DPL and TSV technology are the main focus of this dissertation. DPL poses new challenges for overlay and layout decomposition. Therefore, overlay-induced variation modeling and efficient decomposition for better manufacturability are in great demand. Since the variation of metal spacing caused by overlay results in coupling capacitance variation, we first model metal spacing variation for each individual overlay source. Then, all overlay sources are considered together to determine the worst-case timing under coupling capacitance variation. A non-parallel pattern caused by overlay is converted to a parallel one with an equivalent spacing that yields the same delay, so that a traditional RC extraction flow can be applied. Our experiments show that the delay variation due to overlay in DPL can be up to 9.1%, and that a well-decomposed layout can reduce the variability. For DPL layout decomposition, we propose a multi-objective and flexible framework that simultaneously addresses stitch minimization, balanced density, and overlay compensation. We use a graph-theoretic algorithm for minimum stitch insertion and balanced density.
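The conversion of a non-parallel pattern into an equivalent parallel spacing, mentioned above, can be illustrated with a minimal sketch. It is not the dissertation's model: it assumes a simple parallel-plate approximation in which coupling capacitance per unit length is inversely proportional to spacing, assumes the spacing varies linearly along the wire, and matches total coupling capacitance rather than delay. The function names and the example dimensions are hypothetical.

```python
import math


def equivalent_spacing(s_start: float, s_end: float) -> float:
    """Uniform spacing with the same total coupling capacitance as two
    wires whose spacing varies linearly from s_start to s_end.

    Assumption (not from the source): capacitance per unit length is
    proportional to 1/spacing, so the total is proportional to the
    integral of 1/s(x) along the wire. Solving for the uniform spacing
    gives the logarithmic mean of the two end spacings.
    """
    if s_start <= 0 or s_end <= 0:
        raise ValueError("spacings must be positive")
    if math.isclose(s_start, s_end):
        return s_start  # already parallel; avoids 0/0 below
    return (s_end - s_start) / math.log(s_end / s_start)


def coupling_cap_ratio(s_nominal: float, s_start: float, s_end: float) -> float:
    """Coupling capacitance of the tilted pattern relative to the
    nominal parallel pattern, under the same 1/spacing assumption."""
    return s_nominal / equivalent_spacing(s_start, s_end)


if __name__ == "__main__":
    # Hypothetical case: nominal 50 nm spacing, rotated by overlay so the
    # spacing runs from 40 nm at one end to 60 nm at the other.
    s_eq = equivalent_spacing(40.0, 60.0)
    print(f"equivalent spacing: {s_eq:.2f} nm")
    print(f"capacitance ratio:  {coupling_cap_ratio(50.0, 40.0, 60.0):.4f}")
```

Under these assumptions the equivalent spacing is always smaller than the arithmetic mean of the two end spacings, so a rotated wire pair couples slightly more strongly than its average spacing alone would suggest.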
Additional decomposition constraints for overlay compensation are obtained by Integer Linear Programming (ILP), and these additional constraints also enable robust contact decomposition. With these constraints, global decomposition is performed using a modified Fiduccia-Mattheyses (FM) graph partitioning algorithm. Experimental results show that the proposed framework is highly scalable and fast: we can decompose all 15 benchmark circuits in five minutes in a density-balanced fashion, while an ILP-based approach can finish only the smallest five circuits. In addition, we can remove more than 95% of the timing variation induced by overlay for the tested structures. Three-dimensional integration introduces new manufacturing and design challenges, such as device variation due to TSV-induced stress and timing corner mismatch between different stacked dies. Since the TSV fill material and silicon have different Coefficients of Thermal Expansion (CTE), a TSV deforms the surrounding silicon because of the difference between chip manufacturing and operating temperatures. Therefore, the systematic variation due to TSV-induced stress should be considered for robust 3D IC design. We propose systematic TSV-stress-aware timing analysis and show how to optimize the layout for better performance. First, a stress contour map is generated with an analytical radial stress model. Then, the tensile stress is converted to hole and electron mobility variations depending on the geometric relations between TSVs and transistors. A mobility-variation-aware cell library and netlist are generated and incorporated into an industrial timing engine for 3D-IC timing analysis. TSV-stress-induced timing variations can be as much as 10% for an individual cell. As an application to layout optimization, we can exploit the stress-induced mobility enhancement to improve timing on critical cells. We show that stress-aware perturbation can reduce cell delay by up to 14.0% and critical path delay by 6.5% in our test case.
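The two steps described above, a radial stress model followed by a geometry-dependent conversion to mobility variation, can be sketched as follows. This is an illustrative approximation under stated assumptions, not the dissertation's implementation: it assumes a single TSV, a radial stress that decays as the inverse square of distance with an equal and opposite circumferential component, a transistor channel aligned with the x axis, and a linear piezoresistive relation with shear ignored. All function names, parameters, and coefficient values are hypothetical.

```python
import math


def radial_stress(r: float, tsv_radius: float, sigma_edge: float) -> float:
    """Radial stress at distance r from the TSV center.

    Assumption: stress falls off as (tsv_radius / r)^2 outside the TSV,
    with sigma_edge the (tensile, positive) stress at the TSV edge.
    """
    if r < tsv_radius:
        raise ValueError("point lies inside the TSV")
    return sigma_edge * (tsv_radius / r) ** 2


def mobility_shift(tx: float, ty: float, tsv_x: float, tsv_y: float,
                   tsv_radius: float, sigma_edge: float,
                   pi_long: float, pi_trans: float) -> float:
    """Fractional mobility change for a transistor at (tx, ty).

    Assumptions: the channel runs along x; circumferential stress equals
    minus the radial stress; shear is ignored; pi_long and pi_trans are
    hypothetical piezoresistive coefficients (1/stress units) for stress
    along and across the channel.
    """
    dx, dy = tx - tsv_x, ty - tsv_y
    r = math.hypot(dx, dy)
    theta = math.atan2(dy, dx)  # angle between channel and TSV direction
    sigma_rr = radial_stress(r, tsv_radius, sigma_edge)
    # Rotate the polar stress (sigma_rr, -sigma_rr) into channel axes.
    sigma_xx = sigma_rr * math.cos(2.0 * theta)
    sigma_yy = -sigma_xx
    # Mobility moves opposite to resistivity in the linear model.
    return -(pi_long * sigma_xx + pi_trans * sigma_yy)


if __name__ == "__main__":
    # Placeholder numbers chosen only to show the sign flip with angle.
    for pos in [(2.0, 0.0), (0.0, 2.0)]:
        shift = mobility_shift(*pos, 0.0, 0.0, 1.0, 100.0, 0.01, 0.0)
        print(pos, f"{shift:+.3f}")
```

In this sketch the same TSV speeds up or slows down a device depending on where the device sits relative to its channel direction, which is the kind of geometric dependence that a stress-aware placement perturbation could exploit.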
Three-dimensional Clock Tree Synthesis (3D CTS) is one of the main design difficulties in 3D integration because the clock network spans all tiers. In 3D CTS, timing corner mismatch between tiers arises because each tier is manufactured in an independent process. Therefore, inter-die variation should be considered when analyzing and optimizing paths that span several tiers in 3D CTS. In addition, mobility variation of a clock buffer due to stress from a TSV can cause unexpected skew, which degrades overall chip performance. We therefore propose clock period optimization that considers both timing corner mismatch and TSV-induced stress. In our experiments, we show that our clock buffer tier assignment reduces clock period variation by up to 34.2%, and that most of the stress-induced skew can be removed by our stress-aware CTS. Overall, we show that the performance gain can be up to 5.7% with the proposed CTS algorithm. As technology scaling continues toward 14nm and 3D integration, this dissertation addresses several key issues in the design-manufacturing interface and proposes unified analysis and optimization techniques for effective design and manufacturing integration.