Browsing by Subject "Overlay"
Item Automatically specialized FPGA overlays : a trade-off between programmability, performance, and area (2024-05) Ma, Rui, Ph. D.; Chiou, Derek; Pan, Zhigang; Nurvitadhi, Eriko; Erez, Mattan; Gerstlauer, Andreas
Hardware acceleration is a promising approach to improve computing performance. FPGAs can implement accelerators orders of magnitude more performant than software solutions on standard programmable hardware. However, FPGAs still suffer from long development and deployment time. High-level synthesis [87, 35, 73, 16] has been proposed to improve FPGA productivity but generally requires users to have a deep understanding of the tools and target hardware to produce efficient designs. FPGA overlays map processor architectures on top of FPGAs, allowing applications to change without reconfiguring FPGAs and enabling fast development and deployment on FPGAs by providing software programmability. Though less efficient than an identical processor implemented in hard logic on the same technology node, the overlay itself can be specialized over time to further improve its performance. Though there has been prior work on building or generating domain-specialized overlays, none of it provides an end-to-end solution to automatically generate specialized overlays without non-trivial user intervention. This thesis proposes a framework to automatically generate overlays based on user constraints, exploring the trade-off between programmability, performance, and area. It first demonstrates a microarchitecture template for generating specialized overlays. Then it presents toolchains to automatically configure the microarchitecture template and generate specialized function units, based on static and profile-guided analysis of the workloads.
The results show that the overlays can be quickly generated and that their performance is better than the best CPU solution and comparable to the specialized FPGA solution.

Item Dual field nano precision overlay (2010-08) Yin, Bailey Anderson; Sreenivasan, S. V.; Rodin, Gregory J.
Currently, imprint lithography steppers are designed to pattern only one field of 26 x 33 mm at a time. This choice is based on the desire to mix-and-match with the standard optical lithography tools, whose field size is also 26 x 33 mm. Throughput can be increased if more than one field can be imprinted simultaneously. The problem with adding a field to the imprinting template is that each field has overlay errors associated with it that are created when the template is manufactured and when the corresponding prior field is manufactured on the wafer. The current process is able to correct these template and wafer overlay errors using a precision stage and actuators that elastically deform the template. The same method cannot be used when there are two fields because the fields are not independent and interact with each other. Correcting the errors in one of the fields tends to increase the error in the second field. In this thesis, a new control method has been created to account for the dependent motion. A new template concept was also created to try to limit the interaction between the two fields. The new control algorithm was tested in simulation to see if it could correct the current 1-field setup as well as the new concept of having more than one field on a template. The control algorithm was also used to test applications where the overlay errors in only one direction need to be corrected. The control algorithm was tested on a solid single field template, the baseline case, and was able to achieve 1.3 nm overlay, which is consistent with the current method. The algorithm was then tested on the dual field concepts.
The range of alignment errors needed to get 5 nm overlay is too tight for current manufacturing, but the compliant concept did have more relaxed ranges than the solid dual field template. With more research, the compliant template concept might be changed to allow for wider ranges. The tests with correction in only one direction had promising data that should be investigated further.

Item Evaluation and extension of threaded control for high-mix semiconductor manufacturing (2010-12) Patwardhan, Ninad Narendra; Flake, Robert H.; Edgar, Thomas F.
In recent years, threaded run-to-run (RtR) control algorithms have shown drawbacks under certain circumstances, notably when applied to a high mix of products, as in Application Specific Integrated Circuit (ASIC) foundries. The variations in the process are a function of the product being manufactured as well as the tool being used. The presence of semiconductor layers increases the number of times the lithography process must be repeated. Successive layers having different patterns must be exposed using different reticles/masks in order to maximize tool utilization. The objectives of this research are to develop a set of methodologies for the evaluation and extension of threaded control applied to overlay. This project defines methods to quantify the efficacy of threaded control, identifies the drawbacks of threaded control in high-mix semiconductor production, and suggests extensions and alternatives to improve threaded control. To evaluate the performance of threaded control, extensive simulations were performed in MATLAB. The effects of noise, disturbances, sampling, and delays on the control and estimation performance of the threaded controller were studied through these simulations. Based on the results obtained, several ideas to extend threaded control, by reducing the overall number of threads and by improving thread definitions and combination, have been introduced.
A unique idea of sampling the measurements dynamically based on the estimation accuracy is also presented. Future work includes applying the extensions to threaded control suggested in this work to real production data and comparing the results with those obtained without these methods. Future work also includes building new alternatives to threaded control.

Item An investigation of means of mitigating alkali-silica reaction in hardened concrete (2013-05) Markus, Reid Patrick; Folliard, Kevin J.
This research project, funded by the Federal Highway Administration (FHWA Project DTFH61-02-C-0097), focuses mainly on alkali-silica reaction (ASR) and techniques to mitigate its effects in hardened concrete. A large portion of this report discusses the design and construction of an outdoor exposure site built at the University of Texas at Austin, where the goal was to cast field-representative concrete elements with laboratory precision and expose them to real environmental conditions. The elements were monitored for expansion and deterioration. At discrete expansion levels, a range of mitigation methods was implemented on the structures. After the concrete elements were treated, long-term monitoring was conducted to determine the best approach to provide effective suppression of alkali-silica reaction in the various element types.

Item Large-scale network analytics (2011-08) Song, Han Hee, 1978-; Zhang, Yin, doctor of computer science
Scalable and accurate analysis of networks is essential to a wide variety of existing and emerging network systems. Specifically, network measurement and analysis helps to understand networks, improve existing services, and enable new data-mining applications.
To support various services and applications in large-scale networks, network analytics must address the following challenges: (i) how to conduct scalable analysis in networks with a large number of nodes and links, (ii) how to flexibly accommodate various objectives from different administrative tasks, and (iii) how to cope with dynamic changes in the networks. This dissertation presents novel path analysis schemes that effectively address the above challenges in analyzing pair-wise relationships among networked entities. In doing so, we make the following three major contributions to large-scale IP networks, social networks, and application service networks. For IP networks, we propose an accurate and flexible framework for path property monitoring. Analyzing the performance side of paths between pairs of nodes, our framework incorporates approaches that perform exact reconstruction of path properties as well as approximate reconstruction. Our framework is scalable enough to design measurement experiments that span thousands of routers and end hosts. It is also flexible enough to accommodate a variety of design requirements. For social networks, we present scalable and accurate graph embedding schemes. Aimed at analyzing the pair-wise relationships of social network users, we present three dimensionality reduction schemes leveraging matrix factorization, count-min sketch, and graph clustering paired with spectral graph embedding. As concrete applications showing the practical value of our schemes, we apply them to the important social analysis tasks of proximity estimation, missing link inference, and link prediction. The results clearly demonstrate the accuracy, scalability, and flexibility of our schemes for analyzing social networks with millions of nodes and tens of millions of links. For application service networks, we provide a proactive service quality assessment scheme.
Analyzing the relationship between the satisfaction level of subscribers of an IPTV service and network performance indicators, our proposed scheme proactively (i.e., detecting issues before IPTV subscribers complain) assesses user-perceived service quality using performance metrics collected from the network. From our evaluation using network data collected from a commercial IPTV service provider, we show that our scheme is able to predict 60% of the service problems that customers complain about, with only 0.1% false positives.

Item Materials selection for concrete overlays (2011-08) Kim, Dong Hyun, 1984-; Ferron, Raissa; Fowler, David W.
Concrete overlays have been a rehabilitation method for many years. The method has been extensively utilized and studied in other states, but Texas is still at an initial stage of fully implementing it. The large volume of concrete highways in Texas makes bonded concrete overlays, unbonded concrete overlays, and whitetoppings very viable options. However, there is a lack of educational guidelines for pavement engineers on concrete overlay construction. This research presents the information gathered from a literature review, a condition survey, and an evaluation of existing concrete overlays in Texas. Laboratory research was also performed to support recommendations for materials selection and construction of concrete overlays. From these, guidelines for materials selection and construction methods that will assist future concrete overlay projects in Texas are presented.

Item Methods for nano-precise overlay in advanced in pick-and-place assembly (2019-08) Ajay, Paras; Sreenivasan, S. V.; Banerjee, Sanjay; Djurdjanovic, Dragan; Hall, Neal; Kulkarni, Jaydeep
We have explored two nanofabrication techniques in this thesis, Jet-and-Flash Imprint Lithography (J-FIL) and Nano-precise Modular Assembly of Pre-fabricated blocks (N-MAP), with a focus on nano-precise overlay in these techniques.
J-FIL currently uses template shape-and-size control, along with x-y-q substrate position control, to overlay template patterns to substrate patterns. This method is designed to mix-and-match with photolithography (PL) and corrects distortions over a limited field size of 26mm x 33mm. Template shape-and-size control, by itself, does not perform well for multi-field templates because of inter-field mechanical coupling, along with limits on the maximum lateral forces that can be applied on the template. We have explored new methods to achieve sub-5nm overlay in multi-field patterning, including thermal-actuation based substrate-distortion-control, and genetic algorithm-based template topology-optimization. Separately, we have realized that localized overlay errors can be introduced in J-FIL due to sub-micron sized particles, which are often present between the wafer and the wafer chuck. We have developed new compliant chuck designs to address this problem. While the above overlay control methods are developed for J-FIL, they are set up to ultimately be integrated into another technique – N-MAP. This technique has the potential to facilitate a variety of new nanofabrication applications, including 3D ICs and secure super-sized SoCs, by enabling nano-precise assembly of pre-fabricated device blocks (PFBs). N-MAP incorporates a vacuum-based superstrate (in place of the template in J-FIL), which is designed to pick up PFBs and to place them at precise locations on a product substrate. In this dissertation we present process flows for N-MAP, designs for the vacuum-based superstrate, as well as a simulation framework to analyze the pickup and placement mechanics of N-MAP. 
Additionally, we have explored various technology options for the source wafers in N-MAP, which need to contain a buried sacrificial layer to enable nano-precise pick-and-place assembly.

Item Nanometer VLSI design-manufacturing interface for large scale integration (2011-05) Yang, Jae-Seok; Pan, David Z.; Abraham, Jacob; Orshansky, Michael; Liu, Frank; Touba, Nur
As nanometer Very Large Scale Integration (VLSI) demands more transistor density to fabricate multi-cores and memory blocks in a limited die size, much research has been performed to sustain Moore's Law in two different ways: 2D geometric shrinking and 3D vertical wafer stacking. For geometric shrinking, nano patterning with 193nm lithography equipment is one of the most fundamental challenges beyond 22nm, while next-generation lithography, such as Extreme Ultra-Violet (EUV) lithography, still faces tremendous challenges for volume production in the near future. As a practical solution, Double Patterning Lithography (DPL) has become a leading candidate for the sub-20nm lithography process. Another approach for multi-core integration is 3D wafer stacking with Through Silicon Via (TSV). Computer-Aided-Design (CAD) approaches to enable robust DPL and TSV technology are the main focus of this dissertation. DPL poses new challenges for overlay and layout decomposition. Therefore, overlay-induced variation modeling and efficient decomposition for better manufacturability are in great demand. Since the variation of metal spacing caused by overlay results in coupling capacitance variation, we first model metal spacing variation with individual overlay sources. Then, all overlay sources are considered to determine the worst timing with coupling capacitance variation. A non-parallel pattern caused by overlay is converted to a parallel one with an equivalent spacing that yields the same delay, so that a traditional RC extraction flow can be applied.
Our experiments show that the delay variation due to overlay in DPL can be up to 9.1%, and that a well-decomposed layout can reduce the variability. For DPL layout decomposition, we propose a multi-objective and flexible framework that simultaneously handles stitch minimization, balanced density, and overlay compensation. We use a graph theoretic algorithm for minimum stitch insertion and balanced density. Additional decomposition constraints for overlay compensation are obtained by Integer Linear Programming (ILP). Robust contact decomposition can be obtained with additional constraints. With these constraints, global decomposition is performed using a modified Fiduccia-Mattheyses (FM) graph partitioning algorithm. Experimental results show that the proposed framework is highly scalable and fast: we can decompose all 15 benchmark circuits in five minutes in a density balanced fashion, while an ILP-based approach can finish only the smallest five circuits. In addition, we can remove more than 95% of the timing variation induced by overlay for tested structures. Three-dimensional integration has new manufacturing and design challenges such as device variation due to TSV induced stress and timing corner mismatch between different stacked dies. Since TSV fill material and silicon have different Coefficients of Thermal Expansion (CTE), TSV causes silicon deformation due to the different temperatures during chip manufacturing and operation. Therefore, the systematic variation due to TSV induced stress should be considered for robust 3D IC design. We propose systematic TSV stress aware timing analysis and show how to optimize layout for better performance. First, a stress contour map with an analytical radial stress model is generated. Then, the tensile stress is converted to hole and electron mobility variations depending on geometric relations between TSVs and transistors.
Mobility variation aware cell library and netlist are generated and incorporated in an industrial timing engine for 3D-IC timing analysis. TSV stress induced timing variations can be as much as 10% for an individual cell. As an application for layout optimization, we can exploit the stress-induced mobility enhancement to improve timing on critical cells. We show that stress-aware perturbation could reduce cell delay by up to 14.0% and critical path delay by 6.5% in our test case. Three-dimensional Clock Tree Synthesis (3D CTS) is one of the main design difficulties in 3D integration because the clock network spreads over all tiers. In 3D CTS, timing corner mismatch between tiers arises because each tier is manufactured in an independent process. Therefore, inter-die variation should be considered when analyzing and optimizing paths that spread over several tiers in 3D CTS. In addition, mobility variation of a clock buffer due to stress from TSV can cause unexpected skew, which degrades overall chip performance. Therefore, we propose clock period optimization that considers both timing corner mismatch and TSV induced stress. In our experiments, we show that our clock buffer tier assignment reduces clock period variation by up to 34.2%, and that most of the stress-induced skew can be removed by our stress-aware CTS. Overall, we show that the performance gain can be up to 5.7% with the proposed CTS algorithm. As technology scaling continues toward 14nm and 3D-integration, this dissertation addresses several key issues in the design-manufacturing interface, and proposes unified analysis and optimization techniques for effective design and manufacturing integration.