Browsing by Subject "Voltage noise"
Item Characterization of voltage noise in big, small and single-ISA heterogeneous systems (2013-05)
Garg, Ankita; John, Lizy Kurian; Reddi, Vijay Janapa
Sensitivity of the microprocessor to voltage fluctuations is becoming a major concern with the growing emphasis on designing power-efficient microprocessors. Voltage fluctuations that exceed a certain threshold cause "emergencies" that can lead to timing errors in the processor, thus risking reliability. To guarantee correctness under such conditions, large voltage guardbands are employed, at the cost of reduced performance and wasted power. Trends in microprocessor technology indicate that worst-case operating voltage margins are not sustainable. Since voltage emergencies occur only infrequently, resilient architectures with aggressive guardbands are needed. However, to enable the exploration of the design space of resilient processors, it is important to have a deep understanding of the characteristics of voltage noise in different system configurations. Prior research in this area has mostly focused on systems with very few cores. Given the increasing relevance of large multi-core systems, this thesis presents a detailed characterization of voltage noise on chip multiprocessors (CMPs) with a large number of cores. The data indicates that while the worst-case voltage droop increases with the number of cores, the frequency of occurrence of the droops is not greatly impacted, emphasizing the feasibility of employing resilient microarchitectures with aggressive voltage margins. The thesis also presents a comparative study of voltage noise in CMPs consisting of either high-performance out-of-order cores or power-efficient in-order cores. The study highlights that the out-of-order cores experience much larger voltage variations than the in-order cores, but offer a clear advantage in terms of performance.
Experiments indicate that in-order configurations that offer performance equivalent to the out-of-order cores result in a large energy-delay product, indicating the trade-offs involved in designing for performance, power and reliability. The thesis also presents a study of voltage noise in single-ISA heterogeneous configurations, to highlight the benefits of such systems in lowering the worst-case voltage margins, which improves both performance and power. The experimental results indicate that the worst-case voltage droop in such heterogeneous systems lies between that of the out-of-order and in-order cores, and that such systems provide reasonable power efficiency and performance. Further, the work highlights the importance of exploring the design space of heterogeneous systems with reliability as an important design criterion.

Item Guardband management in heterogeneous architectures (2016-12)
Leng, Jingwen; Janapa Reddi, Vijay; John, Lizy Kurian; Erez, Mattan; Fussell, Donald S.; Bose, Pradip
Performance and power efficiency are two of the most critical aspects of computing systems. Moore's law (the doubling of transistors in a chip every 18 months), coupled with Dennard scaling, enabled a synergy between device, circuit, microarchitecture, and architecture to drive improvements in those two critical aspects. With the recent end of Dennard scaling, on-chip transistor count continues to increase, but the smaller transistor size no longer provides performance-per-power gains. This divergence between rising transistor density and diminishing power-efficiency gains has shifted the processor design paradigm from the single-core CPU architecture to the multicore or manycore CPU architecture, and eventually to the heterogeneous architecture. Besides performance and power efficiency, reliability is another crucial computing requirement.
However, regardless of how the architecture evolves, processors still need to trade off a significant portion of performance or power efficiency to ensure reliability. When running on silicon, processors experience continuously varying operating conditions, such as process, voltage, and temperature (PVT) variation. All of these variations can reduce circuit speed and cause timing errors. The traditional approach to ensuring reliable operation in the presence of possible worst-case conditions is to statically assign a large-enough voltage margin (or guardband). But such an approach leads to wasted energy, because the worst-case condition rarely occurs, and the processor could have operated at a lower voltage most of the time [36, 48, 77]. We need to actively manage the voltage guardband to fully unlock the efficiency potential of heterogeneous architectures. However, guardband management in heterogeneous architectures is a particularly challenging problem that has not been studied by prior work. On one hand, as transistors become smaller, the impact of PVT variation relative to the nominal voltage becomes more significant [60]. On the other hand, increasing core count in the processor results in a larger die area and a higher peak power consumption, both of which complicate and amplify the impact of PVT variation. To this end, this thesis studies cross-layer mechanisms that span from the circuit to the (micro)architecture to the software runtime for managing the guardband in the heterogeneous architecture. Most prior work has studied guardband management mechanisms only at the circuit or (micro)architecture level. In comparison, my colleagues and I studied cross-layer mechanisms that require lower hardware design complexity and incur less implementation overhead because the software takes a major role in guardband management.
Moreover, the cross-layer mechanisms alleviate the need for (micro)architecture-specific optimizations, which makes them scalable solutions in the current era of rapidly evolving heterogeneous architectures. This thesis performs such a study in the manycore GPU architecture, which is a representative heterogeneous architecture and has been widely adopted in mainstream computing. The first part of the thesis focuses on the modeling and characterization of PVT variation in the GPU architecture. We first perform a thorough characterization of the underlying PVT variation's impact on the voltage guardband based on hardware measurements. After identifying voltage variation (noise) as the most challenging and necessary factor for guardband management, we study methodologies for accurately modeling voltage noise in the manycore architecture. The insights on how the circuit, microarchitecture, and program interact with each other to affect the PVT variation lay the foundations for the cross-layer guardband management mechanisms studied in this thesis. The second part of this thesis studies two guardband-management techniques and demonstrates that they can significantly improve the GPU architecture's energy efficiency. We first study how to improve the worst-case guardbanding design by performing voltage smoothing, which effectively mitigates large voltage noise and achieves significant energy savings with a smaller guardband requirement. We then study how to adapt to the program-specific guardband requirement to fully unlock the current GPU's efficiency potential. We propose a mechanism called predictive guardbanding, in which the program directly predicts its voltage requirement. The proposed design leverages cross-layer optimization to minimize hardware complexity and overhead. The last part of this thesis studies reliability optimization when the prediction in predictive guardbanding fails with an unexpected error margin.
We advocate maintaining system-level reliability, and we propose a design paradigm called asymmetric resilience, whose principle is to develop the reliable heterogeneous CPU-GPU system centering around the CPU. This generic design paradigm relieves the GPU of reliability optimization. We present design principles and practices for a heterogeneous system that adopts such a design paradigm. Following the principles of asymmetric resilience, we demonstrate how to use the CPU architecture to handle GPU execution errors, which lets the GPU focus on typical-case operation for better energy efficiency. We explore the design space and demonstrate that asymmetric resilience can be used as the safety-net mechanism in predictive guardbanding with reasonable overhead.