Approximate high-level synthesis of quality and energy optimized hardware processors
Approximate computing is a technique that exploits trade-offs between energy/performance and the quality of computed results. Such techniques have been explored at various design levels for inherently error-tolerant applications, such as image/video processing and machine learning. At the hardware level, various approximate components, such as arithmetic and logic units (ALUs) or memories, have been proposed for building general-purpose processors or custom hardware designs. Existing work on designing custom approximate hardware processors has been mostly ad hoc or has relied on expensive iterative simulation and re-synthesis for design space exploration. In this dissertation, the focus is on a novel approximate high-level synthesis (AHLS) approach that utilizes approximate operators in synthesizing an energy-optimal register-transfer level (RTL) design from its high-level C description under overall quality constraints at the design's outputs. Effective AHLS requires fast and accurate quality and energy models together with an optimization technique that efficiently finds an optimal design. Quality effects of hardware approximations strongly depend on input data. Existing work either uses over-simplified models or relies on time-consuming simulations. By contrast, in this work, a statistical formulation is employed to capture input dependency and analytically estimate quality using one-time profiling only. Such a quality analysis is first presented for fine-grain bit length optimization of individual operations in a given design. The analysis approach is then extended to support general hardware approximations using arbitrary adder and multiplier designs proposed in the literature. Energy savings due to approximations stem from reductions in both switching activity and delay. The latter can be exploited for voltage scaling, but existing approaches do not fully exploit such opportunities.
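As a rough illustration of estimating quality analytically from one-time profiling, the following Python sketch profiles each operand of an adder once and then derives the mean and variance of the truncation error for any number of dropped bits without re-running the design. This is not the dissertation's formulation: the function names, the empirical probability mass function as the profiled statistic, the choice of least-significant-bit truncation as the approximation, and the independence assumption between operands are all illustrative assumptions.

```python
import random
from collections import Counter


def profile(samples):
    """One-time profiling: empirical probability mass function of an operand."""
    counts = Counter(samples)
    n = len(samples)
    return {value: count / n for value, count in counts.items()}


def truncation_error_stats(pmf, k):
    """Mean and variance of the error caused by zeroing the k lowest bits."""
    mask = (1 << k) - 1
    mean = sum(p * (v & mask) for v, p in pmf.items())
    second_moment = sum(p * (v & mask) ** 2 for v, p in pmf.items())
    return mean, second_moment - mean ** 2


def adder_error_stats(pmf_a, pmf_b, k):
    """Error statistics of a + b when both operands are truncated by k bits.

    Assumption (hypothetical): the operands are statistically independent,
    so their error variances add. The mean adds regardless of independence.
    """
    mean_a, var_a = truncation_error_stats(pmf_a, k)
    mean_b, var_b = truncation_error_stats(pmf_b, k)
    return mean_a + mean_b, var_a + var_b


def simulate_adder_mean_error(a_samples, b_samples, k):
    """Reference: mean error obtained by simulating every input pair."""
    keep = ~((1 << k) - 1)
    errors = [(a + b) - ((a & keep) + (b & keep))
              for a, b in zip(a_samples, b_samples)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    random.seed(0)
    a = [random.randrange(256) for _ in range(2000)]
    b = [random.randrange(64) for _ in range(2000)]
    pmf_a, pmf_b = profile(a), profile(b)  # profiled once
    for k in (1, 2, 3, 4):                 # reused for every candidate bit length
        est_mean, est_var = adder_error_stats(pmf_a, pmf_b, k)
        sim_mean = simulate_adder_mean_error(a, b, k)
        print(k, est_mean, est_var, sim_mean)
```

The point of the sketch is that the profiling step runs once, while each candidate bit length is then evaluated from the stored statistics alone, which is what makes such an estimate cheaper than repeated simulation.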
To include voltage scaling in energy savings, a novel approach is presented that estimates the performance impact of approximations while taking into account the tight interactions with existing scheduling and binding tasks in an overall high-level synthesis framework. The quality, performance, and energy estimation methods presented in this dissertation are further combined with a novel AHLS-specific optimization technique and a heuristic solver that efficiently finds near-optimal solutions in a breadth-first manner. Results show that quality estimation is 28 times faster than simulation-based approaches, and that up to 24.5 % higher energy savings are achieved compared to approaches that only consider switching activity. The heuristic solver is able to find Pareto-optimal solutions within 0.1 % of those found by an exhaustive search, while being up to 168 times faster. The models and AHLS flow are further extended to incorporate a novel loop approximation technique, which mainly targets performance improvements. Loops are often the most performance-critical application code structures, and within a loop, different iterations can have a different impact on output quality due to the inherent data dependency of approximations. Exploiting iteration-wise data variations, an approach is presented that clusters loop iterations according to data statistics and then applies different approximations in each cluster to maximize performance gains. Performance improvements from such selective loop approximation are shown to be up to 76 %, with up to 21.7 % stemming from clustering.
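To illustrate why accounting for delay reductions, and not only switching activity, can increase energy savings, the following Python sketch converts the timing slack gained from an approximation into a lower supply voltage. It is a minimal sketch under stated assumptions and not the dissertation's model: it assumes dynamic energy proportional to activity times the square of the supply voltage, the textbook alpha-power delay model, and hypothetical parameter values (nominal voltage 1.0 V, threshold voltage 0.3 V, exponent 1.3). All function names are illustrative.

```python
def gate_delay(v, v_t=0.3, alpha=1.3):
    """Alpha-power-law delay (arbitrary units) at supply voltage v.

    Assumption: v_t and alpha are hypothetical technology parameters.
    """
    return v / (v - v_t) ** alpha


def scaled_voltage(delay_ratio, v_nom=1.0, v_t=0.3, alpha=1.3, iters=60):
    """Lowest supply voltage that still meets the original timing.

    delay_ratio is the approximate design's critical-path delay divided by
    the exact design's delay at nominal voltage (1.0 means no speed-up).
    Delay decreases monotonically with voltage, so bisection applies.
    """
    target = gate_delay(v_nom, v_t, alpha)
    lo, hi = v_t + 1e-6, v_nom
    for _ in range(iters):
        mid = (lo + hi) / 2
        if delay_ratio * gate_delay(mid, v_t, alpha) <= target:
            hi = mid   # timing met, try a lower voltage
        else:
            lo = mid   # too slow, raise the voltage
    return hi


def relative_energy(activity_ratio, delay_ratio, use_voltage_scaling=True,
                    v_nom=1.0, v_t=0.3, alpha=1.3):
    """Dynamic energy relative to the exact design at nominal voltage."""
    v = v_nom
    if use_voltage_scaling:
        v = scaled_voltage(delay_ratio, v_nom, v_t, alpha)
    return activity_ratio * (v / v_nom) ** 2


if __name__ == "__main__":
    # Hypothetical approximation: 10 % less switching, 20 % shorter critical path.
    print("activity only:     ", relative_energy(0.9, 0.8, use_voltage_scaling=False))
    print("with voltage scale:", relative_energy(0.9, 0.8))
```

Under this simplified model, an approximation that shortens the critical path yields energy savings beyond its reduction in switching activity, because the freed slack allows the same timing to be met at a lower voltage.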