Constructing adaptable and scalable synthetic benchmarks for microprocessor performance evaluation
MetadataShow full item record
Benchmarks set standards for innovation in computer architecture research and industry product development. Consequently, it is of paramount importance that the benchmarks used in computer architecture research and development are representative of real-world applications. However, composing such representative workloads poses practical challenges to application analysis teams and benchmark developers - (1) Benchmarks that are standardized are open-source whereas applications of interest are typically proprietary, (2) Benchmarks are rigid, measure single-point performance, and only represent a sample of the application behavior space (3) Benchmark suites take several years to develop, but applications evolve at a faster rate, and (4) Benchmarks geared towards temperature and power characterization are difficult to develop and standardize. The objective of this dissertation is to develop an adaptive benchmark generation strategy to construct synthetic benchmarks to address these benchmarking challenges. We propose an approach for automatically distilling key hardware-independent performance attributes of a proprietary workload and capture them into a miniature synthetic benchmark clone. The advantage of the benchmark clone is that it hides the functional meaning of the code, but exhibits similar performance and power characteristics as the target application across a wide range of microarchitecture configurations. Moreover, the dynamic instruction count of the synthetic benchmark clone is substantially shorter than the proprietary application, greatly reducing overall simulation time -- for the SPEC CPU 2000 suite, the simulation time reduction is over five orders of magnitude compared to the entire benchmark execution. We develop an adaptive benchmark generation strategy that trades off accuracy to provide the flexibility to easily alter program characteristics. The parameterization of workload metrics makes it possible to succinctly describe an application's behavior using a limited number of fundamental program characteristics. This provides the ability to alter workload characteristics and construct scalable benchmarks that allows researchers to explore a wider range of the application behavior space, conduct program behavior studies, and model emerging workloads. The parameterized workload model is the foundation for automatically constructing power and temperature oriented synthetic workloads. We show that machine learning algorithms can be effectively used to search the application behavior space to automatically construct benchmarks for evaluating the power and temperature characteristics of a computer architecture design. The need for a scientific approach to construct synthetic benchmarks, to complement application benchmarks, has long been recognized by the computer architecture research community, and this dissertation work is a significant step towards achieving that goal.