Learning-based analytical cross-platform performance and power prediction
Under growing complexity and time-to-market pressures of modern computer systems, agile co-development of both hardware and software is an essential aspect of system-level design. In order to meet stringent performance and power budgets, hardware and software designers are often required to explore large design spaces. With increasing complexity, fast and accurate performance and power prediction at early design stages thus becomes a key challenge. Existing analytical modeling approaches are fast, but struggle to capture all the dynamic effects in modern architectures, which inherently limits their accuracy. By contrast, simulation-based techniques can accurately capture detailed low-level hardware interactions, but are often prohibitively slow. In this dissertation, we explore the question of whether the benefits of analytical and simulation-based approaches can be combined in a novel way to improve the tradeoff between speed and accuracy. We present a novel learning-based approach for fast and accurate estimation of software performance and power consumption on a target hardware platform. We leverage the observation that the performance and power consumption of software running on different platforms are closely correlated, and introduce the concept of cross-platform prediction. We demonstrate that it is possible to construct models that can accurately predict the performance and power of an application on a target platform while running the same application on a completely different host. We utilize state-of-the-art machine learning techniques to synthesize analytical proxy models that can accurately capture the performance and power correlation between platforms to perform such cross-platform predictions. We first introduce a coarse-grained cross-platform performance prediction methodology that accurately estimates the runtime of whole programs on a target platform from hardware counter measurements taken on a host. 
Results show that our approach achieves on average more than 90% accuracy at speeds of over 800 MIPS. We further develop a fine-grained prediction methodology that is able to predict the detailed temporal variations of both target performance and power. Compared to the coarse-grained approach, the fine-grained prediction achieves better average accuracy of over 97% at speeds of over 500 MIPS, but requires source code instrumentation. We finally extend our fine-grained approach and propose a sampling-based, binary-level cross-platform prediction method that eliminates the application source code requirement while at the same time improving simulation speed. At similar accuracy, it is up to 6X faster than the instrumentation-based fine-grained prediction approach.
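The core idea of cross-platform prediction, learning a mapping from host-side measurements to target-side behavior, can be illustrated with a minimal sketch. This is not the dissertation's method: it swaps in ordinary least squares for the machine learning techniques mentioned above, and the counter names, the training numbers, and the helper functions `solve`, `fit`, and `predict` are all hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(features, targets):
    """Least-squares fit of targets ~ w0 + w . features via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Apply the fitted proxy model to one host-side measurement vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Hypothetical training set (made-up numbers, not measured data):
# (host cycles, host cache misses) per program, in millions.
host_counters = [(100.0, 2.0), (250.0, 1.0), (400.0, 8.0), (150.0, 5.0), (300.0, 3.0)]
# Synthetic target runtimes in ms, generated from an assumed linear relation
# so that the fit can be checked against known coefficients.
target_ms = [0.5 * c + 4.0 * m + 10.0 for c, m in host_counters]

weights = fit(host_counters, target_ms)
# Estimate the target runtime of an unseen program from host counters only.
estimate = predict(weights, (200.0, 4.0))
```

Because the synthetic targets are exactly linear in the host counters, the fit recovers the generating coefficients up to floating-point error. Real cross-platform behavior is not expected to be this simple, which is why the abstract refers to more capable learning techniques.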