Browsing by Subject "High Performance Computing"
Now showing 1 - 3 of 3
Item: Memory Bandwidth and System Balance in HPC Systems (2016-11-16)
McCalpin, John David

The "Attack of the Killer Micros" began approximately 25 years ago as microprocessor-based systems began to compete with supercomputers (in some application areas). It became clear that peak arithmetic rate was not an adequate measure of system performance for many applications, so in 1991 Dr. McCalpin introduced the STREAM Benchmark to estimate "sustained memory bandwidth" as an alternative performance metric. STREAM apparently embodied a good compromise between generality and ease of use, and it quickly became the de facto standard for measuring and reporting sustained memory bandwidth in High Performance Computing systems (a minimal sketch of the STREAM "triad" kernel appears after this listing). Since the initial "attack", Moore's Law and Dennard Scaling have led to astounding increases in the computational capabilities of microprocessors. The technology behind memory subsystems has not experienced comparable performance improvements, causing sustained memory bandwidth to fall behind. This talk reviews the history of the changing balances between computation, memory latency, and memory bandwidth in deployed HPC systems, then discusses how the underlying technology changes led to these market shifts. Key metrics are the exponentially increasing relative performance cost of memory accesses and the massive increases in concurrency required to obtain increased memory throughput. New technologies (such as stacked DRAM) allow more pin bandwidth per package, but do not address the architectural issues that make high memory bandwidth expensive to support. Potential disruptive technologies include near-memory processing and application-specific system implementations, but all foreseeable approaches fail to provide software compatibility with current architectures. In the absence of practical alternatives, we can expect near-term systems to become increasingly complex and unbalanced, with constant or slightly increasing per-node prices. These systems will deliver the best rate of performance improvement for workloads with increasingly high compute intensity and increasing available concurrency.

Item: The Path to Exascale (2019)
Stanzione, Dan

Item: Trends in System Cost and Performance Balances and Implications for the Future of HPC (2015-11-16)
McCalpin, John David

For the last decade, HPC systems have been dominated by clusters of two-socket commodity x86 servers, typically equipped with a non-commodity high-performance interconnect. Trends in lifecycle costs and prices, hardware technology, several measures of CPU and memory performance, and application performance characteristics are presented from several non-traditional perspectives. The evolution of the various "balances" of these systems over time is discussed, both in the context of the interaction of application performance with the changing hardware and in the context of the broader economic environment. Several serious obstacles to maintaining previous performance growth rates are identified and discussed, and it is argued that these are better viewed as architectural and market issues rather than as fundamental technology issues. The talk further argues that overcoming these obstacles will require a fundamentally different approach to hardware architecture and programming languages, as well as to system configuration, deployment, and allocation strategies.
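For readers unfamiliar with STREAM, the sketch below shows the flavor of its "triad" kernel and how a sustained-bandwidth figure is derived from it. This is a minimal illustration under stated assumptions, not the official benchmark: real STREAM also runs the Copy, Scale, and Add kernels, validates results, repeats trials, and imposes strict run rules (arrays much larger than the last-level cache); the array size and timing approach here are placeholder choices.

    /* Minimal sketch of a STREAM-style "triad" bandwidth measurement.
     * Illustrative only; not the official STREAM benchmark code.
     * N is a placeholder size: it must far exceed the last-level cache. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 20000000L   /* 20M doubles = 160 MB per array */

    int main(void)
    {
        double *a = malloc((size_t)N * sizeof *a);
        double *b = malloc((size_t)N * sizeof *b);
        double *c = malloc((size_t)N * sizeof *c);
        if (!a || !b || !c) return 1;

        /* Initialize (also faults in the pages before timing). */
        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);

        const double scalar = 3.0;
        for (long i = 0; i < N; i++)      /* Triad: a = b + scalar*c */
            a[i] = b[i] + scalar * c[i];

        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) * 1e-9;

        /* Triad touches three arrays per iteration: two reads, one write. */
        double bytes = 3.0 * (double)N * sizeof(double);
        printf("Triad bandwidth: %.1f GB/s\n", bytes / secs / 1e9);

        /* Use the result so the timed loop cannot be optimized away. */
        return a[N / 2] > 0.0 ? 0 : 1;
    }

The abstract's concurrency point can also be made concrete with Little's Law (concurrency = latency x bandwidth): sustaining, say, 200 GB/s against a 100 ns memory latency requires 200e9 B/s x 100e-9 s = 20,000 bytes in flight, i.e. roughly 312 outstanding 64-byte cache-line transfers. These numbers are illustrative, not taken from the talk.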