Accelerating virtualization of accelerators

Yu, Hangchen

Date

2021-01-21

Abstract

The use of specialized accelerators is among the most promising paths to better energy efficiency for computationally heavy workloads. However, current software and system support for accelerators is limited, and no production-ready solution yet allows accelerators to be accessed or shared efficiently in domains such as cloud infrastructure and kernel space. Complex hardware and proprietary software stacks inhibit efficient accelerator virtualization. We observe that practical virtualization must interpose at either the topmost (user API) or the bottom-most (hardware) interface, and that virtualization based on interposing intermediate stack layers is impractical.

Based on these observations, this thesis first presents AvA (Accelerated Virtualization of Accelerators), which exposes practical virtual accelerators in the cloud with strong virtualization properties such as isolation, compatibility, and consolidation. AvA is the first system to show general techniques for API remoting that retain both hypervisor interposition and close-to-native performance, and the first system for automatic construction of virtual accelerator stacks with hypervisor mediation for arbitrary accelerators. We used AvA to virtualize nine accelerators and eleven framework APIs, with orders-of-magnitude lower programming effort than is required to construct hand-built virtualization support. These accelerators include seven for which no virtualization support had previously been explored.

Building on AvA, this thesis presents Akatha (Accelerating Kernel Access to Hardware Acceleration), which uses automation to reduce developer effort in building efficient access to accelerators for kernel-level work (e.g., file system encryption or packet processing). Akatha constructs API-remoting-based kernel accelerator stacks with code generation, leveraging kernel knowledge unavailable in user space to improve performance and resource management. This includes transparently modifying virtual memory mappings to avoid data transfer between kernel and user space, and providing a framework and mechanisms to manage contention between user space and the kernel for accelerator devices. We evaluated Akatha with a range of workloads, showing promising opportunities for OS acceleration.