Analysis and design of resource allocation policies for cloud-based computing systems supporting soft real-time applications
Cloud-based computing infrastructure can provide an efficient means to support real-time applications with compute and/or communication deadlines, e.g., virtualized base station processing and collaborative video conferencing. In many cases, such applications can tolerate occasional deadline violations without substantially impacting their Quality of Service (QoS). A fundamental problem in such systems is deciding how to allocate shared resources so as to meet applications' QoS requirements. A simple framework to address this problem is to (1) dynamically prioritize users as a possibly complex function of their deficits (the gap between achieved and required QoS), and (2) allocate resources so as to expedite users assigned higher priority. In the first part of this dissertation, we focus on a general class of systems using such priority-based resource allocation. In this setting we characterize the set of feasible QoS requirements. We then consider "simple" weighted Largest Deficit First (w-LDF) prioritization policies, where users with higher weighted deficits are given higher priority. We give an inner bound for the feasible set of QoS requirements under w-LDF policies, and characterize its geometry under an additional monotonicity assumption. Additional insights on the optimality of LDF/hierarchical-LDF are also discussed. In the second part of this dissertation, we consider a specific class of computing systems having multiple uniform resources. We develop a general outer bound on the feasible QoS region for non-clairvoyant resource allocation policies, and study the efficiency and near-optimality of two natural resource allocation policies: (1) priority-based greedy task scheduling for applications with variable workloads, and (2) priority-based task selection and optimal scheduling for applications with deterministic workloads. Analysis and simulations show substantial resource savings for such policies over reservation-based designs.
We also discuss user/stream management for computing systems supporting soft real-time users. Overall, the main contribution of this dissertation is a theoretical study of the efficiency and optimality of simple deficit-based resource allocation policies for systems supporting periodically generated but stochastic workloads that require soft guarantees on completion times.
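
The two-step framework described in the abstract (prioritize users by weighted deficit, then expedite the higher-priority users) can be illustrated with a minimal Python sketch. It is not taken from the dissertation; it rests on several simplifying assumptions: each user generates one task per period, a user's QoS requirement is a target long-run fraction of on-time completions, the deficit grows by that target each period and drops by 1 when the task completes on time, and the system can complete exactly `capacity` tasks per period. The function names `wldf_order` and `run_period` are hypothetical.

```python
def wldf_order(deficits, weights):
    """Return user indices sorted by weighted deficit, largest first.

    Ties keep the original user order because Python's sort is stable.
    """
    return sorted(
        range(len(deficits)),
        key=lambda i: weights[i] * deficits[i],
        reverse=True,
    )


def run_period(deficits, weights, required, capacity):
    """Simulate one period under the assumed model.

    The `capacity` highest-priority users complete their task on time.
    Every user's deficit increases by its required completion rate and
    decreases by 1 if its task was completed in this period.
    """
    order = wldf_order(deficits, weights)
    served = set(order[:capacity])
    new_deficits = [
        d + q - (1.0 if i in served else 0.0)
        for i, (d, q) in enumerate(zip(deficits, required))
    ]
    return order, new_deficits


# Three users, two tasks can be completed per period (assumed values).
order, deficits = run_period(
    deficits=[0.5, 0.2, 0.4],
    weights=[1.0, 3.0, 1.0],
    required=[0.9, 0.8, 0.7],
    capacity=2,
)
print(order)     # priority order for this period
print(deficits)  # updated deficits after the period
```

In this toy run, user 1 has the smallest raw deficit but the largest weight, so it is ranked first. User 2 is not served and its deficit grows, which raises its priority in later periods.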