Large deviations analysis of scheduling policies for a web server
MetadataShow full item record
With increasing demand and availability of bandwidth resources, there has been tremendous growth in the scale and speed of web servers. In web servers, scheduling plays an important role in resource allocation (for instance, bandwidth allocation, processor allocation, etc). However, as the scale of a system increases so does the number of activities/events in the system (e.g., job arrivals), as a consequence of which the analysis of scheduling becomes increasingly harder. In particular, the possible ways in which scheduling failure (e.g., queue overflow, excessively large delay, instability of a system) can occur becomes increasingly greater, thus making it more difficult to understand the behavior and develop design rules for scheduling algorithms. However, a well-known observation from large devi viations theory that large scale systems fails in a “most likely way” can potentially be used to simplify the design and analysis of scheduling. In this thesis, we study the implications and applications of this effect on scheduling in a web server accessed by a large number of sources. We analyze the delay distribution of scheduling policies for web servers under a many sources large deviation regime which models web servers in a large scale system well. Due to the difficulties brought on considering a large number of sources, only a small number of scheduling policies, such as First-Come-First-Serve (FCFS), General-ProcessorSharing (GPS), and Priority Queueing policies have been analyzed under the many sources regime. In particular, in a single queue single server setup the delay characteristics of only FCFS, Shortest-Job-First (SJF), and Longest-Job-First (LJF) has been analyzed. In this thesis, we study the Two-Dimensional-Queueing (2DQ) framework, a unifying queueing framework that allows the identification of the “most likely way” in which delay occurs, to analyze the delay of various unexplored scheduling policies. In conjunction with the 2DQ framework, we develop a new “cycle based” technique for understanding the large deviations tail probability of more complex policies. Using the combination of the 2DQ framework and the cycle based analysis, we first analyze two interesting scheduling policies, i.e., Shortest-Remaining-Processing-Time (SRPT) policy (which is mean delay optimal) and Processer-Sharing (PS) policy (which is a “fair” policy). We derive the asymptotic delay distributions (rate functions) of both policies and study their behavior across job sizes. Next, we address three problems in implementing the aforementioned scheduling policies: (i) end receivers may have bandwidth constraints that are not taken account in SRPT, (ii) the remaining processing time information might not be available to the web-server, and (iii) most actual implementations are variants of SRPT to reflect other implementation constraints and/or to jointly optimize other metrics in addition to delay, i.e., jitter, fairness, etc. To address these, we first develop finite-SRPT that takes into account the bandwidth constraint at the end receiver, and show that the policy shifts between SRPT and a PS-like policy depending on the bandwidth constraint. Second, we study the Least-Attained-Service (LAS) policy which is viewed as a good substitute for SRPT when the remaining job size is not available and we analyze the penalty associated with not using the remaining size information directly. Lastly, we analyze a class of scheduling policies known as SMART that contains many variants of SRPT with different fairness properties and show that all policies in the class have the same tail probability of delay across job sizes for a many sources regime. The results of this thesis facilitate the understanding of various scheduling policies under the many sources regime and provides an analytical queueing framework that can be used to understand other scheduling policies.