Policy architecture for distributed storage systems
MetadataShow full item record
Distributed data storage is a building block for many distributed systems such as mobile file systems, web service replication systems, enterprise file systems, etc. New distributed data storage systems are frequently built as new environment, requirements or workloads emerge. The goal of this dissertation is to develop the science of distributed storage systems by making it easier to build new systems. In order to achieve this goal, it proposes a new policy architecture, PADS, that is based on two key ideas: first, by providing a set of common mechanisms in an underlying layer, new systems can be implemented by defining policies that orchestrate these mechanisms; second, policy can be separated into routing and blocking policy, each addresses different parts of the system design. Routing policy specifies how data flow among nodes in order to meet performance, availability, and resource usage goals, whereas blocking policy specifies when it is safe to access data in order to meet consistency and durability goals. This dissertation presents a PADS prototype that defines a set of distributed storage mechanisms that are sufficiently flexible and general to support a large range of systems, a small policy API that is easy to use and captures the right abstractions for distributed storage, and a declarative language for specifying policy that enables quick, concise implementations of complex systems. We demonstrate that PADS is able to significantly reduce development effort by constructing a dozen significant distributed storage systems spanning a large portion of the design space over the prototype. We find that each system required only a couple of weeks of implementation effort and required a few dozen lines of policy code.