xBFT : Byzantine fault tolerance with high performance, low cost, and aggressive fault isolation

dc.contributor.advisor Dahlin, Michael
dc.creator Kotla, Ramakrishna Rao, 1976-
dc.date.accessioned 2012-09-24T18:21:33Z
dc.date.available 2012-09-24T18:21:33Z
dc.date.created 2008-05
dc.date.issued 2012-09-24
dc.identifier.uri http://hdl.handle.net/2152/17979
dc.description.abstract We increasingly rely on online services to store, access, share, and disseminate critical information from anywhere and at all times. Such services include email, digital storage, photos, video, and health and financial services. With increasing evidence of non-fail-stop failures in practical systems, Byzantine fault tolerant (BFT) state machine replication is becoming increasingly attractive for building highly reliable services that tolerate such failures. However, existing BFT techniques fall short of providing high availability, high performance, and long-term data durability guarantees at a competitive replication cost. In this dissertation, we present BFT replication techniques that facilitate the design and implementation of such highly reliable services by providing high availability, high performance, and high durability at a competitive replication cost (hardware, software, network, management). First, we propose CBASE, a BFT state machine replication architecture that leverages application-level parallelism to improve the throughput of the replicated system by identifying and executing independent requests concurrently. Traditional BFT techniques based on state machine replication provide high availability and security but fail to provide high throughput. This limitation stems from the fundamental assumption of generalized state machine replication techniques that all replicas execute requests sequentially in the same total order to ensure consistency across replicas. Our architecture thus provides a general way to exploit application parallelism in order to provide high throughput without compromising correctness. Second, we present Zyzzyva, an efficient BFT agreement protocol that uses speculation to significantly reduce the performance overhead and replication cost of BFT state machine replication. 
In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima. Third, we design and implement SafeStore, a distributed storage system designed to maintain long-term data durability despite conventional hardware and software faults, environmental disruptions, and administrative failures caused by human error or malice. The architecture of SafeStore is based on fault isolation, which SafeStore applies aggressively along administrative, physical, and temporal dimensions by spreading data across autonomous storage service providers (SSPs). SafeStore also performs an efficient end-to-end audit of SSPs to detect data loss quickly and improve data durability by reducing MTTR. SafeStore offers durable storage with cost, performance, and availability competitive with traditional storage systems. We evaluate these techniques by implementing BFT replication libraries and further demonstrate the practicality of these approaches by implementing an NFS-based replicated file system (CBASE-FS) and a durable storage system (SafeStore-FS).
dc.format.medium electronic
dc.language.iso eng
dc.rights Copyright © is held by the author. Presentation of this material on the Libraries' web site by University Libraries, The University of Texas at Austin was made possible under a limited license grant from the author who has retained all copyrights in the works.
dc.subject.lcsh Fault-tolerant computing
dc.subject.lcsh High performance computing
dc.title xBFT : Byzantine fault tolerance with high performance, low cost, and aggressive fault isolation
dc.description.department Electrical and Computer Engineering
dc.type.genre Thesis
dc.type.material text
thesis.degree.department Electrical and Computer Engineering
thesis.degree.discipline Electrical and Computer Engineering
thesis.degree.grantor The University of Texas at Austin
thesis.degree.level Doctoral
thesis.degree.name Doctor of Philosophy
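
Illustrative note on the abstract: the idea attributed to CBASE, identifying independent requests in an agreed total order and executing them concurrently, can be sketched as below. This is a minimal sketch and not the dissertation's implementation. The names schedule_waves, conflicts, apply_request, and execute, the request dictionaries, and the key-based conflict rule are all assumptions introduced for illustration. Because every replica would derive the same waves from the same ordered input, concurrent execution within a wave should yield the same state as sequential execution.

```python
from concurrent.futures import ThreadPoolExecutor


def conflicts(a, b):
    # Hypothetical application-level rule: two requests conflict when they
    # touch the same key and at least one of them is a write.
    return a["key"] == b["key"] and "write" in (a["op"], b["op"])


def schedule_waves(requests, conflicts):
    # Place each request in the earliest wave that comes after every earlier
    # request it conflicts with, preserving the agreed order between
    # conflicting requests.
    waves, wave_of = [], []
    for i, req in enumerate(requests):
        level = 0
        for j in range(i):
            if conflicts(requests[j], req):
                level = max(level, wave_of[j] + 1)
        wave_of.append(level)
        if level == len(waves):
            waves.append([])
        waves[level].append(req)
    return waves


def apply_request(store, req):
    # Toy key-value state machine standing in for the replicated service.
    if req["op"] == "write":
        store[req["key"]] = req["value"]
        return "ok"
    return store.get(req["key"])


def execute(requests, store, workers=4):
    # Run waves one after another; requests inside a wave are independent,
    # so they are handed to a thread pool together.
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for wave in schedule_waves(requests, conflicts):
            outputs = pool.map(lambda r: apply_request(store, r), wave)
            for req, out in zip(wave, outputs):
                results[req["id"]] = out
    return results


if __name__ == "__main__":
    ordered = [
        {"id": 1, "op": "write", "key": "x", "value": 1},
        {"id": 2, "op": "write", "key": "y", "value": 2},
        {"id": 3, "op": "read", "key": "x"},
    ]
    print(execute(ordered, {}))
```

In this sketch the two writes touch different keys and share the first wave, while the read of x waits for the write to x. The conflict rule is the only application-specific part; a more conservative rule (for example, treating all requests as conflicting) degrades to ordinary sequential execution.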

Files in this work

Download File: kotlar12231.pdf
Size: 932.7Kb
Format: application/pdf
