Models of streaming computation
Access full-text files
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Streaming algorithms, which process very large datasets received one update at a time, are a key tool in large-scale computing. This thesis studies the theory of streaming algorithms, and in particular the implications of how we model streaming algorithms. In constructing such a model, two questions arise: • What kinds of stream must the algorithm handle? • What is the algorithm allowed to do?
We describe new streaming algorithms and prove new lower bounds in various streaming models, illustrating how the answers to these questions change which kinds of streaming problem are tractable. These include: • New algorithms for counting triangles in the insertion-only and linear sketching models, along with tight (up to log factors) lower bounds. • A quantum streaming algorithm for counting triangles, giving the first quantum advantage for a natural one-pass streaming problem. • A complete characterization of the linear sketching complexity of subgraph counting problems in constant-degree graphs. • An exponential separation between linear sketching and turnstile streaming under natural restrictions on the stream, complementing a prior series of work [Gan08, LNW14, AHLW16] that establishes equivalences between these models in the absence of such restrictions. • A new model of random-order streaming in which some correlations are allowed between edge arrival times, along with tight (up to polylog factors) upper and lower bounds for the problem of finding components in this model.
A common theme in many of these results is the connection between sampling algorithms and lower bounds derived from the techniques of Boolean Fourier analysis. We give new methods for extending these techniques to settings such as linear sketching and random-order streaming.