Deterministic Independent Sets in the Semi-Streaming Model
Daniel Ye
TL;DR
The paper proves a near-tight lower bound for deterministic single-pass semi-streaming algorithms for the independent set problem: on graphs with maximum degree $\Delta$, no deterministic algorithm using $\tilde O(n)$ bits of memory can guarantee an independent set larger than $\tilde O\left(\frac{n}{\Delta^2}\right)$. This matches the simple deterministic upper bound of $\tilde\Omega\left(\frac{n}{\Delta^2}\right)$ up to polylogarithmic factors and establishes a strong separation from randomized algorithms, which achieve size $\frac{n}{\Delta+1}$. The paper develops a multi-party communication framework and a missing-graph compression lemma, then introduces a Turán-type adversary and clique-removal techniques to bound the information a deterministic algorithm can exploit. The core technical tools are the removal of large cliques in the missing graph through low-degree subgraphs, a careful Split decomposition for general graphs, and a structured adversary that keeps the input hard to summarize. Together, these methods delineate the limits of derandomization for independent sets in the semi-streaming model.
Abstract
We consider the independent set problem in the semi-streaming model. For any input graph $G=(V, E)$ with $n$ vertices, an independent set is a set of vertices with no edges between any two elements. In the semi-streaming model, $G$ is presented as a stream of edges and any algorithm must use $\tilde O(n)$ bits of memory to output a large independent set at the end of the stream. Prior work has designed various semi-streaming algorithms for finding independent sets. Due to the hardness of finding maximum and maximal independent sets in the semi-streaming model, the focus has primarily been on finding independent sets in terms of certain parameters, such as the maximum degree $\Delta$. In particular, there is a simple randomized algorithm that obtains independent sets of size $\frac{n}{\Delta+1}$ in expectation, which can also be achieved with high probability using more complicated algorithms. For deterministic algorithms, the best bounds are significantly weaker. In fact, the best we currently know is a straightforward algorithm that finds an $\tilde\Omega\left(\frac{n}{\Delta^2}\right)$ size independent set. We show that this straightforward algorithm is nearly optimal by proving that any deterministic semi-streaming algorithm can only output an $\tilde O\left(\frac{n}{\Delta^2}\right)$ size independent set. Our result proves a strong separation between the power of deterministic and randomized semi-streaming algorithms for the independent set problem.
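The abstract does not spell out the two upper-bound algorithms it contrasts, so the following Python sketch is only an illustration under stated assumptions. For the randomized bound it uses the standard random-permutation argument (a vertex survives if it precedes all of its neighbors, giving expected size at least $\frac{n}{\Delta+1}$). For the deterministic bound it uses one natural approach: store every edge among the first $\lfloor n/\Delta \rfloor$ vertices (at most about $n/2$ edges) and run greedy on that subgraph, giving size at least $\frac{n}{\Delta(\Delta+1)}$. The function names, the assumption that $\Delta$ is known in advance, and the choice of the first vertices as the sample are all hypothetical and may differ from the algorithms the paper has in mind.

```python
import random
from typing import Iterable, List, Optional, Tuple

Edge = Tuple[int, int]


def randomized_independent_set(
    n: int, edges: Iterable[Edge], seed: Optional[int] = None
) -> List[int]:
    """One pass, O(n log n) bits: keep a vertex iff it precedes all its neighbors."""
    rng = random.Random(seed)
    rank = list(range(n))
    rng.shuffle(rank)  # rank[v] = position of v in a uniformly random permutation
    alive = [True] * n
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        # The endpoint that comes later in the permutation has an earlier neighbor.
        later = u if rank[u] > rank[v] else v
        alive[later] = False
    return [v for v in range(n) if alive[v]]


def deterministic_independent_set(
    n: int, max_degree: int, edges: Iterable[Edge]
) -> List[int]:
    """One pass: store the subgraph induced on the first ~n/max_degree vertices.

    Assumes max_degree is known before the stream starts (an assumption of
    this sketch). The stored subgraph has at most k * max_degree / 2 edges.
    """
    k = min(n, max(1, n // max(1, max_degree)))
    adj = [set() for _ in range(k)]
    for u, v in edges:
        if u != v and u < k and v < k:
            adj[u].add(v)
            adj[v].add(u)
    # Greedy on the stored subgraph: pick a vertex, block its neighbors.
    chosen: List[int] = []
    blocked = [False] * k
    for v in range(k):
        if not blocked[v]:
            chosen.append(v)
            for w in adj[v]:
                blocked[w] = True
    return chosen


def is_independent(vertices: Iterable[int], edges: Iterable[Edge]) -> bool:
    """Check that no edge has both endpoints in the given vertex set."""
    s = set(vertices)
    return not any(u in s and v in s for u, v in edges)


if __name__ == "__main__":
    n = 12
    cycle = [(i, (i + 1) % n) for i in range(n)]  # 12-cycle, maximum degree 2
    print(randomized_independent_set(n, cycle, seed=0))
    print(deterministic_independent_set(n, 2, cycle))
```

Both functions read each edge once and keep only per-vertex state or a small induced subgraph, which is what lets them fit in $\tilde O(n)$ memory. The gap between the two guarantees, a factor of roughly $\Delta$, is the separation that the lower bound shows to be unavoidable for deterministic algorithms.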
