Polynomial Pass Semi-Streaming Lower Bounds for K-Cores and Degeneracy

Sepehr Assadi, Prantar Ghosh, Bruno Loff, Parth Mittal, Sagnik Mukhopadhyay

TL;DR

The paper proves polynomial-pass lower bounds for exactly computing $k$-cores and graph degeneracy in the semi-streaming model. These are the first natural examples of "not too hard" problems, solvable with near-linear communication, that nonetheless need polynomially many passes. The authors reduce from MultiHPC, a generalization of hidden pointer chasing (HPC), to degeneracy, and combine the reduction with optimal round-communication lower bounds for HPC and MultiHPC. The result is that any $p$-pass streaming algorithm needs $\tilde{\Omega}(n^2/p^3)$ space, and hence that any semi-streaming algorithm needs $\tilde{\Omega}(n^{1/3})$ passes. They also give peeling-based two-party protocols with near-linear communication for degeneracy and $k$-cores, which is what qualifies these problems as "not too hard". A central technical step is to replace total variation (TV) distance with positive triangular discrimination in the pointer-chasing style argument, which closes the gap left by the earlier HPC lower bound and improves the resulting pass lower bound from $n^{1/5}$ to $n^{1/3}$. Because existing reductions from HPC also work with MultiHPC, the same machinery can strengthen pass lower bounds for other "not too hard" problems.

Abstract

The following question arises naturally in the study of graph streaming algorithms: "Is there any graph problem which is 'not too hard', in that it can be solved efficiently with total communication (nearly) linear in the number $n$ of vertices, and for which, nonetheless, any streaming algorithm with $\tilde{O}(n)$ space (i.e., a semi-streaming algorithm) needs a polynomial $n^{\Omega(1)}$ number of passes?" Assadi, Chen, and Khanna [STOC 2019] were the first to prove that this is indeed the case. However, the lower bounds that they obtained are for rather non-standard graph problems. Our first main contribution is to present the first polynomial-pass lower bounds for natural "not too hard" graph problems studied previously in the streaming model: $k$-cores and degeneracy. We devise a novel communication protocol for both problems with near-linear communication, thus showing that $k$-cores and degeneracy are natural examples of "not too hard" problems. Indeed, previous work has developed single-pass semi-streaming algorithms for approximating these problems. In contrast, we prove that any semi-streaming algorithm for exactly solving these problems requires (almost) $\Omega(n^{1/3})$ passes. Our second main contribution is improved round-communication lower bounds for the underlying communication problems at the basis of these reductions:

  • We improve the previous lower bound of Assadi, Chen, and Khanna for hidden pointer chasing (HPC) to achieve optimal bounds.
  • We observe that all current reductions from HPC can also work with a generalized version of this problem that we call MultiHPC, and prove an even stronger and optimal lower bound for this generalization.

These two results collectively allow us to improve the resulting pass lower bounds for semi-streaming algorithms by a polynomial factor, namely, from $n^{1/5}$ to $n^{1/3}$ passes.
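
For readers unfamiliar with the two problems, the sketch below shows the standard offline peeling procedure: repeatedly delete a minimum-degree vertex, and record the largest degree seen at deletion time. It is only an illustration of what "degeneracy" and "$k$-core" mean on a graph held fully in memory. It is not the paper's communication protocol and not a streaming algorithm, and the function names `core_numbers`, `degeneracy`, and `k_core` are hypothetical.

```python
import heapq
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} by min-degree peeling (offline, in memory)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    removed = set()
    core = {}
    current = 0  # largest minimum degree seen so far
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue  # stale heap entry, skip it
        current = max(current, d)
        core[v] = current
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
    return core


def degeneracy(edges):
    """Largest core number, i.e. the max over the peeling of the min degree."""
    return max(core_numbers(edges).values(), default=0)


def k_core(edges, k):
    """Vertices of the k-core: those with core number at least k."""
    return {v for v, c in core_numbers(edges).items() if c >= k}


# A triangle {1, 2, 3} with a pendant vertex 4 attached to 3.
example = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(degeneracy(example))
print(sorted(k_core(example, 2)))
```

On the example graph, the pendant vertex is peeled first at degree 1 and the triangle is peeled at degree 2, so the degeneracy is 2 and the 2-core is the triangle. The lower bounds in the paper concern computing exactly these quantities when the edges arrive as a stream and only $\tilde{O}(n)$ space is available.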

Paper Structure (41 sections, 35 theorems, 85 equations, 6 figures)

Key Result

corollary 1.1

For any integer $p \geqslant 1$, any $p$-pass streaming algorithm for the following problems on $n$-vertex graphs requires $\tilde{\Omega}(n^2/p^3)$ space. In particular, any semi-streaming algorithm for these problems requires $\tilde{\Omega}(n^{1/3})$ passes.
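
The step from the space bound to the pass bound is a one-line calculation, sketched here under the assumption that the corollary's lower-bound notation hides the same lower-order factors as $\tilde{O}(\cdot)$. Write $s$ for the space of the algorithm (a symbol introduced only for this illustration). A semi-streaming algorithm has $s = \tilde{O}(n)$, so the space bound forces

$$
\tilde{O}(n) \;\geqslant\; s \;\geqslant\; \tilde{\Omega}\!\left(\frac{n^2}{p^3}\right)
\quad\Longrightarrow\quad
p^3 \;\geqslant\; \tilde{\Omega}(n)
\quad\Longrightarrow\quad
p \;\geqslant\; \tilde{\Omega}\!\left(n^{1/3}\right).
$$

At the other extreme, $p = 1$ gives a space bound of $\tilde{\Omega}(n^2)$, which matches, up to lower-order factors, the cost of storing the whole graph.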

Figures (6)

  • Figure 1: The gadget graph $G$ for the following instance of $\textnormal{BMHPC}_{3,3}$. ${\mathbf A}^1 = \{A^1_{x_1} = \{y_1,y_2\}, A^1_{x_2} = \{y_1,y_2\}, A^1_{x_3} = \{y_2\}\}$. ${\mathbf B}^1 = \{B^1_{x_1} = \{y_2, y_3\}, B^1_{x_2} = \{y_1\}, B^1_{x_3} = \{y_2, y_3\}\}$. ${\mathbf C}^2 = \{C^2_{y_1} = \{x_1\}, C^2_{y_2} = \{x_1,x_2\}, C^2_{y_3} = \{x_3\}\}$. ${\mathbf D}^2 = \{D^2_{y_1} = \{x_1,x_2\}, D^2_{y_2} = \{x_2,x_3\}, D^2_{y_3} = \{x_2,x_3\}\}$. ${\mathbf A}^3 = \{A^3_{x_1} = \{y_2\}, A^3_{x_2} = \{y_2,y_3\}, A^3_{x_3} = \{y_1\}\}$. ${\mathbf B}^3 = \{B^3_{x_1} = \{y_1, y_2\}, B^3_{x_2} = \{y_3\}, B^3_{x_3} = \{y_1, y_3\}\}$ (we skip ${\mathbf C}^1,{\mathbf D}^1,{\mathbf A}^2,{\mathbf B}^2,{\mathbf C}^3,{\mathbf D}^3$ because they are redundant). Hence, $z_0=x_1$; $z_1=t^1_{x_1}=y_2$; $z_2=t^2_{y_2}=x_2$; and $z_3=t^3_{x_2}=y_3$. Therefore, the answer to this instance is $b(y_3)=1$. Here, a double line between two sets of nodes (e.g., the one between sets $S$ and $L_0$) signifies that all possible cross edges are present between those two sets. A single line between two sets (e.g., the one between $V_{\text{aux}}$ and $L_0$) denotes that some edges may be present between them. A dashed line (e.g., the one between sets $S$ and $Q$) denotes that absolutely no edge is present between them. For each triple, the number (in red) inside the box containing it denotes the degree of each node in the triple. The red and blue edges encode the inputs. The thick edges are between triples representing $z_i$ and $z_{i+1}$.
  • Figure 2: The status of the graph just before iteration $4$. $Z_0$ was deleted in the last three iterations, and hence crossed out. The dotted lines denote edges that were deleted as a result. A double dotted-line between two sets of nodes signifies that originally all cross edges were present between them. Next to a node, its current degree is given. If all nodes in a set have the same degree, the value is given next to the set.
  • Figure 3: The status of the graph just before iteration $10$. $Z_2$ was deleted in the last three iterations, and hence crossed out. The dotted lines denote edges that were deleted as a result. A double dotted-line between two sets of nodes signifies that originally all cross edges were present between them. Next to a node, its current degree is given. If all nodes in a set have the same degree, the value is given next to the set.
  • Figure 4: The status of the graph just before iteration $7$. $Z_1$ was deleted in the last three iterations, and hence crossed out. The dotted lines denote edges that were deleted as a result. A double dotted-line between two sets of nodes signifies that originally all cross edges were present between them. Next to a node, its current degree is given. If all nodes in a set have the same degree, the value is given next to the set.
  • Figure 5: The graph after iteration $6r+3$ for the case $b(z_r)=0$. A double line between two sets of nodes signifies that all possible cross edges are present between them. A dashed line denotes that no edge is present between them.
  • ...and 1 more figure
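
The pointer sequence $z_0, z_1, z_2, z_3$ in the Figure 1 caption can be retraced mechanically. The sketch below copies the sets from the caption and assumes, as in hidden pointer chasing, that each pointer $t^i_z$ is the unique element common to the two sets indexed by $z$ in layer $i$. That rule is an assumption of this illustration (the caption gives only the resulting pointers), although every pair of sets in the caption does intersect in exactly one element. The helper names `pointer` and `chase` are hypothetical, and the final bit $b(y_3)$ is not computed because the caption does not define $b$.

```python
# Sets copied from the Figure 1 caption.
# Layers 1 and 3 map each x to subsets of Y; layer 2 maps each y to subsets of X.
A1 = {"x1": {"y1", "y2"}, "x2": {"y1", "y2"}, "x3": {"y2"}}
B1 = {"x1": {"y2", "y3"}, "x2": {"y1"}, "x3": {"y2", "y3"}}
C2 = {"y1": {"x1"}, "y2": {"x1", "x2"}, "y3": {"x3"}}
D2 = {"y1": {"x1", "x2"}, "y2": {"x2", "x3"}, "y3": {"x2", "x3"}}
A3 = {"x1": {"y2"}, "x2": {"y2", "y3"}, "x3": {"y1"}}
B3 = {"x1": {"y1", "y2"}, "x2": {"y3"}, "x3": {"y1", "y3"}}


def pointer(left, right, z):
    """Assumed rule: the pointer out of z is the unique common element."""
    common = left[z] & right[z]
    if len(common) != 1:
        raise ValueError(f"sets for {z!r} do not intersect in exactly one element")
    (target,) = common
    return target


def chase(layers, start):
    """Follow one pointer per layer, starting from `start`."""
    path = [start]
    for left, right in layers:
        path.append(pointer(left, right, path[-1]))
    return path


path = chase([(A1, B1), (C2, D2), (A3, B3)], "x1")
print(path)
```

The trace visits `x1`, `y2`, `x2`, `y3`, which agrees with the values of $z_0$ through $z_3$ stated in the caption. The difficulty in the communication problem comes from the fact that the two sets defining each pointer are held by different players, so no single player can follow the chain alone.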

Theorems & Definitions (87)

  • corollary 1.1
  • theorem 2.1
  • theorem 2.2
  • definition 3.1
  • proposition 3.2
  • definition 3.3
  • definition 3.4
  • definition 3.5: Round of $\textnormal{HPC}_{m,r}$
  • definition 3.6
  • definition 3.7: Positive triangular discrimination
  • ...and 77 more