Towards Scalable and Practical Batch-Dynamic Connectivity

Quinten De Man, Laxman Dhulipala, Adam Karczmarz, Jakub Łącki, Julian Shun, Zhongqi Wang

TL;DR

This work gives the first parallel algorithm for the problem of dynamically maintaining the connected components of an undirected graph subject to edge insertions and deletions that is work-efficient, supports batches of updates, runs in polylogarithmic depth, and uses only linear total space.

Abstract

We study the problem of dynamically maintaining the connected components of an undirected graph subject to edge insertions and deletions. We give the first parallel algorithm for the problem that is work-efficient, supports batches of updates, runs in polylogarithmic depth, and uses only linear total space. Existing algorithms for the problem either use super-linear space, do not come with strong theoretical bounds, or are not parallel. On the empirical side, we provide the first implementation of the cluster forest algorithm, the first linear-space algorithm for dynamic connectivity with polylogarithmic update time. Experimentally, we find that our algorithm uses up to 19.7x less space and is up to 6.2x faster than the level-set algorithm of HDT, arguably the most widely implemented dynamic connectivity algorithm with strong theoretical guarantees.

Paper Structure

This paper contains 34 sections, 41 theorems, 2 equations, 7 figures, 3 tables, 3 algorithms.

Key Result

lemma 1

Suppose the blocked invariant holds for the cluster graph $CG(c)$ of a level $i$ cluster $c$. Let $M$ be the size of a maximum matching in $CG(c)$ that uses only blocked edges. Then $M \leq 1$.
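
To make the statement concrete: $M \leq 1$ means that no two blocked edges of $CG(c)$ are vertex-disjoint. The sketch below is only an illustration of that condition, not the paper's proof or data structure. It computes the maximum matching size by brute force on tiny edge lists; the function name `max_matching_size` and both example edge lists are hypothetical, with the first loosely modeled on the star structure described for Figure 4.

```python
from itertools import combinations

def max_matching_size(edges):
    """Size of a maximum matching, by brute force (tiny graphs only)."""
    edges = list(edges)
    # Try the largest subsets first; a subset is a matching when no
    # endpoint appears in more than one of its edges.
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            endpoints = [x for edge in subset for x in edge]
            if len(endpoints) == len(set(endpoints)):
                return k
    return 0

# Hypothetical blocked edges forming a star around a center "C3":
# every pair of edges shares the center, so at most one can be matched.
star_blocked = [("C3", "C1"), ("C3", "C2"), ("C3", "C4")]

# Hypothetical counterexample: two vertex-disjoint blocked edges,
# which would give M = 2 and so cannot occur if the lemma applies.
two_disjoint = [("C1", "C2"), ("C3", "C4")]

print(max_matching_size(star_blocked))
print(max_matching_size(two_disjoint))
```

A star satisfies the bound because all of its edges share the center; the second edge list shows the kind of configuration the lemma rules out.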

Figures (7)

  • Figure 1: The core data structures used by the cluster forest (CF) and HDT algorithms. The input graph is shown in (a). The cluster forest is given in (b), and represents the nested hierarchy of connected components. The leaves (level 0 nodes) are the original vertices. The internal nodes (level $> 0$) are the nested components, colored by level. (c) shows the cluster graph of the level 2 component $F$. The nodes in the cluster graph are the children of $F$, and the edges in the cluster graph are the level $2$ edges that are incident to vertices that $F$ contains (the level $2$ edges go between level $1$ components). Lastly, (d) shows the same component hierarchy as (b), but as stored by the HDT algorithm. The HDT algorithm stores a separate Euler Tour Tree (ETT) for every component in the hierarchy. Each tree edge is stored twice (illustrated as smaller white circles), once per direction.
  • Figure 2: (1): The cluster graph of the level $i$ cluster $P$ containing a deleted level $i$ edge $(u,v)$; the vertices of the cluster graph are the level $(i-1)$ child clusters of $P$. (2): Deleting $(u,v)$ may disconnect $CG(P)$, so $\mathsf{search}(C_u)$ and $\mathsf{search}(C_v)$ are run to check if $C_u$ and $C_v$ are still connected using level $i$ edges in $CG(P)$. Green edges are edges explored during the search. The clusters explored by the smaller search must have total size $\leq 2^{i-1}$, and can be merged into a single level $(i-1)$ cluster. (3): In this example, $CG(P)$ remains connected, $\{C_u,C_a,C_b\}$ are merged, and the explored level $i$ edges of the smaller search are pushed to level $(i-1)$.
  • Figure 3: The three possible cases for what the cluster graph of a level $i$ cluster in a blocked cluster-forest can look like. Red edges are blocked edges, and blue edges are unblocked edges.
  • Figure 4: Illustration of the cluster graph for a node in the blocked cluster-forest, and its star structure. The center node of $CG(P)$ is $C_3$, and all other nodes are satellites connected to the center through a blocked edge.
  • Figure 5: A cluster graph before and after the center cluster $P$ is split. The edges incident to $P$ are now split between $P_1$ and $P_2$. To restore the blocked invariant after splitting the center, the algorithm carefully fetches edges out of each satellite cluster.
  • ...and 2 more figures
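
The two-sided search described in the Figure 2 caption can be sketched as two breadth-first searches advanced in alternation, so that the work done is governed by whichever search stops first. This is a simplified illustration under stated assumptions, not the paper's implementation: the adjacency-dictionary representation, the function name `interleaved_search`, and the one-node-per-turn schedule are all hypothetical, and the sketch omits cluster sizes, merging, and pushing edges to level $(i-1)$.

```python
from collections import deque

def interleaved_search(adj, start_u, start_v):
    """Alternate BFS steps from start_u and start_v over a cluster graph.

    adj maps each node to an iterable of its neighbors (assumed format).
    Returns (connected, explored), where explored is the set of nodes
    visited by the search that terminated first.
    """
    if start_u == start_v:
        return True, {start_u}
    seen = [{start_u}, {start_v}]
    queues = [deque([start_u]), deque([start_v])]
    while True:
        for side in (0, 1):
            if not queues[side]:
                # This side exhausted its component without reaching
                # the other side, so the two starts are disconnected.
                return False, seen[side]
            node = queues[side].popleft()
            for nbr in adj.get(node, ()):
                if nbr in seen[1 - side]:
                    # The two searches met: still connected.
                    return True, seen[side]
                if nbr not in seen[side]:
                    seen[side].add(nbr)
                    queues[side].append(nbr)

# Hypothetical cluster graph after deleting the edge between Cu and Cv,
# where a path Cu - Ca - Cb - Cv still exists.
still_connected = {
    "Cu": ["Ca"], "Ca": ["Cu", "Cb"], "Cb": ["Ca", "Cv"],
    "Cv": ["Cb", "Cc"], "Cc": ["Cv"],
}

# Hypothetical cluster graph that the deletion splits in two.
split = {
    "Cu": ["Ca"], "Ca": ["Cu"],
    "Cv": ["Cb"], "Cb": ["Cv", "Cc"], "Cc": ["Cb", "Cd"], "Cd": ["Cc"],
}

print(interleaved_search(still_connected, "Cu", "Cv")[0])
print(interleaved_search(split, "Cu", "Cv"))
```

Alternating the two searches means neither side runs far ahead of the other, which is the property the caption relies on when it bounds the size of what the smaller search explores.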

Theorems & Definitions (42)

  • definition 1: Blocked Edge
  • lemma 1
  • lemma 2
  • lemma 3
  • lemma 4
  • lemma 5
  • lemma 6
  • lemma 7
  • lemma 8
  • Theorem 5.2
  • ...and 32 more