Connected Components in Linear Work and Near-Optimal Time

Alireza Farhadi; S. Cliff Liu; Elaine Shi

TL;DR

The paper addresses the fundamental problem of computing the connected components of large graphs on a parallel RAM (PRAM), aiming for sub-logarithmic parallel time while keeping total work linear in the input size. It introduces a multi-stage framework that contracts the graph, densifies it to raise the minimum degree, and samples edges while preserving component connectivity; the running time depends on the spectral gap $\lambda$ of the normalized Laplacian. The main contributions are a PRAM algorithm achieving $O(\log(1/\lambda) + \log\log n)$ time and $O(m+n)$ work with high probability (with no prior knowledge of $\lambda$), plus a conditional lower bound of $\Omega(\log(1/\lambda))$ under the 2-Cycle Conjecture for PRAMs with $O(m+n)$ total memory when $\lambda \le (1/\log n)^c$. The results connect and extend the MPC and PRAM connectivity literature, showing that graphs with well-connected components admit near-optimal sub-logarithmic parallel time with linear work, and they provide techniques for graph densification and edge sampling that preserve spectral properties. The work has implications for large-scale graph analytics where both parallel time and total resource usage are critical, and it advances the understanding of when sub-logarithmic-time parallel connectivity is achievable in classical PRAM models.
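To make the parameter $\lambda$ concrete, here is a minimal sketch that estimates the spectral gap of a small connected graph. It assumes the standard definition (the second-smallest eigenvalue of the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$), which may differ in detail from the paper's Definition 2.2. The function name `spectral_gap` and the use of deflated power iteration are illustrative choices, not anything taken from the paper.

```python
import math
import random

def spectral_gap(n, edges, iters=2000, seed=0):
    """Estimate the second-smallest eigenvalue of the normalized Laplacian.

    Assumes a simple, connected, undirected graph on vertices 0..n-1 in which
    every vertex has degree >= 1. Uses power iteration on M = 2I - L after
    projecting out the known eigenvector of L for eigenvalue 0.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(a) for a in adj]

    # Eigenvector of L for eigenvalue 0 has entries sqrt(deg(v)); normalize it.
    top = [math.sqrt(d) for d in deg]
    top_norm = math.sqrt(sum(x * x for x in top))
    top = [x / top_norm for x in top]

    def apply_m(x):
        # M = 2I - L = I + D^{-1/2} A D^{-1/2}
        return [x[v] + sum(x[u] / math.sqrt(deg[u] * deg[v]) for u in adj[v])
                for v in range(n)]

    def deflate(x):
        dot = sum(a * b for a, b in zip(x, top))
        return [a - dot * b for a, b in zip(x, top)]

    def normalize(x):
        s = math.sqrt(sum(a * a for a in x))
        return [a / s for a in x]

    rng = random.Random(seed)
    x = normalize(deflate([rng.uniform(-1.0, 1.0) for _ in range(n)]))
    for _ in range(iters):
        x = normalize(deflate(apply_m(x)))

    # The Rayleigh quotient of M approximates 2 - lambda_2.
    rayleigh = sum(a * b for a, b in zip(x, apply_m(x)))
    return 2.0 - rayleigh

def cycle_edges(n):
    return [(i, (i + 1) % n) for i in range(n)]

def complete_edges(n):
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

gap_cycle = spectral_gap(16, cycle_edges(16))        # poorly connected
gap_complete = spectral_gap(16, complete_edges(16))  # well connected
```

On these two inputs the cycle has a much smaller gap than the complete graph, which is the sense in which a larger $\lambda$ corresponds to a better-connected component.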

Abstract

Computing the connected components of a graph is a fundamental problem in algorithmic graph theory. A major question in this area is whether we can compute connected components in $o(\log n)$ parallel time. Recent works showed an affirmative answer in the Massively Parallel Computation (MPC) model for a wide class of graphs. Specifically, Behnezhad et al. (FOCS'19) showed that connected components can be computed in $O(\log d + \log \log n)$ rounds in the MPC model, where $d$ is the largest diameter of any connected component. More recently, Liu et al. (SPAA'20) showed that the same result can be achieved in the standard PRAM model, but their result incurs $\Theta((m+n) \cdot (\log d + \log \log n))$ work, which is sub-optimal. In this paper, we show that for graphs that contain well-connected components, we can compute connected components on a PRAM in sub-logarithmic parallel time with optimal, i.e., $O(m+n)$, total work. Specifically, our algorithm achieves $O(\log(1/\lambda) + \log \log n)$ parallel time with high probability, where $\lambda$ is the minimum spectral gap of any connected component in the input graph. The algorithm requires no prior knowledge of $\lambda$. Additionally, based on the 2-Cycle Conjecture, we provide a time lower bound of $\Omega(\log(1/\lambda))$ for solving connected components on a PRAM with $O(m+n)$ total memory when $\lambda \le (1/\log n)^c$, giving conditional optimality to the running time of our algorithm as a function of $\lambda$.
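As a rough worked illustration of how the bound behaves, the cases below plug three regimes of $\lambda$ into $O(\log(1/\lambda) + \log\log n)$. The cycle example relies on the standard fact that the normalized Laplacian of the cycle $C_n$ has eigenvalues $1 - \cos(2\pi k/n)$; the examples are ours and are not taken from the paper.

```latex
\begin{align*}
% Constant spectral gap (expander-like components): sub-logarithmic time.
\lambda = \Omega(1)
  &\;\Rightarrow\; \log(1/\lambda) + \log\log n = O(\log\log n), \\
% Cycle C_n: the gap is polynomially small, so there is no gain over O(log n).
\lambda = 1 - \cos(2\pi/n) = \Theta(1/n^2)
  &\;\Rightarrow\; \log(1/\lambda) + \log\log n = \Theta(\log n), \\
% Lower-bound regime: the log(1/lambda) term dominates.
\lambda \le (1/\log n)^c
  &\;\Rightarrow\; \log\log n \le \tfrac{1}{c}\log(1/\lambda) \\
  &\;\Rightarrow\; \log(1/\lambda) + \log\log n \le \left(1 + \tfrac{1}{c}\right)\log(1/\lambda).
\end{align*}
```

The last case shows why the conditional lower bound $\Omega(\log(1/\lambda))$ matches the upper bound up to a constant factor in that regime: once $\lambda \le (1/\log n)^c$, the $\log\log n$ term is absorbed by $\log(1/\lambda)$.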

Paper Structure (59 sections, 83 theorems, 71 equations)

Introduction
Our Results and Contributions
Additional Related Work
Preliminaries
Definitions and Notations
Notations for undirected graph.
Labeled digraph.
Additional notations.
Contraction algorithms.
Preliminaries on Spectral Graph Theory
Previous Results
Technical Overview
Stage $1$: Contract the Graph to Reduce Vertices
Stage $2$: Increase the Minimum Degree
Creating a skeleton graph.
...and 44 more sections

Key Result

Theorem 1

There is an ARBITRARY CRCW PRAM algorithm that computes connectivity in $O(\log (1 / \lambda) + \log \log n)$ time and $O(m + n)$ work with high probability, where $\lambda$ is the minimum spectral gap of any connected component in the input graph. Our algorithm requires no prior knowledge of $\lambda$.
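For contrast with the bound in Theorem 1, the sketch below simulates a naive synchronous min-label propagation and counts its rounds. This is explicitly not the paper's algorithm: it is a simple baseline, with the hypothetical name `min_label_rounds`, whose round count grows with the diameter and whose total work is roughly $(m+n)$ per round rather than $O(m+n)$ overall.

```python
def min_label_rounds(n, edges):
    """Naive baseline: every vertex repeatedly adopts the minimum label in its
    closed neighborhood. Returns (labels, rounds), where rounds counts the
    synchronous rounds in which at least one label changed.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    labels = list(range(n))  # each vertex starts in its own component
    rounds = 0
    while True:
        # One synchronous round: all vertices read the old labels at once.
        new_labels = [min([labels[v]] + [labels[u] for u in adj[v]])
                      for v in range(n)]
        if new_labels == labels:
            return labels, rounds
        labels = new_labels
        rounds += 1

# A path on 8 vertices (large diameter) versus a complete graph on 8 vertices.
path = [(i, i + 1) for i in range(7)]
clique = [(i, j) for i in range(8) for j in range(i + 1, 8)]

path_labels, path_rounds = min_label_rounds(8, path)
clique_labels, clique_rounds = min_label_rounds(8, clique)

# Two components: a path 0-1-2 and an edge 3-4.
two_labels, two_rounds = min_label_rounds(5, [(0, 1), (1, 2), (3, 4)])
```

Vertices end up with equal labels exactly when they lie in the same component, which is the output format connectivity algorithms typically produce. The gap in round counts between the path and the clique hints at why more careful contraction and densification steps are needed to reach sub-logarithmic time with linear work.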

Theorems & Definitions (170)

Theorem 1
Definition 2.1: Normalized Laplacian matrix
Definition 2.2: Spectral gap
Definition 2.3: Conductance
Theorem 2: Liu et al. (SPAA'20)
Definition 4.1
Lemma 4.2: Goodrich (FOCS'91)
Lemma 4.3
proof
Lemma 4.4
...and 160 more
