Sublinear-Time Sampling of Spanning Trees in the Congested Clique

Sriram V. Pemmaraju, Sourya Roy, Joshua Z. Sobel

TL;DR

The paper presents the first sublinear-round algorithm for approximately sampling a uniform spanning tree in the Congested Clique, achieving $\tilde{O}(n^{1/2+\alpha})$ rounds to within total variation $\epsilon=\Omega(1/n^{c})$ (for any fixed $c>0$) by combining a top-down random-walk filling strategy with Schur-complement shortcut graphs and a compression-based reconstruction via weighted perfect matchings. It leverages distributed matrix multiplication (with exponent $\alpha$ currently $0.157$) to build and manipulate the derivative graphs, enabling $O(\sqrt{n})$ phases each costing $\tilde{O}(n^{\alpha})$ rounds, and thus yields a sublinear-time sampler. The authors also show how to achieve shorter walks efficiently, obtaining $O(\log^3 n)$-round sampling for graphs with cover times $O(n \log n)$, and provide a pathway to exact sampling at a higher but still sublinear runtime $\tilde{O}(n^{2/3+\alpha})$. The work connects classic Aldous–Broder random-walk ideas with modern distributed-graph techniques (Schur complements, shortcut graphs, and permanent-based sampling), offering new avenues for distributed sampling of combinatorial objects and potentially impacting related problems in MPC and other distributed models.
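The top-down filling idea can be illustrated with a small sequential sketch. It is not the paper's distributed algorithm: it omits the Schur-complement shortcutting, the bound on distinct vertices, and the matching-based reconstruction. The function names (`mat_mul`, `top_down_walk`) and the dense list-of-lists transition matrix `P` are assumptions made for illustration. The sketch samples the endpoint of a walk of length $2^k$ from the matrix power $P^{2^k}$, then repeatedly inserts a midpoint $m$ between consecutive vertices $p, q$ with probability proportional to $P^{h}[p][m] \cdot P^{h}[m][q]$, where $h$ is half the current gap.

```python
import random

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def top_down_walk(P, start, log_len, rng):
    """Sample a random walk of 2**log_len steps by top-down midpoint filling.

    P is a row-stochastic transition matrix (hypothetical dense representation).
    """
    # powers[k] holds P^(2^k), built by repeated squaring.
    powers = [P]
    for _ in range(log_len):
        powers.append(mat_mul(powers[-1], powers[-1]))
    n = len(P)
    # First fix the endpoint of the whole walk.
    end = rng.choices(range(n), weights=powers[log_len][start])[0]
    walk = [start, end]
    # Then halve the gap between consecutive vertices until it is one step.
    for k in range(log_len - 1, -1, -1):
        half = powers[k]
        filled = [walk[0]]
        for p, q in zip(walk, walk[1:]):
            weights = [half[p][m] * half[m][q] for m in range(n)]
            mid = rng.choices(range(n), weights=weights)[0]
            filled += [mid, q]
        walk = filled
    return walk

# Example: simple random walk on the path 0 - 1 - 2.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
walk = top_down_walk(P, start=0, log_len=3, rng=random.Random(1))
```

By the Markov property, the midpoint of a walk conditioned on its two endpoints has exactly the distribution used above, so the filled walk is distributed as an ordinary $2^k$-step random walk from `start`.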

Abstract

We present the first sublinear-in-$n$ round algorithm for sampling an approximately uniform spanning tree of an $n$-vertex graph in the CongestedClique model of distributed computing. In particular, our algorithm requires $\Tilde{O}(n^{0.657})$ rounds for sampling a spanning tree within total variation distance $1/n^c$, for arbitrary constant $c > 0$, from the uniform distribution. More precisely, our algorithm requires $\Tilde{O}(n^{1/2 + \alpha})$ rounds, where $O(n^{\alpha})$ is the running time of matrix multiplication in the CongestedClique model (currently $\alpha = 1 - 2/\omega = 0.157$, where $\omega$ is the sequential matrix multiplication time exponent). We can adapt our algorithm to give exact rather than approximate samples, but with a larger, though still $o(n)$, runtime of $\Tilde{O}(n^{2/3+\alpha}) = O(n^{0.824})$. In a remarkable result, Aldous (SIDM 1990) and Broder (FOCS 1989) showed that the first visit edge to each vertex, excluding the start vertex, during a random walk forms a uniformly chosen spanning tree of the underlying graph. Our algorithm is a significant departure from known techniques, featuring a top-down walk filling approach paired with Schur complement graphs for walk shortcutting. To make this idea work in the CongestedClique model, we present a novel compressed random walk reconstruction algorithm, based on randomly sampling a weighted perfect matching. In addition, we show how to take somewhat shorter random walks even more efficiently in the CongestedClique model, obtaining an $O(\log^3 n)$-round algorithm for uniformly sampling spanning trees from graphs with $O(n\log n)$ cover times. These results are obtained by adding a load balancing component to the random walk algorithm of Bahmani, Chakrabarti and Xin (SIGMOD 2011) that uses the bottom-up "doubling" technique.
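The Aldous and Broder result quoted above can be sketched sequentially as follows. This is a minimal illustration of the classic first-visit-edge construction, not the distributed algorithm of the paper; the function name `aldous_broder` and the adjacency-list dictionary `adj` are assumptions, and the graph is assumed to be connected and unweighted.

```python
import random

def aldous_broder(adj, start, rng):
    """Return a spanning tree as a set of (parent, child) first-visit edges.

    adj maps each vertex to a list of its neighbours (assumed connected).
    """
    visited = {start}
    tree = set()
    current = start
    # Walk until every vertex has been visited (the cover time of the walk).
    while len(visited) < len(adj):
        nxt = rng.choice(adj[current])
        if nxt not in visited:
            # First visit to nxt: record the edge used to reach it.
            visited.add(nxt)
            tree.add((current, nxt))
        current = nxt
    return tree

# Example: a 4-cycle 0-1-3-2-0 with the chord 1-2.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
tree = aldous_broder(adj, start=0, rng=random.Random(0))
```

The loop runs for as many steps as the walk needs to cover the graph, which is why the cover time governs the length of the walk that any simulation of this procedure has to produce.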

Paper Structure

This paper contains 33 sections, 18 theorems, 11 equations, 2 figures, 4 algorithms.

Key Result

Theorem 1

There is an $\Tilde{O}(n^{1/2 + \alpha})$ round algorithm in the CongestedClique model for approximately generating a uniform spanning tree of an arbitrary unweighted graph within total variation distance $\epsilon = \Omega(\frac{1}{n^{c}})$, for arbitrary $c>0$, from the true uniform distribution.
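The numeric exponents quoted in the abstract follow from the stated value of $\alpha$. A short worked calculation, taking $\alpha = 0.157$ from the abstract and back-solving for $\omega$ (the value of $\omega$ shown is an approximation implied by that rounding, not a figure stated on this page):

```latex
\begin{align*}
\alpha &= 1 - \frac{2}{\omega} = 0.157
  \quad\Longrightarrow\quad \omega = \frac{2}{1 - \alpha} \approx 2.37, \\
\frac{1}{2} + \alpha &= 0.5 + 0.157 = 0.657
  \quad\text{(approximate sampling, } \tilde{O}(n^{0.657}) \text{ rounds)}, \\
\frac{2}{3} + \alpha &\approx 0.6667 + 0.157 \approx 0.824
  \quad\text{(exact sampling, } O(n^{0.824}) \text{ rounds)}.
\end{align*}
```

Both exponents are strictly below 1, which is what makes the two running times sublinear in $n$.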

Figures (2)

  • Figure 1: This figure illustrates how machine M adds midpoints to the walk $W_i$ to obtain $W_{i+1}$. Note that in $W_i$ there exist the distinct start-end pairs: $(1,3),(3,2),(2,1),(1,2)$. M sends the count $c_{p,q}$ of each start-end pair $(p, q)$ to the machine ${\sf M}_{p,q}$ responsible for that pair. The machine ${\sf M}_{p,q}$ then generates a sequence $\Pi_{p,q}$ containing $c_{p,q}$ midpoints. For example, machine ${\sf M}_{1, 3}$ generates $\Pi_{1,3} = (4, 2)$ indicating that midpoint 4 is to be inserted within the first $(1, 3)$ and midpoint 2 is to be inserted within the second $(1, 3)$. Collectively, the machines ${\sf M}_{p,q}$ send the multiset $\mathbb{M}= \{1, 2, 3, 3, 3, 3, 4, 4\}$ (written in short, as the vector $(1, 1, 4, 2)$) of generated midpoints to M. Finally, M samples a perfect matching between the sampled midpoints and midpoint positions in the walk, shown with red arrows. The midpoints are then placed in the walk in these selected indices. Here we ignore the subtlety of ensuring that $W_{i+1}$ contains at most $O(\sqrt{n})$ distinct vertices and also the care that needs to be taken around the final midpoint.
  • Figure 2: This figure illustrates both derivative graphs of $G$. On the left is the original graph $G$. In this example $S = \{A,B,D\}$. In the center is Schur$(G,S)$. Finally, on the right, is ShortCut$(G,S)$. The labels on the edges give the transition probabilities. Note that the Schur complement graph contains uniform transitions between every vertex. This is because a random walk started at $A$ (for instance) is equally likely to visit $B$ before $D$ or vice versa. In the shortcut graph every vertex always transitions to $C$ since $C$ is always visited directly before a visit to a vertex in $S$ (except possibly at time 0).
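The uniform transitions described in the Figure 2 caption can be checked on a small example. The sketch below assumes a star graph with centre $C$ and leaves $A$, $B$, $D$, which is consistent with the caption but is not confirmed to be the graph drawn in the figure. It computes the Schur complement by eliminating vertices one at a time, using the standard rule that removing a vertex $v$ adds weight $w(a,v)\,w(v,b)/\deg(v)$ between each pair of its neighbours $a, b$. The names `schur_complement` and `transition_probs` and the edge-weight dictionary are hypothetical; the shortcut graph is not computed here.

```python
from fractions import Fraction
from itertools import combinations

def schur_complement(weights, keep):
    """Eliminate every vertex outside `keep` from a weighted graph.

    weights maps frozenset({u, v}) to a positive edge weight (no self-loops).
    Returns the edge weights of the Schur complement graph on `keep`.
    """
    w = {e: Fraction(x) for e, x in weights.items()}
    vertices = set().union(*w)
    for v in sorted(vertices - set(keep)):
        # Weighted neighbourhood of the vertex being eliminated.
        nbrs = {next(iter(e - {v})): x for e, x in w.items() if v in e}
        deg = sum(nbrs.values())
        w = {e: x for e, x in w.items() if v not in e}
        # Replace the star around v by a weighted clique on its neighbours.
        for a, b in combinations(sorted(nbrs), 2):
            e = frozenset((a, b))
            w[e] = w.get(e, Fraction(0)) + nbrs[a] * nbrs[b] / deg
    return w

def transition_probs(w, u):
    """Random-walk transition probabilities out of u in the weighted graph w."""
    out = {next(iter(e - {u})): x for e, x in w.items() if u in e}
    total = sum(out.values())
    return {v: x / total for v, x in out.items()}

# Assumed example: star with centre C and leaves A, B, D; S = {A, B, D}.
star = {frozenset({"A", "C"}): 1,
        frozenset({"B", "C"}): 1,
        frozenset({"D", "C"}): 1}
schur = schur_complement(star, keep={"A", "B", "D"})
probs_from_A = transition_probs(schur, "A")
```

On this assumed star, each pair of leaves receives weight $1/3$, so a walk on the Schur complement leaves $A$ for $B$ or $D$ with probability $1/2$ each, matching the uniform transitions the caption describes.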

Theorems & Definitions (35)

  • Theorem 1
  • Theorem 2
  • Corollary 1
  • Definition 1: Schur Complement
  • Definition 2: Schur Complement Transition Matrix
  • Definition 3: Shortcut Graph
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • ...and 25 more