Models of random spanning trees

Eric Babson; Moon Duchin; Annina Iseli; Pietro Poggi-Corradini; Dylan Thurston; Jamie Tucker-Foltz

TL;DR

This work systematically analyzes how random minimum spanning trees (MST) differ from uniform spanning trees (UST), modeling the edge weights first as i.i.d. draws and then, more generally, as independent draws from non-colliding product measures. It develops exact formulas for tree probabilities under ordinary MST, introduces rotation techniques (triangle-edge and path rotations) to compare the probabilities of trees, and shows that on random graphs MST$_0$ and UST diverge with high probability. The paper then extends to shifted-interval MST and arbitrary product measures, introducing the shiftahedron and word-map representations, which reduce arbitrary product measures to finite, analyzable structures, and proves convergence and universality results for these representations. Finally, it establishes dimension bounds for the permutation locus $P_m$ and shows how these tools quantify differences between MST and UST, with practical implications for recombination algorithms and districting plans. Together these tools provide a framework for realizing a wide class of distributions on trees and permutations.

Abstract

There are numerous randomized algorithms to generate spanning trees in a given ambient graph; several target the uniform distribution on trees (UST), while in practice the fastest and most frequently used draw random weights on the edges and then employ a greedy algorithm to choose the minimum-weight spanning tree (MST). Though MST is a workhorse in applications, the mathematical properties of random MST are far less explored than those of UST. In this paper we develop tools for the quantitative study of random MST. We consider the standard case that the weights are drawn i.i.d. from a single distribution on the real numbers, as well as successive generalizations that lead to \emph{product measures}, where the weights are independently drawn from arbitrary distributions.
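The sampling procedure described in the abstract (draw independent random edge weights, then take the minimum-weight spanning tree greedily) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `kruskal_mst` and `sample_random_mst`, the choice of uniform weights on $[0,1]$, and the example graph (a square with one diagonal) are all assumptions made here for concreteness.

```python
import random
from collections import Counter


def kruskal_mst(n, edges, weights):
    """Return the minimum spanning tree as a frozenset of edges.

    Greedy (Kruskal) algorithm with a simple union-find on vertices 0..n-1.
    Assumes the graph is connected and the weights are distinct.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for e in sorted(edges, key=lambda edge: weights[edge]):
        root_u, root_v = find(e[0]), find(e[1])
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append(e)
    return frozenset(tree)


def sample_random_mst(n, edges, rng):
    """Draw i.i.d. uniform weights (an assumed choice) and return the MST."""
    weights = {e: rng.random() for e in edges}
    return kruskal_mst(n, edges, weights)


# Assumed example graph: a 4-cycle plus the diagonal (0, 2).
# It has 8 spanning trees, each with 3 edges.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
rng = random.Random(0)
counts = Counter(sample_random_mst(4, EDGES, rng) for _ in range(20000))

# Empirical frequencies; under UST every tree would have frequency 1/8.
for tree, count in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(sorted(tree), count / 20000)
```

Comparing the printed frequencies with the uniform value $1/8$ gives a quick empirical sense of how random MST can deviate from UST on a small graph.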

Paper Structure (31 sections, 40 theorems, 65 equations, 13 figures, 3 tables, 1 algorithm)

Introduction
Contributions
A motivation for generalizing MST
Related work
Preliminaries
Ordinary MST
Inductive formulas
Global formulas
Triangle-edge rotation
Application to random graphs
MST on complete graphs via path rotation
Shifted intervals
Parametrizing shifts
Shifts and complete graphs
A limitation of product measures with connected support
...and 16 more sections

Key Result

Proposition 2.3

A set of distinct weights $\{w_i\}$ on the edges of $G$ induces $T$ as a minimum spanning tree if and only if for all $e \in T$, $e' \notin T$ and corresponding weights $w,w'$,
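The displayed inequality of the proposition is not reproduced in this summary. As a hedged illustration only, the sketch below checks the standard cycle property for minimum spanning trees: every edge $e' \notin T$ must be heavier than each tree edge $e$ on the path in $T$ joining the endpoints of $e'$. Whether this matches the paper's cycle relation exactly is an assumption, and the helper names `tree_path` and `induces_tree` are hypothetical.

```python
def tree_path(tree, u, v):
    """Return the edges on the unique path from u to v inside the tree."""
    adjacency = {}
    for e in tree:
        a, b = e
        adjacency.setdefault(a, []).append((b, e))
        adjacency.setdefault(b, []).append((a, e))
    # Depth-first search; a tree has no cycles, so avoiding the previous
    # vertex is enough to prevent revisiting.
    stack = [(u, None, [])]
    while stack:
        node, prev, path = stack.pop()
        if node == v:
            return path
        for nxt, e in adjacency.get(node, []):
            if nxt != prev:
                stack.append((nxt, node, path + [e]))
    raise ValueError("tree does not connect u and v")


def induces_tree(edges, weights, tree):
    """Check the (assumed) cycle property: each non-tree edge outweighs
    every tree edge on the cycle it closes. Weights must be distinct."""
    tree = set(tree)
    for e_out in edges:
        if e_out in tree:
            continue
        u, v = e_out
        if any(weights[e] > weights[e_out] for e in tree_path(tree, u, v)):
            return False
    return True


# Assumed example: a 4-cycle plus the diagonal (0, 2), with distinct weights.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
WEIGHTS = {(0, 1): 1, (1, 2): 2, (2, 3): 3, (0, 3): 4, (0, 2): 5}

is_mst = induces_tree(EDGES, WEIGHTS, [(0, 1), (1, 2), (2, 3)])
is_not_mst = induces_tree(EDGES, WEIGHTS, [(0, 1), (1, 2), (0, 3)])
```

In the second call the non-tree edge `(2, 3)` has weight 3 but closes a cycle containing the tree edge `(0, 3)` of weight 4, so the check fails.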

Figures (13)

Figure 1: In MRL, a "recombination" Markov chain is run to draw random partitions of the state into 150 legislative districts. The Markov chain combines and re-splits two districts at a time by drawing and bisecting a random spanning tree. These figures show heatmaps from three different runs, with the $\sim 9000$ precincts of Texas colored on a scale from dark blue (reassigned rarely) to yellow (reassigned frequently). When the spanning tree step uses MST weights drawn from $[0,1]$ (left), there is no particular relationship to county boundaries. As between-county edges are surcharged by $s=0.1$ (middle) and then $s=1.0$ (right), county boundaries become visible, since the steps tend to keep counties intact within the partition and so reassign whole counties at a time.
Figure 2: The Trybuła region $T_3$. Given independent random variables $X_1,X_2,X_3$ that are the components of a non-colliding product measure $\mathcal{D}$, the three coordinate axes are $x = \mathbb{P}_\mathcal{D}(X_1>X_2)$, $y= \mathbb{P}_\mathcal{D}(X_2>X_3)$, and $z = \mathbb{P}_\mathcal{D}(X_3>X_1)$. The vertices of the cube that are hit by $T_3$ correspond to the pure permutations (for example, $(1,1,0)$ comes from $X_1>X_2>X_3$), whereas $(0,0,0)$ and $(1,1,1)$ are not hit.
Figure 3: A tree on five vertices, regarded as belonging to an ambient $K_5$, will be used to illustrate the independence argument for the internal formula. We suppose the edges are added in the indicated order $(a,c),(b,c),(d,e),(c,d)$. On the right is a tiered diagram used in the argument, where the tiers from the inside out are $\partial F_3 \subset \partial F_2 \subset \partial F_1 \subset \partial F_0$ for the partial forests $F_j$ constructed in the course of Kruskal's algorithm.
Figure 4: Three examples of triangle-edge rotation, where the ambient graphs are a square with a diagonal, a house, and a 17-edge graph. Edges in the graph but not included in the spanning trees are denoted with dotted lines. In each case, the three trees cited in the lemma are highlighted and the spanning trees $S,S'$ differ only by the "rotation" of the red edge. In each case, the left-hand spanning tree $S$ is strictly more likely than the right-hand spanning tree $S'$ under ordinary MST.
Figure 5: A path rotation operation from $T$ to $T'$ that rotates a path $P$ from $v_1$ to $v_5$.
...and 8 more figures

Theorems & Definitions (86)

Definition 2.1: Broken cycles
Definition 2.2: Cycle relation
Proposition 2.3: Cycles versus weights
proof
Theorem 3.1: Kruskal induction
Theorem 3.2: Reverse-delete induction
Proposition 3.3
Theorem 3.4: External formula
Theorem 3.5: Internal formula
Example 3.6: An example in $K_5$
...and 76 more
