The Balanced Up-Down Walk

Hugo A. Akitaya, Sarah Cannon, Gregory Herschlag, Gabe Schoenbach, Kristopher Tapp, Jamie Tucker-Foltz

TL;DR

This work introduces the Balanced Up-Down (BUD) walk, a Markov chain on the space of $k$-splittable trees, that is, spanning trees that can be split into $k$ balanced subtrees. Under exact balance, it samples from a known invariant measure. BUD aims to bridge the gap between ReCom, which is fast in practice but can have trouble exploring the state space, and the Up-Down walk, which is theory-backed but whose rejection rate grows exponentially with the number of parts. The paper proves that BUD is irreducible in several cases (notably simple lattices with $k=2$ and triomino tilings on rectangular grids), including a regime where ReCom is not irreducible. It gives an improved analysis of an existing dynamic-programming algorithm for testing approximate splittability, with $O(k^3 n)$ runtime (and $O(kn)$ when $\varepsilon=O(1/k^2)$), and proves that the associated counting problem is #P-complete. For practical use, the paper introduces SelectMarkedTree for tractable sampling of balanced partitions and proposes efficiency-motivated restrictions that allow Metropolis-Hastings integration while keeping computations tractable. Empirical results on grids and a North Carolina precinct graph show that BUD can outperform Cycle Walk in effective sample rates and autocorrelations, suggesting BUD as a viable, theory-informed alternative for sampling balanced districting plans.

Abstract

Markov chains based on spanning trees have been hugely influential in algorithms for assessing fairness in political redistricting. The input graph represents the geographic building blocks of a jurisdiction. The goal is to output a large ensemble of random graph partitions, which is done by drawing and splitting random spanning trees. Crucially, these subtrees must be balanced, since political districts are required to have equal population. The Up-Down walk (on trees or forests) repeatedly adds a random edge then deletes a random edge to produce a new tree or forest; it can be used to efficiently generate a large ensemble, but the rejection rate to maintain balance grows exponentially with the number of parts. ReCom, the most widely-used class of Markov chains, circumvents this complexity barrier by merging and splitting pairs of districts at a time. This runs fast in practice but can have trouble exploring the state space. To overcome these efficiency and mixing barriers, we propose a new Markov chain called the Balanced Up-Down (BUD) walk. The main idea is to run the Up-Down walk on the space of trees, but require all steps to preserve the property that the tree is splittable into balanced subtrees. The BUD walk samples from a known invariant measure under exact balance. We prove that the BUD walk is irreducible in several cases, including a regime where ReCom is not irreducible. Running the BUD walk efficiently presents algorithmic challenges, especially when parts are allowed to deviate from their ideal size. A key subroutine is determining whether a tree is splittable into approximately-balanced subtrees. We give an improved analysis of an existing algorithm for this problem and prove that the associated counting problem is #P-complete. We empirically validate the usefulness of the BUD walk by comparing its performance to that of other existing methods for sampling partitions.
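
The main idea above can be illustrated with a minimal sketch, assuming unit vertex weights, exact balance, and vertices labeled $0, \dots, n-1$. The function names (`balanced_cut_edges`, `bud_like_step`) and the plain rejection rule are hypothetical choices for illustration, not the paper's algorithm or its transition probabilities. The splittability test uses a standard fact: with parts of size $n/k$, a rooted tree is splittable exactly when $k-1$ of its edges hang a subtree whose size is a multiple of $n/k$.

```python
import random
from collections import defaultdict


def _adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def _bfs_parents(edges, root):
    """Return (visit order, parent map) of a BFS from root."""
    adj = _adjacency(edges)
    parent = {root: None}
    order = [root]
    for x in order:  # the list grows while we iterate, acting as a queue
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                order.append(y)
    return order, parent


def balanced_cut_edges(tree_edges, n, k, root=0):
    """Edges whose removal yields k parts of size n/k, or None if impossible."""
    if n % k != 0 or len(tree_edges) != n - 1:
        return None
    part = n // k
    order, parent = _bfs_parents(tree_edges, root)
    if len(order) != n:  # not a spanning tree
        return None
    size = {v: 1 for v in order}
    cuts = []
    for v in reversed(order):  # children are processed before parents
        p = parent[v]
        if p is None:
            continue
        if size[v] % part == 0:
            cuts.append((p, v))
        size[p] += size[v]
    return cuts if len(cuts) == k - 1 else None


def tree_path(tree_edges, u, v):
    """Edges on the unique tree path from u to v."""
    _, parent = _bfs_parents(tree_edges, u)
    path = []
    while parent[v] is not None:
        path.append((parent[v], v))
        v = parent[v]
    return path


def bud_like_step(tree_edges, graph_edges, n, k, rng=random):
    """One Up-Down move, kept only if the new tree is still k-splittable.

    Assumption: simple rejection; the paper's actual step may differ.
    """
    in_tree = {frozenset(e) for e in tree_edges}
    candidates = [e for e in graph_edges if frozenset(e) not in in_tree]
    if not candidates:
        return list(tree_edges)
    up = rng.choice(candidates)                  # "up": add an edge, closing one cycle
    cycle = tree_path(tree_edges, *up) + [up]
    down = rng.choice(cycle)                     # "down": delete a random cycle edge
    proposal = [e for e in list(tree_edges) + [up]
                if frozenset(e) != frozenset(down)]
    if balanced_cut_edges(proposal, n, k) is None:
        return list(tree_edges)                  # reject: balance would be lost
    return proposal
```

For example, on a 2-by-3 grid one could start from a Hamiltonian path, call `bud_like_step` repeatedly with `n=6, k=3`, and read off the current districts by removing the edges returned by `balanced_cut_edges`. Allowing parts to deviate from their ideal size makes the splittability test substantially harder than this counting argument, which is the algorithmic challenge the abstract refers to.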

Paper Structure

This paper contains 40 sections, 31 theorems, 44 equations, 18 figures.

Key Result

Proposition 2.3

The uniform distribution on $B^\varepsilon_k(G)$ is a stationary distribution for the BUD walk. If BUD is irreducible, this stationary distribution is unique.
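
As a hedged illustration of how a statement of this kind is commonly verified, suppose the transition matrix $P$ on the finite state space $\Omega = B^\varepsilon_k(G)$ is symmetric. This symmetry is an assumption made here for illustration and is not taken from the paper's proof. Then the uniform distribution $\pi(x) = 1/|\Omega|$ is stationary:

$$
(\pi P)(y) = \sum_{x \in \Omega} \frac{1}{|\Omega|} P(x,y) = \frac{1}{|\Omega|} \sum_{x \in \Omega} P(y,x) = \frac{1}{|\Omega|} = \pi(y).
$$

The uniqueness claim is an instance of the general fact that an irreducible Markov chain on a finite state space has exactly one stationary distribution.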

Figures (18)

  • Figure 1: The BUD step in the proof of Lemma \ref{lemPartitionCharacterization} with $k=4$.
  • Figure 2: The $k=2$ version of Figure \ref{F:BUD_step}, where $T_1, \tilde{T}_1$ are red and $T_2, \tilde{T}_2$ are blue.
  • Figure 3: The configuration (modulo symmetry) that would arise if $v^*$ were a cut vertex, in the proof of Claim \ref{clm:exists-ripe}. Vertices are represented by square cells, with northeastern strokes denoting red vertices and northwestern strokes denoting blue vertices. The vertex $v^-$ is closer to $v_0$ than $v^*$ is, via the path metric through $G_1$. Then there exists another red connected component $R^+$ (induced by deleting $v^*$) which is farther from $v_0$ than $v^*$ is, and therefore cannot interface with any blue vertices. To avoid contradictions, one must color three vertices of $R^+$, in this case on the bottom-left, forcing the top neighbor of $v^*$ to be blue. But then there is no coloring of the top-left and bottom-right vertices that avoids contradiction.
  • Figure 4: Helpful diagrams for Lemma \ref{lem:reduce-to-simple}. Vertices are represented by square cells, with northwestern strokes denoting blue vertices and northeastern strokes denoting red vertices. In Case (a), $u_i$ is not a bad vertex, since its removal cannot increase the number of connected components of $G_2 - \{u_1, \dots, u_{i-1}\}$. In Case (b), the path $u_{i-1}, u_i, u_{i+1}$ is bad, but the path $u'_{i-1}, u_i, u_{i+1}$ reduces to Case (a) and so is not bad.
  • Figure 5: Left: The red and blue spanning trees contain a minimal number of horizontal edges. Right: The nodes of $\mathcal{C}(T)$ are the maximal same-color columns, and the leaves of $\mathcal{C}(T)$ are highlighted in yellow. In grayscale, the darker nodes are blue.
  • ...and 13 more figures

Theorems & Definitions (72)

  • Definition 2.1
  • Definition 2.2
  • Proposition 2.3
  • proof
  • Lemma 3.1
  • proof
  • Definition 3.2
  • Lemma 3.3
  • proof
  • Lemma 3.4
  • ...and 62 more