Faster Private Minimum Spanning Trees
Rasmus Pagh, Lukas Retschmeier
TL;DR
This work addresses the private release of a minimum spanning tree when edge weights are private under the $\ell_\infty$ neighboring model, achieving $O(m + n^{3/2}\log n / \sqrt{\rho})$ running time and $O\left(n^{3/2}\log(n)\,\Delta_\infty / \sqrt{\rho}\right)$ MST error with high probability. It introduces Fast-PAMST, an in-place private MST algorithm that simulates Report-Noisy-Max (RNM) efficiently by discretizing weights to multiples of $\Delta_\infty$, grouping edges with identical discretized weights, and using a specialized priority queue to support fast private edge selection within the Prim-Jarník framework. The main technical contributions are discretized RNM, grouped-top sampling via MaxExp, bottom-edge noise handling, and a four-layer sqrt-decomposition data structure, which together deliver the stated running time on dense graphs and an asymptotically optimal error bound. Empirical results corroborate the theoretical claims, showing that Fast-PAMST improves on prior post-processing and PAMST approaches in either utility or running time. The approach makes privacy-preserving MSTs more practical for clustering and synthetic data generation, and opens avenues for extensions to sparse graphs, other $\ell_p$ privacy settings, and related tasks such as Chow-Liu trees.
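As a rough illustration of the discretization and grouping idea mentioned above, the following Python sketch rounds each weight to a multiple of $\Delta_\infty$ and buckets edges by their discretized level. The function names, the edge-list format, and the choice to round down are assumptions made for this example; the summary does not specify them, and this is not the Fast-PAMST data structure itself.

```python
import math
from collections import defaultdict


def discretize(weight, delta_inf):
    """Round a weight down to a multiple of delta_inf.

    Rounding down (rather than to nearest or up) is an assumption of this sketch.
    """
    return math.floor(weight / delta_inf) * delta_inf


def group_edges(edges, delta_inf):
    """Bucket edges by discretized level k, where the discretized weight is k * delta_inf.

    `edges` is a list of (u, v, weight) triples (a hypothetical input format).
    Integer levels are used as keys to avoid comparing floats for equality.
    """
    groups = defaultdict(list)
    for u, v, w in edges:
        level = math.floor(w / delta_inf)
        groups[level].append((u, v))
    return dict(groups)


edges = [(0, 1, 0.9), (0, 2, 1.2), (1, 2, 1.7), (2, 3, 3.4)]
groups = group_edges(edges, delta_inf=1.0)
for level in sorted(groups):
    print(level, groups[level])
```

Once edges are grouped this way, all edges in a bucket share the same discretized weight, which is what makes it plausible to reason about a whole group at once instead of drawing noise for every edge separately.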
Abstract
Motivated by applications in clustering and synthetic data generation, we consider the problem of releasing a minimum spanning tree (MST) under edge-weight differential privacy constraints, where a graph topology $G=(V,E)$ with $n$ vertices and $m$ edges is public, the weight matrix $\vec{W}\in \mathbb{R}^{n \times n}$ is private, and we wish to release an approximate MST under $\rho$-zero-concentrated differential privacy. Weight matrices are considered neighboring if they differ by at most $\Delta_\infty$ in each entry, i.e., we consider an $\ell_\infty$ neighboring relationship. Existing private MST algorithms either add noise to each entry in $\vec{W}$ and estimate the MST by post-processing, or add noise to weights in-place during the execution of a specific MST algorithm. Using the post-processing approach with an efficient MST algorithm takes $O(n^2)$ time on dense graphs but results in an additive error on the weight of the MST of magnitude $O(n^2\log n)$. In-place algorithms give asymptotically better utility, but the running time of existing in-place algorithms is $O(n^3)$ for dense graphs. Our main result is a new differentially private MST algorithm that matches the utility of existing in-place methods while running in time $O(m + n^{3/2}\log n)$ for fixed privacy parameter $\rho$. The technical core of our algorithm is an efficient sublinear-time simulation of Report-Noisy-Max that works by discretizing all edge weights to a multiple of $\Delta_\infty$ and forming groups of edges with identical weights. Specifically, we present a data structure that allows us to sample a noisy minimum-weight edge among at most $O(n^2)$ cut edges in $O(\sqrt{n} \log n)$ time. Experimental evaluations support our claims that our algorithm significantly improves on previous algorithms in either utility or running time.
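To make the in-place idea concrete, here is a minimal Python sketch of a naive noisy variant of Prim-Jarník: at every step it rescans all cut edges, adds fresh exponential noise to each negated weight, and keeps the edge with the largest noisy score (a Report-Noisy-Max selection of the minimum-weight edge). This is the slow baseline style, costing $O(nm)$ and hence $O(n^3)$ on dense graphs, not the paper's fast algorithm. The function name, the input format, and the `noise_scale` parameter are hypothetical, and `noise_scale` is not calibrated here to any particular $\rho$ or $\Delta_\infty$.

```python
import random


def noisy_prim(n, weights, noise_scale, rng=None):
    """Naive in-place noisy Prim-Jarnik (illustrative sketch only).

    n           -- number of vertices, labeled 0..n-1
    weights     -- dict {(u, v): w} of undirected edges on a connected graph
    noise_scale -- mean of the exponential noise (assumed, uncalibrated);
                   0 disables noise and yields an exact MST
    """
    rng = rng or random.Random()
    in_tree = {0}
    tree = []
    while len(in_tree) < n:
        best_edge, best_score = None, float("-inf")
        for (u, v), w in weights.items():
            # A cut edge has exactly one endpoint inside the current tree.
            if (u in in_tree) != (v in in_tree):
                noise = rng.expovariate(1.0 / noise_scale) if noise_scale > 0 else 0.0
                score = -w + noise  # maximizing -w selects a (noisy) minimum
                if score > best_score:
                    best_edge, best_score = (u, v), score
        if best_edge is None:
            raise ValueError("graph is not connected")
        tree.append(best_edge)
        in_tree.update(best_edge)
    return tree


weights = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 5.0, (2, 3): 1.0, (1, 3): 4.0}
exact_tree = noisy_prim(4, weights, noise_scale=0.0)
noisy_tree = noisy_prim(4, weights, noise_scale=1.0, rng=random.Random(0))
print(exact_tree)
print(noisy_tree)
```

The inner loop over all edges is the bottleneck that a sublinear-time selection structure would replace; the outer loop structure stays the same.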
