Optimal Bounds for Private Minimum Spanning Trees via Input Perturbation

Rasmus Pagh, Lukas Retschmeier, Hao Wu, Hanwen Zhang

TL;DR

The paper addresses privately releasing a minimum spanning tree under edge-weight differential privacy with an $\ell_\infty$-neighboring relation. It introduces an input-perturbation reduction: perturb all edge weights with exponential noise and run a non-private MST, achieving an $(\varepsilon,\delta)$-DP mechanism whose utility matches the best known non-private MST bounds up to logarithmic factors. The approach yields an $\tilde{O}(n^{3/2})$ excess weight bound with near-linear time, and a matching $\tilde{\Omega}(n^{3/2})$ lower bound is established via a reduction to private top-$k$ selection, demonstrating near-optimality. Empirical results corroborate practicality and show favorable performance relative to input-privatization methods, particularly on denser graphs. The work thus provides the first framework achieving the best of both worlds: efficient private MST computation with optimal or near-optimal utility guarantees, and a concrete lower bound under approximate DP.

Abstract

We study the problem of privately releasing an approximate minimum spanning tree (MST). Given a graph $G = (V, E, \vec{W})$ where $V$ is a set of $n$ vertices, $E$ is a set of $m$ undirected edges, and $\vec{W} \in \mathbb{R}^{|E|}$ is an edge-weight vector, our goal is to publish an approximate MST under edge-weight differential privacy, as introduced by Sealfon in PODS 2016, where $V$ and $E$ are considered public and the weight vector is private. Our neighboring relation is $\ell_\infty$-distance on weights: for a sensitivity parameter $\Delta_\infty$, graphs $G = (V, E, \vec{W})$ and $G' = (V, E, \vec{W}')$ are neighboring if $\|\vec{W}-\vec{W}'\|_\infty \leq \Delta_\infty$. Existing private MST algorithms face a trade-off, sacrificing either computational efficiency or accuracy. We show that it is possible to get the best of both worlds: With a suitable random perturbation of the input that does not suffice to make the weight vector private, the result of any non-private MST algorithm will be private and achieves a state-of-the-art error guarantee. Furthermore, by establishing a connection to Private Top-k Selection [Steinke and Ullman, FOCS '17], we give the first privacy-utility trade-off lower bound for MST under approximate differential privacy, demonstrating that the error magnitude, $\tilde{O}(n^{3/2})$, is optimal up to logarithmic factors. That is, our approach matches the time complexity of any non-private MST algorithm and at the same time achieves optimal error. We complement our theoretical treatment with experiments that confirm the practicality of our approach.
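
The input-perturbation idea described above (perturb every edge weight, then run any non-private MST algorithm) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than the paper's method: the names `kruskal_mst`, `perturbed_mst`, `tree_weight`, and `noise_scale` are hypothetical, the exponential noise is simply added to each weight, and the noise scale is a placeholder that is not calibrated to $\varepsilon$, $\delta$, or $\Delta_\infty$. The sketch therefore carries no privacy guarantee on its own; it only shows the shape of the reduction and how excess weight is measured.

```python
import random


def kruskal_mst(n, edges):
    """Plain (non-private) Kruskal. `edges` is a list of (u, v, weight)."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, _ in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree


def perturbed_mst(n, edges, noise_scale, rng):
    """Add i.i.d. exponential noise to every weight, then run Kruskal.

    ASSUMPTION: `noise_scale` (the mean of the noise) is a placeholder;
    the calibration needed for a privacy guarantee is not reproduced here.
    """
    noisy = [(u, v, w + rng.expovariate(1.0 / noise_scale)) for u, v, w in edges]
    return kruskal_mst(n, noisy)


def tree_weight(tree, edges):
    """Weight of a spanning tree measured with the ORIGINAL weights."""
    weight = {(u, v): w for u, v, w in edges}
    return sum(weight[e] for e in tree)


# Toy instance: 4 vertices, 5 edges.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5), (0, 3, 4.0), (0, 2, 3.0)]
rng = random.Random(0)

exact = kruskal_mst(4, edges)
noisy_tree = perturbed_mst(4, edges, noise_scale=0.5, rng=rng)

# Excess weight: how much heavier the released tree is than the true MST.
excess = tree_weight(noisy_tree, edges) - tree_weight(exact, edges)
print("excess weight:", excess)
```

Only the perturbed weights are passed to the MST routine, so the running time is that of the underlying non-private algorithm plus one pass over the edges to draw the noise.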

Paper Structure

This paper contains 32 sections, 11 theorems, 37 equations, 3 figures, 3 tables, 6 algorithms.

Key Result

Theorem 1.1

Let $G = (V, E, \mathbf{W})$ be a graph with $n$ vertices and $m$ edges, and let $\varepsilon, \delta > 0$. Consider an arbitrary (non-private) algorithm that computes an MST of $G$ within time $t(n,m)$, independent of the weight vector. Then there exists an ${( {\varepsilon, \delta} )}$-differentia

Figures (3)

  • Figure 1: Roadmap of Section \ref{sec: upper bound}. The figure outlines the proof structure: the utility guarantee is established for Algorithm \ref{algo:impl-details}, and the privacy guarantee is proven for Algorithm \ref{alg:privKruskal}. A "bridging" algorithm (Algorithm \ref{alg:one-pass-private-kruskal}) is introduced to demonstrate the equivalence of Algorithms \ref{algo:impl-details} and \ref{alg:privKruskal} by showing they share the same output distribution.
  • Figure 2: The results of our experiment. a) Shows the instance described in Experiment 1. Note that we have to negate all weights to find the maximum spanning tree on the mutual information graph. b) Shows the impact of the graph's density on random graphs with $n=1000$ vertices for a fixed privacy level $\rho = 1$: The figure shows the ratio between the real MST and the private one, where each edge weight is drawn uniformly from the interval $[0,100]$. Because the noise scale of Sealfon's input perturbation scales with the number of edges in the graph, we see a larger gap for denser graphs. Each data point shows the median of ten runs.
  • Figure 3: An extract of the complete graph encoding the mutual information between the random variables $X_1, \dots, X_n$ described in \ref{def:process} and used in \ref{sec:mi}. The weights encode the negated mutual information corresponding to the described process. The MST is formed by the vertices on the path $P(X_1, X_2, \dots)$. In our experiment with $n=1000$ vertices and flip probability $p=0.05$, we have $-I(X_1, X_2) = -I(X_2, X_3) = \dots \approx -0.7136$, $-I(X_1, X_3) = -I(X_2, X_4) = \dots \approx -0.5471$, and $-I(X_1, X_4) = \dots \approx -0.4277$.
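
The mutual-information magnitudes quoted in the Figure 3 caption (about 0.7136, 0.5471, and 0.4277) can be sanity-checked with a short Python sketch. The underlying process is not spelled out in this excerpt, so the model below is an assumption: $X_1$ is a uniform bit and each $X_{i+1}$ equals $X_i$ flipped independently with probability $p$, with mutual information measured in bits. The function names are hypothetical.

```python
import math


def binary_entropy(q):
    """Entropy (in bits) of a Bernoulli(q) variable."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def flip_probability(p, k):
    """Probability that X_i and X_{i+k} differ after k independent flips.

    ASSUMPTION: binary symmetric chain with per-step flip probability p.
    """
    return (1 - (1 - 2 * p) ** k) / 2


def mutual_information(p, k):
    """I(X_i; X_{i+k}) in bits for a uniform starting bit."""
    return 1.0 - binary_entropy(flip_probability(p, k))


p = 0.05
for k in (1, 2, 3):
    # Edge weights in the figure are the negated mutual information.
    print(f"k={k}: weight = {-mutual_information(p, k):.4f}")
```

Under this assumed model, the mutual information shrinks as the distance $k$ grows, so after negation the edges between consecutive variables are the lightest, which is consistent with the MST lying along the path.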

Theorems & Definitions (23)

  • Theorem 1.1: Upper Bound
  • Theorem 1.2: Lower Bound
  • Definition 2.1: $\ell_\infty$-neighboring inputs
  • Definition 2.2: $(\varepsilon, \delta)$-Private Algorithm [Dwork_Nissim_Smith_2006]
  • Definition 2.3: $\rho$-zero-Concentrated Differential Privacy [bun_steinke_2016]
  • Definition 2.8: Beta Distribution [Ross2018]
  • Definition 2.9: Exponential Distribution
  • Theorem 3.2: Properties of Algorithm \ref{alg:privKruskal}
  • Theorem 3.3
  • Example 3.4
  • ...and 13 more