Almost linear time differentially private release of synthetic graphs

Jingcheng Liu, Jalaj Upadhyay, Zongrui Zou

TL;DR

This work tackles privately releasing a synthetic graph that preserves the input’s spectral and cut structure while achieving near-linear time and memory usage in sparse regimes. It introduces two complementary approaches: (i) an efficient sampler for the exponential mechanism over sparse topologies using a basis-exchange walk on a strongly log-concave distribution, and (ii) a high-pass filtering method that sparsifies a noisy graph while maintaining differential privacy. These yield two $(\varepsilon,\delta)$-DP algorithms with additive errors $O\left(\dfrac{m\log(n/\beta)}{\varepsilon}\right)$ for cut approximation and $O\left(\dfrac{d_{\max}\log(n/\delta)}{\varepsilon}\right)$ for spectral approximation, both operating in $\tilde{O}(m)$ time and $O(m)$ space, and extendable to continual observation with provable privacy and utility bounds. The paper also provides empirical evidence that these methods achieve near-optimal performance on sparse graphs and scale linearly with graph size, making private graph analysis practical for large datasets. Overall, the contributions advance the feasibility of private, scalable graph analysis by aligning private and non-private resource requirements while delivering strong utility guarantees.
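As a rough illustration of approach (ii), the sketch below adds Laplace noise to every vertex pair of a small weighted graph and keeps only the noisy weights above a threshold. It is a naive $O(n^2)$ picture of "noise, then filter", not the paper's algorithm, which runs in $\tilde{O}(m)$ time. The function name `noisy_threshold_graph`, the dense adjacency-matrix input, the `threshold` parameter, and the edge-level neighboring relation (two graphs whose edge weights differ by at most 1 in $\ell_1$) are assumptions made for this example.

```python
import numpy as np

def noisy_threshold_graph(A, eps, threshold, rng=None):
    """Hypothetical helper: Laplace noise on all vertex pairs, then keep large entries.

    A is a symmetric n x n weight matrix with zero diagonal (an assumed input format).
    """
    rng = np.random.default_rng() if rng is None else rng
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    rows, cols = np.triu_indices(n, k=1)  # each unordered pair {u, v} once
    # Laplace mechanism: scale 1/eps for an assumed l1-sensitivity of 1.
    noisy = A[rows, cols] + rng.laplace(scale=1.0 / eps, size=rows.size)
    # "High-pass" step: discard every noisy weight at or below the threshold.
    keep = noisy > threshold
    out = np.zeros_like(A)
    out[rows[keep], cols[keep]] = noisy[keep]
    return out + out.T  # symmetrize so the result is again an undirected graph

# Usage on a 4-vertex path graph with unit weights.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
H = noisy_threshold_graph(A, eps=1.0, threshold=3.0, rng=np.random.default_rng(0))
```

Under the assumed neighboring relation, the noise step is the standard Laplace mechanism and the thresholding is post-processing, so the output is $\varepsilon$-DP. The threshold choice that yields the paper's utility bounds is not reproduced here.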

Abstract

In this paper, we give almost linear time and space algorithms to sample from an exponential mechanism with an $\ell_1$-score function defined over an exponentially large non-convex set. As a direct result, on input an $n$-vertex, $m$-edge graph $G$, we present the first $\widetilde{O}(m)$ time and $O(m)$ space algorithms for differentially privately outputting an $n$-vertex, $O(m)$-edge synthetic graph that approximates all the cuts and the spectrum of $G$. These are the first private algorithms for releasing synthetic graphs that nearly match this task's time and space complexity in the non-private setting while achieving the same (or better) utility as previous works in the more practical sparse regime. Additionally, our algorithms can be extended to private graph analysis under continual observation.
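For reference, the $(\varepsilon,\delta)$-differential privacy guarantee mentioned in the TL;DR is the standard one: a randomized algorithm $\mathcal{A}$ is $(\varepsilon,\delta)$-DP if, for every pair of neighboring inputs $G, G'$ and every measurable set $\mathcal{O}$ of outputs,

$$\Pr[\mathcal{A}(G)\in\mathcal{O}] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{A}(G')\in\mathcal{O}] + \delta.$$

Which pairs of graphs count as neighboring (for example, graphs differing in a single edge) is fixed by the paper's own definition and is not restated in this summary.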

Paper Structure

This paper contains 39 sections, 39 theorems, 91 equations, 7 tables, and 5 algorithms.

Key Result

Lemma 1

Fix any $m\leq N$. Given a non-negative $N$-dimensional real vector $x \in \mathbb R^N_{\geq 0}$ with sparsity $\|x\|_0 = |\{i: x_i\neq 0\}| \leq m$, let $\pi$ be the distribution such that $\pi [S] \propto e^{-\varepsilon\|x-x|S\|_1}$ with support $\{S\in \{0,1\}^N : \|S\|_0 = m \}$. Then, there is
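Assuming $x|S$ denotes $x$ restricted to the coordinates selected by $S$ (and zero elsewhere), non-negativity of $x$ gives $\|x - x|S\|_1 = \|x\|_1 - \sum_{i\in S} x_i$, so $\pi[S] \propto \prod_{i\in S} e^{\varepsilon x_i}$. A basis-exchange (down-up) walk for such a distribution drops a uniformly random element of $S$ and then adds an element from outside with probability proportional to the target weight of the resulting set. The following is a minimal sketch with $O(N)$ work per step. The function name, the starting set, and the `steps` parameter are illustrative choices, and the sketch does not reproduce the data structures the paper uses to reach almost linear total time.

```python
import numpy as np

def basis_exchange_walk(x, m, eps, steps, rng=None):
    """Down-up walk targeting pi[S] proportional to exp(eps * sum_{i in S} x_i).

    Assumes 1 <= m <= len(x) and x >= 0. Returns the sorted indices of the final set.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    in_S = np.zeros(x.size, dtype=bool)
    # Any size-m subset is a valid start; here we take the m largest coordinates.
    in_S[np.argsort(-x)[:m]] = True
    for _ in range(steps):
        # Down step: remove a uniformly random element of S.
        i = rng.choice(np.flatnonzero(in_S))
        in_S[i] = False
        # Up step: add j outside the reduced set with probability
        # proportional to exp(eps * x_j); j == i is allowed (a lazy move).
        candidates = np.flatnonzero(~in_S)
        logits = eps * x[candidates]
        p = np.exp(logits - logits.max())  # shift for numerical stability
        j = rng.choice(candidates, p=p / p.sum())
        in_S[j] = True
    return np.flatnonzero(in_S)

# Usage: N = 6 coordinates, two of them non-zero, sample a set of size m = 2.
S = basis_exchange_walk([5.0, 0.0, 0.0, 3.0, 0.0, 0.0], m=2, eps=1.0,
                        steps=50, rng=np.random.default_rng(0))
```

How many steps are needed for the walk to be close to $\pi$ is governed by the mixing-time analysis in the paper and is left as a free parameter here.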

Theorems & Definitions (69)

  • Lemma 1: Informal
  • Definition 2: Differential privacy (Dwork et al., 2006)
  • Theorem 3: Informal version of Theorem \ref{t.spectral}
  • Theorem 4: Informal version of \ref{t.ana_on_alg2} and \ref{t.small_cut_alg2}
  • Remark 5: Discussion regarding high probability bound
  • Remark 6: On the scale invariance
  • Theorem 7: Informal version of \ref{t.lb_on_sparse}
  • Remark 8
  • Theorem 9
  • Theorem 10
  • ...and 59 more