Almost linear time differentially private release of synthetic graphs

Jingcheng Liu; Jalaj Upadhyay; Zongrui Zou

TL;DR

This work addresses the private release of a synthetic graph that preserves the cut and spectral structure of the input, using near-linear time and linear space in the sparse regime. It introduces two complementary techniques: (i) an efficient sampler for the exponential mechanism over sparse topologies, based on a basis-exchange walk on a strongly log-concave distribution, and (ii) a high-pass filter that sparsifies a noisy graph while preserving differential privacy. These yield two $(\varepsilon,\delta)$-DP algorithms with additive error $O\left(\dfrac{m\log(n/\beta)}{\varepsilon}\right)$ for cut approximation and $O\left(\dfrac{d_{\max}\log(n/\delta)}{\varepsilon}\right)$ for spectral approximation, both running in $\widetilde{O}(m)$ time and $O(m)$ space, and both extendable to continual observation with provable privacy and utility bounds. The paper also reports empirical evidence that the methods achieve near-optimal error on sparse graphs and scale linearly with graph size. Together, the results bring the time and space cost of private synthetic graph release close to that of the non-private task.
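As a rough illustration of technique (ii), the sketch below adds Laplace noise to every vertex pair and keeps only the pairs whose noisy weight clears a threshold. It is a naive quadratic-time version, not the paper's near-linear implementation. The names `noisy_high_pass` and `threshold`, the dictionary representation of the graph, and the edge-level neighboring relation with $\ell_1$ sensitivity 1 are all assumptions made for this example. Under that assumption the noisy vector is an instance of the Laplace mechanism, and thresholding is post-processing of it.

```python
import itertools
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_high_pass(n, weights, eps, threshold, seed=0):
    """Hypothetical helper: noise every pair, keep the heavy ones.

    weights maps (u, v) with u < v to a non-negative edge weight;
    pairs that are absent are treated as weight 0.
    """
    rng = random.Random(seed)
    released = {}
    for pair in itertools.combinations(range(n), 2):
        noisy = weights.get(pair, 0.0) + laplace(1.0 / eps, rng)
        if noisy >= threshold:  # high-pass step: drop light noisy pairs
            released[pair] = noisy
    return released
```

The loop visits all $\binom{n}{2}$ pairs, which is exactly the cost a near-linear algorithm has to avoid; the sketch only shows why filtering produces a sparse output.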

Abstract

In this paper, we give almost linear time and space algorithms to sample from an exponential mechanism with an $\ell_1$-score function defined over an exponentially large non-convex set. As a direct result, given an $n$-vertex, $m$-edge graph $G$ as input, we present the first $\widetilde{O}(m)$ time and $O(m)$ space algorithms for differentially privately outputting an $n$-vertex, $O(m)$-edge synthetic graph that approximates all the cuts and the spectrum of $G$. These are the first private algorithms for releasing synthetic graphs that nearly match the time and space complexity of this task in the non-private setting while achieving the same (or better) utility as previous work in the more practical sparse regime. Additionally, our algorithms can be extended to private graph analysis under continual observation.
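To make "approximates all the cuts" concrete, the following sketch measures the largest additive cut error between an input graph and a synthetic one by brute force over all vertex subsets. The function names and the dictionary graph representation are assumptions for this example, and the exhaustive loop is only feasible for very small $n$; it is a way to check a released graph, not part of the paper's algorithms.

```python
import itertools


def cut_weight(weights, side):
    """Total weight of edges with exactly one endpoint in `side`."""
    return sum(w for (u, v), w in weights.items() if (u in side) != (v in side))


def max_cut_error(n, original, synthetic):
    """Largest |cut_G(S) - cut_H(S)| over all non-trivial vertex subsets S."""
    worst = 0.0
    for size in range(1, n):
        for side in itertools.combinations(range(n), size):
            s = set(side)
            gap = abs(cut_weight(original, s) - cut_weight(synthetic, s))
            worst = max(worst, gap)
    return worst
```

For instance, if the two graphs differ only in one edge whose weight is off by 1, every cut separating that edge's endpoints is off by 1 and all other cuts agree.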

Paper Structure (39 sections, 39 theorems, 91 equations, 7 tables, 5 algorithms)

Introduction
Problem definition.
Utility metric on spectral approximation
Utility metric on cut approximation
Overview of our results
Extension to continual observation.
Technical Overview
(1) Efficient sampling from exponential mechanism
(2) High-pass filter
Private spectral and cut approximation
$\widetilde{O}(m)$ time algorithm using exchange walk
$\widetilde{O}(m)$-time algorithm using high-pass filter
Empirical Simulations
Conclusion and Limitations
Technical background
...and 24 more sections

Key Result

Lemma 1

Fix any $m\leq N$. Given a non-negative $N$-dimensional real vector $x \in \mathbb R^N_{\geq 0}$ with sparsity $\|x\|_0 = |\{i: x_i\neq 0\}| \leq m$, let $\pi$ be the distribution such that $\pi [S] \propto e^{-\varepsilon\|x-x|_S\|_1}$ with support $\{S\in \{0,1\}^N : \|S\|_0 = m \}$. Then, there is
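The distribution in Lemma 1 can be explored with a simple exchange (down-up) walk on size-$m$ subsets, sketched below. The sketch assumes that the restriction of $x$ to $S$ means $x$ with the coordinates outside $S$ set to zero. Since $x \geq 0$, the score is then the mass of $x$ outside $S$, so $\pi[S] \propto \prod_{i \in S} e^{\varepsilon x_i}$. Each step drops a uniformly random element of $S$ and re-adds an element $j$ with probability proportional to $e^{\varepsilon x_j}$. The function names, the step count, and the $O(N)$ cost per step are choices made for this illustration; the paper's almost linear time implementation and its mixing-time analysis are not reproduced here.

```python
import math
import random


def score(x, subset):
    """l1 score: total mass of x on coordinates outside the subset."""
    return sum(v for i, v in enumerate(x) if i not in subset)


def exchange_step(x, subset, eps, rng):
    """One down-up step: drop a uniform element, then re-add one by weight."""
    dropped = rng.choice(sorted(subset))
    kept = subset - {dropped}
    candidates = [j for j in range(len(x)) if j not in kept]
    # pi[kept + {j}] is proportional to exp(eps * x[j]) because x >= 0.
    shift = max(x[j] for j in candidates)  # subtract the max for stability
    probs = [math.exp(eps * (x[j] - shift)) for j in candidates]
    added = rng.choices(candidates, weights=probs, k=1)[0]
    return kept | {added}


def sample_topology(x, m, eps, steps, seed=0):
    """Run the walk from a uniformly random size-m subset (naive sketch)."""
    rng = random.Random(seed)
    subset = set(rng.sample(range(len(x)), m))
    for _ in range(steps):
        subset = exchange_step(x, subset, eps, rng)
    return subset
```

With a large $\varepsilon$ the walk concentrates on the subset covering the heaviest coordinates of $x$; with a small $\varepsilon$ it stays close to uniform over size-$m$ subsets.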

Theorems & Definitions (69)

Lemma 1: Informal
Definition 2: Differential privacy (Dwork et al., 2006)
Theorem 3: Informal version of Theorem \ref{t.spectral}
Theorem 4: Informal version of Theorems \ref{t.ana_on_alg2} and \ref{t.small_cut_alg2}
Remark 5: Discussion regarding high probability bound
Remark 6: On the scale invariance
Theorem 7: Informal version of Theorem \ref{t.lb_on_sparse}
Remark 8
Theorem 9
Theorem 10
...and 59 more
