Perturb-and-Project: Differentially Private Similarities and Marginals

Vincent Cohen-Addad; Tommaso d'Orsi; Alessandro Epasto; Vahab Mirrokni; Peilin Zhong

TL;DR

Perturb-and-Project designs differentially private outputs by adding noise to inputs and projecting onto admissible sets, with error governed by the Gaussian complexity of the projection space and enhanced by sum-of-squares certificates. The authors provide new DP algorithms for privately releasing pairwise cosine similarities and for computing $k$-way marginals, including strong gains for $t$-sparse datasets, and they show how alternating projections yield practical, scalable implementations. The work grounds utility guarantees in tight SOS-based analyses of injective tensor norms and Gaussian complexity, linking privacy with the intrinsic richness of the target set. This framework enables efficient, high-signal private computation of similarities and marginals applicable to nearest-neighbor search, contingency-table analysis, and synthetic-data tasks, while offering theoretical foundations for when fast input-perturbation methods perform well in practice.
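The link between error and Gaussian complexity can be seen from a standard projection argument. The sketch below rests on assumptions not stated in this summary: $\mathcal S$ is closed and convex, the noise is $\mathbf W=\sigma\mathbf G$ where $\mathbf G$ has i.i.d. standard Gaussian entries, $\Pi_{\mathcal S}$ is the Euclidean projection onto $\mathcal S$, and $\mathcal G(\mathcal S):=\mathbb E\sup_{X\in\mathcal S}\langle \mathbf G, X\rangle$. The paper's own guarantee may be stated with different constants or norms.

```latex
\begin{align*}
\hat{\mathbf X} &:= \Pi_{\mathcal S}(A+\mathbf W), \qquad A\in\mathcal S,\\
\langle A+\mathbf W-\hat{\mathbf X},\; A-\hat{\mathbf X}\rangle &\le 0
  && \text{(optimality of the projection, tested at } A\in\mathcal S\text{)}\\
\Longrightarrow\quad \|\hat{\mathbf X}-A\|^2 &\le \langle \mathbf W,\; \hat{\mathbf X}-A\rangle
  \le \sup_{X\in\mathcal S}\langle \mathbf W, X\rangle-\langle \mathbf W, A\rangle,\\
\mathbb E\,\|\hat{\mathbf X}-A\|^2 &\le \mathbb E\sup_{X\in\mathcal S}\langle \mathbf W, X\rangle
  = \sigma\cdot\mathcal G(\mathcal S)
  && \text{(since } \mathbb E\langle \mathbf W, A\rangle = 0\text{)}.
\end{align*}
```

Under these assumptions, a smaller admissible set $\mathcal S$ (in the sense of Gaussian complexity) gives a smaller expected squared error for the same noise level $\sigma$.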

Abstract

We revisit the input perturbation framework for differential privacy, where noise is added to the input $A\in \mathcal{S}$ and the result is then projected back to the space of admissible datasets $\mathcal{S}$. Through this framework, we first design novel efficient algorithms to privately release pair-wise cosine similarities. Second, we derive a novel algorithm to compute $k$-way marginal queries over $n$ features. Prior work could achieve comparable guarantees only for even $k$. Furthermore, we extend our results to $t$-sparse datasets, where our efficient algorithms yield novel, stronger guarantees whenever $t\le n^{5/6}/\log n$. Finally, we provide a theoretical perspective on why fast input perturbation algorithms work well in practice. The key technical ingredients behind our results are tight sum-of-squares certificates upper bounding the Gaussian complexity of sets of solutions.
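To make the query class concrete, here is a minimal, non-private sketch of $k$-way marginals on a binary dataset. It assumes one common definition (the fraction of rows in which all $k$ selected features equal 1) and assumes that "$t$-sparse" means every row has at most $t$ nonzero entries; the paper's exact definitions may differ. The names `k_way_marginal`, `all_k_way_marginals`, and `is_t_sparse` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations


def k_way_marginal(dataset, features):
    """Fraction of rows in which every selected binary feature equals 1."""
    if not dataset:
        raise ValueError("dataset must be non-empty")
    hits = sum(all(row[j] == 1 for j in features) for row in dataset)
    return hits / len(dataset)


def all_k_way_marginals(dataset, k):
    """Map each k-subset of feature indices to its (exact, non-private) marginal."""
    n = len(dataset[0])
    return {S: k_way_marginal(dataset, S) for S in combinations(range(n), k)}


def is_t_sparse(dataset, t):
    """Assumed sparsity notion: each row has at most t nonzero entries."""
    return all(sum(1 for x in row if x != 0) <= t for row in dataset)


# Toy dataset: 4 rows (records), n = 4 binary features.
toy = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]
marginals = all_k_way_marginals(toy, 2)
```

A private algorithm would release a noisy approximation of the values in `marginals`; this sketch computes only the exact quantities being approximated.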

Paper Structure (32 sections, 12 theorems, 57 equations, 4 algorithms)

Introduction
Our Contribution
Results
Privately releasing pair-wise distances
K-way marginals
Related work
Similarity and distance approximation
k-way marginals
Sum-of-squares-based algorithms
Vectors, matrices, tensors
Sparse vectors and norm
Sets and projections
Techniques
The perturb-and-project framework
A simple application: pair-wise cosine similarities
...and 17 more sections

Key Result

Theorem 2.1

Let $V:=\left\{v_1,\ldots,v_n\right\}\subseteq \mathbb R^m$ be a set of unit vectors. There exists an $(\varepsilon,\delta)$-differentially private algorithm that, on input $V$, returns a matrix $\hat{\mathbf X}\in \mathbb R^{n\times n}$ satisfying [...]. Moreover, the algorithm runs in polynomial time. (Boldface denotes random variables.)
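A release of this kind can be illustrated with the perturb-and-project recipe: add Gaussian noise to the Gram matrix of cosine similarities, then move the result back toward the admissible set. The sketch below assumes the admissible set is the symmetric positive semidefinite matrices with unit diagonal and uses plain alternating projections between the PSD cone and the unit-diagonal constraint. The noise scale `sigma` is a free parameter here and is not calibrated to any $(\varepsilon,\delta)$ guarantee; the function names are hypothetical, and this is not claimed to be the paper's exact algorithm.

```python
import numpy as np


def project_psd(M):
    """Project onto symmetric PSD matrices: symmetrize, then clip negative eigenvalues."""
    M = (M + M.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(M)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def project_unit_diagonal(M):
    """Project onto matrices with all-ones diagonal: overwrite the diagonal."""
    out = M.copy()
    np.fill_diagonal(out, 1.0)
    return out


def perturb(gram, sigma, rng):
    """Add i.i.d. Gaussian noise to every entry (sigma is NOT privacy-calibrated here)."""
    return gram + rng.normal(scale=sigma, size=gram.shape)


def alternately_project(Y, iters=50):
    """Alternate between the two projections for a fixed number of rounds."""
    X = Y
    for _ in range(iters):
        X = project_unit_diagonal(project_psd(X))
    return X


# Toy input: 6 unit vectors in R^3 and their Gram matrix of cosine similarities.
rng = np.random.default_rng(0)
V = rng.normal(size=(6, 3))
V /= np.linalg.norm(V, axis=1, keepdims=True)
gram = V @ V.T

noisy = perturb(gram, sigma=0.5, rng=rng)
estimate = alternately_project(noisy)

err_noisy = np.linalg.norm(noisy - gram)    # Frobenius error before projecting
err_proj = np.linalg.norm(estimate - gram)  # Frobenius error after projecting
```

Because each projection is onto a convex set that contains the true Gram matrix, no step can increase the Frobenius distance to it, so `err_proj` is at most `err_noisy`. Plain alternating projections reach a point near the intersection but not necessarily the nearest one; an exact projection would need a method such as Dykstra's algorithm.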

Theorems & Definitions (28)

Theorem 2.1
Theorem 2.2: K-way marginals, informal
Theorem 2.3: K-way marginals for sparse-datasets, informal
Remark 4.1: On the error guarantees of the cosine-similarities release theorem
Theorem 5.1: Guarantees of perturb-and-project
Lemma 5.2: Stability of projections
proof
Corollary 5.3: Guarantees of the perturb-and-alternately-project
Theorem 5.4: Linear convergence of alternating projections [bauschke1993convergence]
Proof of Corollary 5.3
...and 18 more
