Perturb-and-Project: Differentially Private Similarities and Marginals

Vincent Cohen-Addad, Tommaso d'Orsi, Alessandro Epasto, Vahab Mirrokni, Peilin Zhong

TL;DR

Perturb-and-Project produces differentially private outputs by adding noise to the input and projecting the result onto the set of admissible datasets. The error is governed by the Gaussian complexity of that set, which the authors bound using sum-of-squares (SOS) certificates. The authors provide new DP algorithms for privately releasing pairwise cosine similarities and for computing $k$-way marginals, with stronger guarantees for $t$-sparse datasets, and they show how alternating projections yield practical, scalable implementations. The utility guarantees rest on tight SOS-based analyses of injective tensor norms and Gaussian complexity, tying the achievable error to the complexity of the target set. The framework applies to nearest-neighbor search, contingency-table analysis, and synthetic-data tasks, and it offers a theoretical account of when fast input-perturbation methods perform well in practice.

Abstract

We revisit the input perturbation framework for differential privacy, where noise is added to the input $A\in \mathcal{S}$ and the result is then projected back to the space of admissible datasets $\mathcal{S}$. Through this framework, we first design novel efficient algorithms to privately release pairwise cosine similarities. Second, we derive a novel algorithm to compute $k$-way marginal queries over $n$ features. Prior work could achieve comparable guarantees only for even $k$. Furthermore, we extend our results to $t$-sparse datasets, where our efficient algorithms yield novel, stronger guarantees whenever $t\le n^{5/6}/\log n$. Finally, we provide a theoretical perspective on why fast input perturbation algorithms work well in practice. The key technical ingredients behind our results are tight sum-of-squares certificates upper bounding the Gaussian complexity of sets of solutions.
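
The perturb-then-project recipe described above can be illustrated for pairwise cosine similarities. The following Python sketch is a minimal illustration under stated assumptions, not the paper's algorithm. The function names (`perturb_and_project`, `project_psd`, `project_unit_diagonal`) and the parameter `sigma` are hypothetical, and `sigma` is not calibrated to any $(\varepsilon,\delta)$ privacy budget. The admissible set is assumed to be the positive semidefinite matrices with unit diagonal, a convex set that contains every Gram matrix of unit vectors. The projection is approximated by plain alternating projections, which approaches a point in the intersection but need not return the exact nearest point.

```python
import numpy as np


def project_psd(X):
    """Frobenius-norm projection onto the PSD cone."""
    X = (X + X.T) / 2.0                       # symmetrize first
    w, U = np.linalg.eigh(X)                  # eigen-decomposition of a symmetric matrix
    return (U * np.clip(w, 0.0, None)) @ U.T  # drop the negative eigenvalues


def project_unit_diagonal(X):
    """Frobenius-norm projection onto the affine set {X : X_ii = 1}."""
    Y = X.copy()
    np.fill_diagonal(Y, 1.0)
    return Y


def perturb_and_project(V, sigma, iters=50, seed=None):
    """Noisy Gram matrix of the rows of V, pushed back toward the admissible set.

    ASSUMPTION: `sigma` is a free parameter here; a real deployment must
    derive it from the privacy budget and the sensitivity of V @ V.T.
    """
    rng = np.random.default_rng(seed)
    n = V.shape[0]
    gram = V @ V.T                            # exact cosine similarities (rows are unit vectors)

    # Perturb: add symmetric Gaussian noise to the input.
    Z = rng.normal(scale=sigma, size=(n, n))
    Z = np.triu(Z) + np.triu(Z, 1).T
    X = gram + Z

    # Project (approximately): alternate between the two convex sets.
    for _ in range(iters):
        X = project_unit_diagonal(project_psd(X))
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V = rng.normal(size=(8, 3))
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # normalize rows to unit length
    X_hat = perturb_and_project(V, sigma=0.1, seed=1)
    print(np.linalg.norm(X_hat - V @ V.T))          # Frobenius error of the release
```

Splitting the admissible set into two simple convex sets keeps each projection cheap: one eigendecomposition and one diagonal overwrite per round. With `sigma=0` the input is already admissible, so the routine returns the exact Gram matrix up to floating-point error.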

Paper Structure (32 sections, 12 theorems, 57 equations, 4 algorithms)

Key Result

Theorem 2.1

Let $V:=\left\{v_1,\ldots,v_n\right\}\subseteq \mathbb R^m$ be a set of unit vectors. There exists an $(\varepsilon,\delta)$-differentially private algorithm that, on input $V$, returns a matrix $\hat{\mathbf X}\in \mathbb R^{n\times n}$ satisfying [...] (we use boldface to denote random variables). Moreover, the algorithm runs in polynomial time.

Theorems & Definitions (28)

  • Theorem 2.1
  • Theorem 2.2: $k$-way marginals, informal
  • Theorem 2.3: $k$-way marginals for sparse datasets, informal
  • Remark 4.1: On the error guarantees of Theorem 2.1
  • Theorem 5.1: Guarantees of perturb-and-project
  • Lemma 5.2: Stability of projections
  • proof
  • Corollary 5.3: Guarantees of the perturb-and-alternately-project
  • Theorem 5.4: Linear convergence of alternating projections [bauschke1993convergence]
  • proof: Proof of Corollary 5.3
  • ...and 18 more