How Well Can Differential Privacy Be Audited in One Run?

Amit Keinan, Moshe Shenfeld, Katrina Ligett

TL;DR

The paper analyzes the limits of one-run auditing (ORA) for differential privacy, identifying three fundamental gaps that hinder ORA from precisely recovering the true privacy parameter $\varepsilon$. It develops a formal efficacy framework using relaxations like distributional DP (DDP), AC-DDP, and AE-AC-DDP, and shows how abstentions and adaptivity (Adaptive ORA, or AORA) can partially mitigate interference between elements. The authors prove conditions for ORA's asymptotic tightness, characterize when local algorithms are amenable to tight auditing, and provide a DP-SGD case study with theoretical and empirical insights. They also introduce adaptive strategies and multi-element per-coordinate designs that improve auditing efficacy, highlighting practical implications for auditing real-world privacy-preserving algorithms.

Abstract

Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that the key barrier to its efficacy is interference between the observable effects of different data elements. We present new conceptual approaches to minimize this barrier, towards improving the performance of one-run auditing of real machine learning algorithms.

Paper Structure

This paper contains 46 sections, 46 theorems, 101 equations, 9 figures, 3 algorithms.

Key Result

Lemma 3.3

ORA is asymptotically tight for a randomized algorithm $M: X^* \rightarrow \mathcal{O}$ if and only if there exists a sequence of adversary strategies $\{(Z_n, G_n)\}_{n \in \mathbb{N}}$ with unlimited guesses such that $E_{M, Z, G, n} \xrightarrow{n \to \infty} p\left(\varepsilon(M)\right)$.
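
To make the convergence in Lemma 3.3 concrete, here is a minimal simulation sketch, not the authors' code. It rests on assumptions this page does not state: that $p(\varepsilon) = e^{\varepsilon} / (1 + e^{\varepsilon})$, the usual bound on guessing accuracy under $\varepsilon$-DP, and that the audited algorithm is a randomized-response-style local mechanism that releases each element's inclusion bit unchanged with probability $p(\varepsilon)$ and flipped otherwise. The names `p`, `simulate_efficacy`, and `implied_epsilon` are hypothetical.

```python
import math
import random


def p(eps: float) -> float:
    """Assumed form of p: e^eps / (1 + e^eps)."""
    return math.exp(eps) / (1.0 + math.exp(eps))


def simulate_efficacy(eps: float, n: int, seed: int = 0) -> float:
    """Fraction of correct guesses over n elements in a single simulated run."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        # Each element is independently included (+1) or excluded (-1).
        secret = rng.choice((-1, 1))
        # Assumed mechanism: release the bit unchanged with probability p(eps).
        output = secret if rng.random() < p(eps) else -secret
        # The adversary guesses every element (no abstentions) and
        # simply trusts the released bit.
        guess = output
        correct += guess == secret
    return correct / n


def implied_epsilon(accuracy: float) -> float:
    """Invert p: the epsilon at which p(eps) equals the given accuracy."""
    return math.log(accuracy / (1.0 - accuracy))


if __name__ == "__main__":
    eps = 1.0
    for n in (100, 10_000, 1_000_000):
        acc = simulate_efficacy(eps, n)
        print(n, round(acc, 4), round(implied_epsilon(acc), 4))
```

Under these assumptions the elements do not interfere with one another, so the empirical accuracy approaches `p(eps)` as `n` grows. The value returned by `implied_epsilon` is a plain point estimate; it omits the statistical correction that a real audit applies to obtain a valid lower bound.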

Figures (9)

  • Figure 1: Effect of the fraction of taken guesses $\frac{k}{n}$ on ORA's results for $n = 5000$ elements. Taking only the best guesses increases the privacy estimates, but the statistically corrected bounds exhibit a tradeoff.
  • Figure 2: Effect of the number of elements per coordinate $\frac{n}{d}$ on ORA's results when making $k = 100$ guesses. The increased number of elements creates a tradeoff, and it is optimized with more than one element per coordinate.
  • Figure 3: Comparison of the effect of the number of elements per coordinate $\frac{n}{d}$ on the results of ORA and AORA of DP-SGD. AORA outperforms ORA thanks to its resilience to increased interference.
  • Figure 4: Bounds of ORA without abstentions of Local Laplace for multiple values of $\varepsilon$.
  • Figure 5: Effect of the fraction of taken guesses $\frac{k}{n}$ on ORA's empirical efficacy for $n = 5000$ elements.
  • ...and 4 more figures

Theorems & Definitions (103)

  • Definition 2.1: Differential Privacy (DP) dwork2006calibrating
  • Definition 3.1: ORA Asymptotic Tightness
  • Definition 3.2: ORA Efficacy
  • Lemma 3.3: ORA Asymptotic Tightness and Efficacy
  • Definition 4.1: Local Algorithm
  • Definition 4.2: Local Randomized Response (LRR) warner1965randomized
  • Theorem 5.1: Optimal Efficacy Without Abstentions
  • Theorem 5.2: Optimal Efficacy
  • Theorem 5.3: Condition for Asymptotic Tightness of ORA
  • Corollary 5.4: ORA is Asymptotically Tight for Local Algorithms
  • ...and 93 more