Privacy Amplification for Matrix Mechanisms

Christopher A. Choquette-Choo, Arun Ganesh, Thomas Steinke, Abhradeep Thakurta

TL;DR

The paper addresses tightening differential privacy guarantees for matrix mechanisms used in DP-FTRL by introducing MMCC, a generic framework that combines privacy loss distribution accounting with conditional composition to handle correlated noise. It proves near-tight amplified guarantees as $\varepsilon \to 0$ and shows how conditioning on prior outputs enables independent-noise-like analysis for correlated queries. The authors extend the approach to shuffling and demonstrate substantial empirical privacy-utility improvements on DP-FTRL and related continual-counting tasks, including CIFAR-10 experiments. The work provides a practical algorithm and tooling for computing amplified DP guarantees and reveals that correlated-noise amplification can surpass traditional independent-noise baselines in real-world tasks, offering a path to tighter budgets in complex matrix-mechanism-based DP.

Abstract

Privacy amplification exploits randomness in data selection to provide tighter differential privacy (DP) guarantees. This analysis is key to DP-SGD's success in machine learning, but is not readily applicable to the newer state-of-the-art algorithms. This is because these algorithms, known as DP-FTRL, use the matrix mechanism to add correlated noise instead of independent noise as in DP-SGD. In this paper, we propose "MMCC", the first algorithm to analyze privacy amplification via sampling for any generic matrix mechanism. MMCC is nearly tight in that it approaches a lower bound as $\varepsilon \to 0$. To analyze correlated outputs in MMCC, we prove that they can be analyzed as if they were independent, by conditioning them on prior outputs. Our "conditional composition theorem" has broad utility: we use it to show that the noise added to binary-tree-DP-FTRL can asymptotically match the noise added to DP-SGD with amplification. Our amplification algorithm also has practical empirical utility: we show it leads to significant improvement in the privacy-utility trade-offs for DP-FTRL algorithms on standard benchmarks.
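
To make the contrast between independent and correlated noise concrete, here is a minimal NumPy sketch of a matrix mechanism for prefix sums. It is an illustration under stated assumptions, not the paper's MMCC algorithm: the helper names (`prefix_sum_matrix`, `sqrt_factor`, `matrix_mechanism`, `per_step_noise_cov`) are hypothetical, and the square-root factorization $\mathbf{A} = \mathbf{M}\mathbf{M}$ is just one convenient choice of correlated factorization. The mechanism releases $\mathbf{B}(\mathbf{C}\mathbf{x} + \mathbf{z})$ with $\mathbf{A} = \mathbf{B}\mathbf{C}$ and Gaussian $\mathbf{z}$; choosing $\mathbf{C} = \mathbf{I}$ gives independent per-step noise, while a non-trivial $\mathbf{C}$ correlates the noise across steps.

```python
import numpy as np

def prefix_sum_matrix(n):
    """Lower-triangular all-ones matrix A, so (A @ x)[t] = x[0] + ... + x[t]."""
    return np.tril(np.ones((n, n)))

def sqrt_factor(n):
    """Lower-triangular Toeplitz matrix M with M @ M = prefix_sum_matrix(n).

    The entries are the power-series coefficients of (1 - x)^(-1/2).
    Illustrative choice of factorization only (an assumption of this sketch).
    """
    f = np.ones(n)
    for k in range(1, n):
        f[k] = f[k - 1] * (2 * k - 1) / (2 * k)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            M[i, j] = f[i - j]
    return M

def matrix_mechanism(x, B, C, sigma, rng):
    """Release B @ (C @ x + z) = A @ x + B @ z, with z ~ N(0, sigma^2 I)."""
    z = rng.normal(0.0, sigma, size=C.shape[0])
    return B @ (C @ x + z)

def per_step_noise_cov(C, sigma):
    """Covariance of C^{-1} z, i.e. the noise as seen by the individual inputs."""
    C_inv = np.linalg.inv(C)
    return sigma**2 * C_inv @ C_inv.T

n = 8
A = prefix_sum_matrix(n)
M = sqrt_factor(n)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=n)

# Independent noise: C = I, B = A (fresh noise on every step).
noisy_independent = matrix_mechanism(x, A, np.eye(n), sigma=1.0, rng=rng)
# Correlated noise: B = C = M, so the per-step noise C^{-1} z is correlated.
noisy_correlated = matrix_mechanism(x, M, M, sigma=1.0, rng=rng)

cov_independent = per_step_noise_cov(np.eye(n), 1.0)  # diagonal
cov_correlated = per_step_noise_cov(M, 1.0)           # non-zero off-diagonals
```

The off-diagonal entries of `cov_correlated` are what prevent the standard independent-noise amplification analysis from applying directly.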

Paper Structure

This paper contains 28 sections, 15 theorems, 69 equations, 8 figures, 4 algorithms.

Key Result

Lemma 2.3

Let $\mathcal{M}_1, \ldots, \mathcal{M}_k$ be an adaptive sequence of mechanisms, i.e., each mechanism receives the outputs of all previous mechanisms and the database. Suppose for all $i$ and joint outputs $x$ of $\mathcal{M}_1, \ldots, \mathcal{M}_{i-1}$, the PLD of $\mathcal{M}_i(x, D)$ and ...

Figures (8)

  • Figure 1: Matrix Mechanism Conditional Composition algorithm, MMCC$(\mathbf{C}, p, \sigma, \delta_1, \delta_2)$
  • Figure 2: ProbabilityTailBounds$(\mathbf{C}, p, \sigma, \delta_1)$
  • Figure 3: Multiplicative improvement of our amplification analysis (roughly) matches $\sqrt{\log(n) + 1}$. A higher ratio ($>1$) indicates amplification is better. We plot $n = 2^i, i \in \{1, 2, \ldots, 10\}$ with $\sigma = c \sqrt{\log(n) + 1}$ so $\varepsilon$ is fixed for unamplified single-participation. $\delta = 10^{-6}$.
  • Figure 4: Plot of multiplicative improvement in $\varepsilon$ for the optimal continual counting matrix mechanism as a function of $\sqrt{\log(n) + 1} \approx \left\|\mathbf{C} \mathbf{e}_1\right\|_2$. We plot $n = 2^i, i \in \{1, 2, \ldots, 7\}$. We use $\sigma = c \left\|\mathbf{C} \mathbf{e}_i\right\|_2$, so the $\varepsilon$ value in the unamplified single-participation setting is fixed. All $\varepsilon$ are for $\delta = 10^{-6}$.
  • Figure 5: Our amplification analysis leads to significant gains over Kairouz et al. (2021) on practical ML experiments (CIFAR-10), entirely post-hoc.
  • ...and 3 more figures
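
The $\sqrt{\log(n) + 1}$ scaling in the captions above can be checked numerically for the binary-tree mechanism. The sketch below assumes that the logarithm is base 2 and that the relevant encoder matrix $\mathbf{C}$ has one row per tree node and one column per leaf; the helper name `binary_tree_matrix` is hypothetical. Under those assumptions each leaf lies under exactly $\log_2(n) + 1$ nodes, so every column of $\mathbf{C}$ has $\ell_2$ norm $\sqrt{\log_2(n) + 1}$.

```python
import numpy as np

def binary_tree_matrix(n):
    """Rows index tree nodes, columns index leaves; entry is 1 if the leaf is under the node."""
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    rows = []
    width = 1
    while width <= n:
        # One row for each node covering `width` consecutive leaves.
        for start in range(0, n, width):
            row = np.zeros(n)
            row[start:start + width] = 1.0
            rows.append(row)
        width *= 2
    return np.vstack(rows)

# Compare the first column norm against sqrt(log2(n) + 1) for n = 2^1, ..., 2^10.
col_norms = {}
for i in range(1, 11):
    n = 2 ** i
    C = binary_tree_matrix(n)
    col_norms[n] = np.linalg.norm(C[:, 0])
```

Setting $\sigma$ proportional to this column norm keeps the unamplified single-participation $\varepsilon$ fixed as $n$ grows, which is the normalization the captions describe.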

Theorems & Definitions (30)

  • Definition 2.1
  • Definition 2.2: Definition 7 in CharacteristicAccounting
  • Lemma 2.3: Theorem 10 in CharacteristicAccounting
  • Lemma 2.4: Lemma 29 in CharacteristicAccounting
  • Theorem 3.1
  • proof
  • Definition 4.1
  • Definition 4.2
  • Lemma 4.3
  • proof
  • ...and 20 more