From Mean to Extreme: Formal Differential Privacy Bounds on the Success of Real-World Data Reconstruction Attacks

Anneliese Riess, Kristian Schwethelm, Johannes Kaiser, Tamara T. Mueller, Julia A. Schnabel, Daniel Rueckert, Alexander Ziller

TL;DR

The paper translates differential privacy budgets into concrete reconstruction risk by deriving formal bounds specific to Analytic Gradient Inversion Attacks (AGIAs) mounted by from-scratch adversaries. It formalizes the optimal AGIA as a mean estimation problem, shows that the optimal reconstruction is distributed as $\hat{X} \sim \mathcal{N}(X, \sigma^2 \|X\|_2^2 I_N)$, and derives closed-form probabilistic bounds on reconstruction quality, measured by $\mathrm{MSE}$ and $\mathrm{PSNR}$, in terms of regularized gamma functions. The result is a reconstruction-robustness framework with a risk corridor between identification-based worst-case bounds and no-prior-knowledge worst-case bounds, enabling principled DP budget calibration for real-world models. Empirically, the bounds are tight across simple and large architectures (e.g., ResNet-101) on CIFAR-10-derived data, which supports their practical relevance for context-aware privacy decisions and more nuanced utility trade-offs.
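To make the link between the Gaussian reconstruction and the regularized gamma function concrete, the following minimal sketch assumes only the distribution stated above, $\hat{X} \sim \mathcal{N}(X, \sigma^2 \|X\|_2^2 I_N)$. Under that assumption $\mathrm{MSE} = \frac{1}{N}\|\hat{X}-X\|_2^2$ is a scaled chi-square variable with $N$ degrees of freedom, so $P(\mathrm{MSE} \le \eta)$ equals the regularized lower incomplete gamma function $P\!\left(\tfrac{N}{2}, \tfrac{N\eta}{2\sigma^2\|X\|_2^2}\right)$. The function names, the `peak` parameter for PSNR, and the Monte Carlo check are illustrative assumptions; the paper's theorems may be stated in terms of additional quantities (such as $\Delta$, $\kappa$, $C$, and $M$) that are not modeled here.

```python
import math
import random


def reg_lower_gamma(a: float, x: float, tol: float = 1e-12, max_terms: int = 10_000) -> float:
    """Regularized lower incomplete gamma P(a, x), computed from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a  # k = 0 term of sum_k x^k / (a (a+1) ... (a+k))
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < tol * total:
            break
    # P(a, x) = x^a e^{-x} / Gamma(a) * series
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def prob_mse_at_most(eta: float, sigma: float, x_norm: float, n: int) -> float:
    """P(MSE <= eta), assuming X_hat ~ N(X, (sigma * x_norm)^2 I_n)."""
    s2 = (sigma * x_norm) ** 2
    return reg_lower_gamma(n / 2.0, n * eta / (2.0 * s2))


def prob_psnr_at_least(psnr_db: float, sigma: float, x_norm: float, n: int,
                       peak: float = 1.0) -> float:
    """P(PSNR >= psnr_db), using PSNR = 10 log10(peak^2 / MSE); `peak` is an assumption."""
    eta = peak ** 2 * 10.0 ** (-psnr_db / 10.0)
    return prob_mse_at_most(eta, sigma, x_norm, n)


def simulate_mse_prob(eta: float, sigma: float, x_norm: float, n: int,
                      trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(MSE <= eta) under the same Gaussian assumption."""
    rng = random.Random(seed)
    s = sigma * x_norm
    hits = 0
    for _ in range(trials):
        mse = sum((s * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n)) / n
        hits += mse <= eta
    return hits / trials


if __name__ == "__main__":
    closed_form = prob_mse_at_most(eta=1.0, sigma=1.0, x_norm=1.0, n=8)
    empirical = simulate_mse_prob(eta=1.0, sigma=1.0, x_norm=1.0, n=8)
    print(f"closed form: {closed_form:.4f}, Monte Carlo: {empirical:.4f}")
```

The series form is used only to keep the sketch free of third-party dependencies; a library routine for the regularized incomplete gamma function would serve the same purpose.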

Abstract

The gold standard for privacy in machine learning, Differential Privacy (DP), is often interpreted through its guarantees against membership inference. However, translating DP budgets into quantitative protection against the more damaging threat of data reconstruction remains a challenging open problem. Existing theoretical analyses of reconstruction risk are typically based on an "identification" threat model, where an adversary with a candidate set seeks a perfect match. When applied to the realistic threat of "from-scratch" attacks, these bounds can lead to an inefficient privacy-utility trade-off. This paper bridges this critical gap by deriving the first formal privacy bounds tailored to the mechanics of demonstrated Analytic Gradient Inversion Attacks (AGIAs). We first formalize the optimal from-scratch attack strategy for an adversary with no prior knowledge, showing it reduces to a mean estimation problem. We then derive closed-form, probabilistic bounds on this adversary's success, measured by Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR). Our empirical evaluation confirms these bounds remain tight even when the attack is concealed within large, complex network architectures. Our work provides a crucial second anchor for risk assessment. By establishing a tight, worst-case bound for the from-scratch threat model, we enable practitioners to assess a "risk corridor" bounded by the identification-based worst case on one side and our from-scratch worst case on the other. This allows for a more holistic, context-aware judgment of privacy risk, empowering practitioners to move beyond abstract budgets toward a principled reasoning framework for calibrating the privacy of their models.

Paper Structure

This paper contains 27 sections, 21 theorems, 84 equations, 5 figures.

Key Result

Theorem 4.1

If the part of the neural network given by $g$ is replaced by the loss function $\mathcal{L}:\mathbb{R}^N\times\mathbb{R}^M\to\mathbb{R}$ with $\mathcal{L}(X,f(X))= \textbf{1}_M^Tf(X)$, where $\textbf{1}_M$ is the $M$-dimensional 1-vector, and where $\lceil \cdot \rceil:\mathbb{R} \to \mathbb{N}$ denotes the function that rounds up its argument to the nearest integer, then $\frac{1}{M^2}\sum_{j=1

Figures (5)

  • Figure 1: A comparison of the privacy cost required to defend against different reconstruction attack models. The plot shows the amount of protective noise ($\sigma$, y-axis) needed to hold the risk of successful reconstruction ($\gamma$) constant at $10\%$, as a function of the allowable reconstruction error ($\eta$, x-axis). Lower noise is better, as it leads to more accurate models. Previous identification-based bounds hayes2023bounding, kaissis2023optimal (markers) prescribe a single, fixed level of privacy noise for the case of perfect reconstruction ($\eta=0$). Our bound for a "from-scratch" attacker (blue line) instead introduces a continuous, two-parameter trade-off between the allowed error and the necessary noise. Practitioners can explicitly set the threshold $\eta$ for what constitutes an "acceptable" reconstruction and calibrate the noise accordingly, enabling nuanced decisions based on actual privacy requirements and utility goals. The ability to set $\eta$ gives practitioners both more choice and more responsibility to precisely define and defend their desired level of privacy. (Parameters: Reconstruction metric is $\mathrm{MSE}$, $N=1$, $\Delta=1$, $\kappa=1/11$.)
  • Figure 2: Demonstration of an adversary's ability to manipulate the privacy mechanism by engineering the model architecture. The columns show reconstructions under fixed privacy parameters ($\sigma$, $C$) but different adversarial choices of $M$ in the model's architecture. Middle: When the adversary chooses a small $M$ such that $M < C / \Vert X \Vert_2$, gradient clipping is triggered on the full gradient, and the reconstruction is obscured by strong additive noise. Right: By choosing a large $M$ such that $M \geq C / \Vert X \Vert_2$, the adversary forces the DP mechanism to calibrate noise to a much smaller sensitivity. This results in a significantly clearer reconstruction for the same privacy budget. (Parameters: $\sigma=5\cdot10^{-4}$, $C=5.0\cdot10^{3}$. Middle column uses $M=1$; right column uses $M=1000$. Images from CelebA liu2015faceattributes and iNaturalist inaturalist (under a CC0 license).)
  • Figure 3: A comparison of the risk surfaces for from-scratch reconstruction (our bound) versus identification-based reconstruction (prior work). The axes represent the noise multiplier ($\sigma$), the reconstruction error threshold ($\eta$), and the attack success probability ($\gamma$). The bounds from hayes2023bounding and kaissis2023optimal are lines restricted to the $\eta=0$ plane, as they only model the risk of perfect reconstruction (identification). In contrast, our bound (blue surface) characterizes the risk for any error threshold $\eta > 0$, reflecting the nature of from-scratch approximation. The dashed lines show the projection of the prior bounds onto our risk surface. This projection visualizes the "error discount" an adversary gains from possessing a candidate set: for a fixed success probability $\gamma$, the corresponding point on our surface reveals the reconstruction error $\eta$ a from-scratch attacker would have to tolerate. (Parameters: $\mathrm{MSE}$ metric, $N=1$, $\Delta=1$, $\kappa=1/10$.)
  • Figure 4: Empirical reconstruction quality vs. our theoretical probabilistic bounds. Each subplot shows the reconstruction error (y-axis) as a function of the noise multiplier $\sigma$ (x-axis). The background color map illustrates the Probability Density Function (PDF) predicted by our theory (darker is higher probability), while the overlaid box plots show the distribution of empirical results from 500 samples. The columns compare two metrics (MSE and PSNR). The rows represent increasingly complex and realistic attack scenarios: the top row shows the ideal case matching our theoretical optimal attack; the middle row adds a non-contributory 1M-parameter linear layer; and the bottom row simulates a practical attack embedded within a full ResNet-101 architecture ($\approx$45M parameters). The close alignment between the empirical distributions and the theoretical PDFs across all scenarios validates the tightness of our bounds, even in realistic, high-parameter settings. (Fixed parameters: $C=1$, $M=1$.)
  • Figure 5: Comparing reconstruction error distributions to empirical results for high-dimensional data. Setup equivalent to Figure 4, but with $N=3\,072$.
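
The noise-versus-error trade-off described in the Figure 1 caption can be illustrated with a simplified calculation. The sketch below is not the paper's bound: it assumes only a scalar reconstruction ($N=1$) with Gaussian error of standard deviation $s$ (an effective noise level standing in for $\sigma \|X\|_2$), and it ignores the $\Delta$ and $\kappa$ parameters listed in the caption. Under that assumption, $P(\mathrm{MSE} \le \eta) = \operatorname{erf}\!\left(\sqrt{\eta/2}\,/\,s\right)$, which can be inverted in closed form to obtain the noise level that holds the success probability at $\gamma$. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def success_probability(eta: float, s: float) -> float:
    """P(MSE <= eta) for N = 1, assuming X_hat - X ~ N(0, s^2)."""
    return math.erf(math.sqrt(eta / 2.0) / s)


def required_std(eta: float, gamma: float) -> float:
    """Smallest effective noise std s such that P(MSE <= eta) <= gamma (N = 1)."""
    if eta <= 0.0 or not 0.0 < gamma < 1.0:
        raise ValueError("need eta > 0 and 0 < gamma < 1")
    # P(|Z| <= c) = gamma  <=>  c = Phi^{-1}((1 + gamma) / 2), with c = sqrt(eta) / s
    quantile = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    return math.sqrt(eta) / quantile


if __name__ == "__main__":
    gamma = 0.10  # success probability held constant, as in the caption
    for eta in (0.01, 0.1, 1.0):
        s = required_std(eta, gamma)
        print(f"eta={eta:<5} required s={s:.4f} check={success_probability(eta, s):.4f}")
```

In this simplified setting the required noise grows with $\sqrt{\eta}$: tolerating a looser definition of "successful" reconstruction demands more noise to keep the same success probability.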

Theorems & Definitions (48)

  • Definition 2.1: $(\varepsilon, \delta)$-Differential Privacy
  • Definition 2.2: Gaussian mechanism
  • Definition 2.3: $\ell_2$-Sensitivity
  • Definition 2.4: Definition 2 in balle2022reconstructing
  • Theorem 4.1
  • proof
  • Corollary 4.2
  • proof
  • Corollary 4.2
  • proof
  • ...and 38 more