Visual Privacy Auditing with Diffusion Models

Kristian Schwethelm, Johannes Kaiser, Moritz Knolle, Sarah Lockfisch, Daniel Rueckert, Alexander Ziller

TL;DR

This work addresses the gap between worst-case differential privacy guarantees and practical reconstruction risk by introducing a diffusion-model (DM) based attack that leverages realistic image priors. The authors show that real-world priors can substantially increase reconstruction success under DP-SGD, challenging existing bounds like $(0,\gamma)$-ReRo and the Ziller bounds that assume limited or no priors. They propose using diffusion models as a visual auditing tool to characterize and communicate privacy leakage to non-technical stakeholders, while also establishing a practical reconstruction pipeline that starts from privatized observations and refines the attack with DM post-processing under a DDIM-style data-consistent process. The findings highlight a notable dependence of reconstruction quality on data priors and distribution shift, suggesting that current theoretical guarantees may over- or under-estimate risk in realistic settings and motivating the development of priors-aware privacy metrics. Overall, the work contributes both a powerful auditing approach and important empirical insights for guiding DP parameter choices and defenses in vision applications.

Abstract

Data reconstruction attacks on machine learning models pose a substantial threat to privacy, potentially leaking sensitive information. Although defending against such attacks using differential privacy (DP) provides theoretical guarantees, determining appropriate DP parameters remains challenging. Current formal guarantees on the success of data reconstruction suffer from overly stringent assumptions regarding adversary knowledge about the target data, particularly in the image domain, raising questions about their real-world applicability. In this work, we empirically investigate this discrepancy by introducing a reconstruction attack based on diffusion models (DMs) that only assumes adversary access to real-world image priors and specifically targets the DP defense. We find that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as heuristic auditing tools for visualizing privacy leakage.

Paper Structure (39 sections, 12 equations, 15 figures, 1 algorithm)

Figures (15)

  • Figure 1: (1) Our reconstruction attack first extracts a noisy image from a DP algorithm with privacy guarantee $\varepsilon_n$ using, e.g., gradient inversion on DP-SGD. (2) Then, it employs a DM for reconstruction by initiating its reverse diffusion process from a specific intermediate state ${\bm{x}}_{t_{\varepsilon_n}}$. (3) We demonstrate DMs' strong utility for reconstruction and visual auditing, aiding communication with non-experts. In this example, it is possible to infer that $\varepsilon_1$ offers little privacy protection, allowing accurate reconstruction, while $\varepsilon_2$ safeguards certain details but still allows disclosure of high-level personal attributes.
  • Figure 2: Average similarity of image reconstructions. We compute the similarity between the original and the reconstructed images from our DM attack (blue $\bullet$), the attack of Ziller et al. (2024) (red $\blacksquare$), and the ReRo attack of Hayes et al. (2023) (green $\blacktriangle$). For $\mu < 3$, CelebA-HQ and ImageNet images exceed the maximal noise variance $\sigma_T$ in the schedule; thus, no results can be given. The dashed line represents average similarity between test images, indicating at which point reconstructions become unrelated to the original.
  • Figure 3: Reconstruction results under DP with respect to $\mu=C/\sigma$. For each dataset, the reconstructed image from the base attack without prior knowledge (top) and our DM attack (bottom) are shown. The top images also represent the input of our attack.
  • Figure 4: Reconstruction results under distribution shift. The performance of the DM trained on CIFAR-10 and tested on CIFAR-100 (top), and the performance of the ImageNet DM on CelebA-HQ (middle) and CheXpert (bottom) are shown.
  • Figure 5: Reconstruction success under distribution shift. The performance of the DM trained on CIFAR-10 and tested on CIFAR-100 (blue $\bullet$), and the ImageNet DM tested on CelebA-HQ (green $\blacktriangle$) and CheXpert (red $\blacksquare$) are shown. The dashed line represents average similarity between test images of the datasets (same color). The results show the significant influence of distribution shift between the data prior and the reconstruction target.
  • ...and 10 more figures
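
The captions of Figures 1 and 2 refer to starting the reverse diffusion process at an intermediate state ${\bm{x}}_{t_{\varepsilon_n}}$ and to a maximal noise level $\sigma_T$ in the schedule. The following is a minimal sketch of how such a starting step could be selected. It is not taken from the paper: it assumes a variance-preserving DDPM-style schedule with linearly spaced betas, an observation of the form image plus Gaussian noise with standard deviation `sigma`, and the hypothetical helper names `make_sigma_schedule`, `matching_timestep`, and `to_diffusion_state`. How `sigma` relates to $\mu = C/\sigma$ and to the image scale is left to the caller.

```python
import numpy as np


def make_sigma_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Per-step noise-to-signal ratio of a variance-preserving schedule.

    Assumption: linear beta schedule; the paper's schedule may differ.
    A state x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps corresponds to a
    noisy image x_0 + s_t * eps with s_t = sqrt((1 - a_t) / a_t).
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return np.sqrt((1.0 - alpha_bar) / alpha_bar)


def matching_timestep(sigma, sigmas):
    """Index of the step whose noise level is closest to sigma.

    Returns None when sigma exceeds the largest level in the schedule,
    in which case no starting step exists under these assumptions.
    """
    if sigma > sigmas[-1]:
        return None
    return int(np.argmin(np.abs(sigmas - sigma)))


def to_diffusion_state(noisy_image, sigma):
    """Rescale image + sigma * noise onto the variance-preserving scale."""
    return noisy_image / np.sqrt(1.0 + sigma**2)


if __name__ == "__main__":
    sigmas = make_sigma_schedule()
    rng = np.random.default_rng(0)
    clean = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # stand-in image
    sigma = 0.5                                      # illustrative noise level
    noisy = clean + sigma * rng.standard_normal(clean.shape)
    t_start = matching_timestep(sigma, sigmas)
    x_t = to_diffusion_state(noisy, sigma)
    print("start step:", t_start, "state shape:", x_t.shape)
```

Under these assumptions, a reverse sampler (for example a DDIM-style one) would be run from step `t_start` down to 0 with `x_t` as its initial state, and a return value of `None` corresponds to the situation where the noise exceeds what the schedule covers.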

Theorems & Definitions (2)

  • Definition 1
  • Definition 2