Enhancing Membership Inference Attacks on Diffusion Models from a Frequency-Domain Perspective

Puwei Lian, Yujun Cai, Songze Li, Bingkun Bao

TL;DR

This work shows that existing membership inference attacks (MIAs) on diffusion models overlook a frequency-domain deficiency: high-frequency components are processed with greater variability, which weakens an attack's discriminative power. The authors formalize a general error-based MIA paradigm for diffusion models and show that transforming intermediate outputs into the frequency domain reveals how weak the membership signal is in high-frequency content. They propose a plug-and-play filter that suppresses high-frequency information with a Fourier-domain mask, improving attack performance across datasets and models at negligible time cost, and they provide theoretical support showing that filtering increases the membership advantage. Experiments on DDIM, Stable Diffusion, and several datasets show consistent gains in ASR, AUC, and TPR@1%FPR, while analyses and ablations indicate robustness to hyperparameters and defenses. The work offers a practical way to evaluate privacy risks in diffusion-model deployments and suggests that frequency-aware filtering may apply to attacks beyond those studied.

Abstract

Diffusion models have achieved tremendous success in image generation, but they also raise significant privacy and copyright concerns. Membership Inference Attacks (MIAs) are designed to ascertain whether specific data were used during a model's training phase. As current MIAs for diffusion models typically exploit the model's image prediction ability, we formalize them into a unified general paradigm that computes a membership score for membership identification. Under this paradigm, we empirically find that existing attacks overlook an inherent deficiency in how diffusion models process high-frequency information. As a consequence, member data with more high-frequency content tend to be misclassified as hold-out data, and hold-out data with less high-frequency content tend to be misclassified as member data. Moreover, we theoretically demonstrate that this deficiency reduces the membership advantage of attacks, thereby interfering with the effective discrimination between member and hold-out data. Based on this insight, we propose a plug-and-play high-frequency filter module to mitigate the adverse effects of the deficiency; it can be seamlessly integrated into any attack within this general paradigm without additional time cost. Extensive experiments corroborate that this module significantly improves the performance of baseline attacks across different datasets and models.
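The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the circular mask shape, the `radius` cutoff, the mean-squared-error score, and the function names `low_pass_mask`, `filter_high_freq`, and `membership_score` are all assumptions made for illustration. The sketch removes high-frequency content from the prediction error before the error is reduced to a scalar score.

```python
import numpy as np


def low_pass_mask(height, width, radius):
    """Boolean mask that keeps frequencies within `radius` of the centered DC term.

    The circular shape and the cutoff are illustrative assumptions.
    """
    ys = np.arange(height) - height // 2
    xs = np.arange(width) - width // 2
    dist = np.sqrt(ys[:, None] ** 2 + xs[None, :] ** 2)
    return dist <= radius


def filter_high_freq(x, radius):
    """Suppress high-frequency content of an (H, W) or (C, H, W) array."""
    h, w = x.shape[-2:]
    # Move the zero-frequency component to the center of the spectrum.
    spec = np.fft.fftshift(np.fft.fft2(x, axes=(-2, -1)), axes=(-2, -1))
    # Zero out everything outside the low-frequency disk.
    spec = spec * low_pass_mask(h, w, radius)
    out = np.fft.ifft2(np.fft.ifftshift(spec, axes=(-2, -1)), axes=(-2, -1))
    return out.real


def membership_score(target, prediction, radius):
    """Hypothetical error-based score on the low-frequency part of the residual.

    In this sketch a lower score means a smaller error, which an attacker
    would read as a sign that the sample is more likely a member.
    """
    residual = filter_high_freq(target - prediction, radius)
    return float(np.mean(residual ** 2))
```

Here `target` and `prediction` stand in for whatever pair of quantities a given attack compares (for example, a reference signal and the model's estimate of it). Because the filter touches only the residual, it can be placed in front of the error computation of an existing attack without changing how the model is queried.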

Paper Structure

This paper contains 36 sections, 1 theorem, 53 equations, 9 figures, 20 tables.

Key Result

Proposition 4.2

Assume the attack $\mathcal{A}$ has a membership advantage $Adv^M(\mathcal{A})$. Denote the standard deviations of its membership scores on member and hold-out data by $\sigma_M$ and $\sigma_H$, and the corresponding standard deviations after removing the high-frequency components by $\sigma_M'$ and $\sigma_H'$. Th

Figures (9)

  • Figure 1: Statistical plots of membership scores versus high-frequency content for the MS-COCO dataset. Horizontal coordinates indicate high-frequency content and vertical coordinates indicate membership scores. We used red to indicate areas with the highest data density.
  • Figure 2: Membership score distribution of member and hold-out data in the MS-COCO dataset. The score distribution gap between member data and hold-out data has noticeably increased.
  • Figure 3: Statistical plots of membership scores versus high-frequency content on the Flickr dataset.
  • Figure 4: Naive pixel-wise errors distribution visualization, with the top half being the original image and the bottom half being the error visualization. The areas of high error often coincide with areas of high-frequency information.
  • Figure 5: PIA pixel-wise errors distribution visualization. The areas of high error often coincide with areas of high-frequency information.
  • ...and 4 more figures

Theorems & Definitions (4)

  • Definition 4.1
  • Proposition 4.2
  • Proof
  • Proof