A Probabilistic Fluctuation based Membership Inference Attack for Diffusion Models

Wenjie Fu; Huandong Wang; Liyuan Zhang; Chen Gao; Yong Li; Tao Jiang

TL;DR

This work addresses privacy leakage in probabilistic generative models by shifting from overfitting-based MIAs to memorization-driven inference. It introduces PFAMI, a black-box attack built on three components: variational probability assessment to estimate p_theta(x), a dynamic neighbor sampling mechanism to capture local probabilistic fluctuations, and two inference functions (PFAMI_Met and PFAMI_NNs) to detect memorization signals. Across diffusion models and VAEs on Celeba-64 and Tiny-ImageNet, PFAMI achieves substantial gains in ASR and AUC over strong baselines, with particularly strong performance on diffusion models, validating memorization as a practical privacy threat. The paper also provides theoretical framing via probabilistic fluctuation as a directional second derivative and offers empirical guidance on perturbation design and query strategies, highlighting implications for defense and policy in real-world generative systems.
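One way to read the phrase "directional second derivative" is through a symmetric finite difference. The symbols below are assumptions introduced for illustration and may differ from the paper's notation: $v$ is a unit perturbation direction, $h$ a small step size, and $p_\theta$ is assumed smooth near $x$. A Taylor expansion then gives

$$
p_\theta(x) - \tfrac{1}{2}\left[p_\theta(x + h v) + p_\theta(x - h v)\right] = -\tfrac{h^2}{2}\, v^\top \nabla^2 p_\theta(x)\, v + O(h^4).
$$

If $x$ sits at a local maximum of $p_\theta$, the Hessian term is non-positive and the left-hand side is non-negative, which is the kind of local signal a fluctuation-based attack looks for around member records.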

Abstract

A membership inference attack (MIA) identifies whether a record exists in a machine learning model's training set by querying the model. MIAs on classic classification models have been well studied, and recent works have started to explore how to adapt MIAs to generative models. Our investigation indicates that existing MIAs designed for generative models mainly depend on overfitting in the target models. However, overfitting can be avoided with various regularization techniques, in which case existing MIAs perform poorly in practice. Unlike overfitting, memorization is essential for deep learning models to attain optimal performance, making it a more prevalent phenomenon. Memorization in generative models leads to an increasing trend in the probability of generating records around a member record. Therefore, we propose the Probabilistic Fluctuation Assessing Membership Inference Attack (PFAMI), a black-box MIA that infers membership by detecting these trends through an analysis of the overall probabilistic fluctuations around given records. We conduct extensive experiments across multiple generative models and datasets, which demonstrate that PFAMI can improve the attack success rate (ASR) by about 27.9% compared with the best baseline.
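The idea of scoring a record by the probabilistic fluctuation around it can be illustrated with a toy sketch. This is not the authors' implementation: `toy_log_density` is a hypothetical stand-in for an estimate of log p_theta(x) (a broad Gaussian plus narrow bumps at member records to mimic memorization), and `fluctuation_score`, its fixed perturbation radius, and the random-direction neighbor sampling are all assumptions made for illustration.

```python
import numpy as np


def toy_log_density(x, members, bump_weight=0.05, bump_scale=0.05):
    """Hypothetical stand-in for an estimate of log p_theta(x).

    A broad standard-normal component plus narrow Gaussian bumps centered
    on the member records, which mimic memorization.
    """
    x = np.atleast_2d(x)
    d = x.shape[1]
    norm = (2 * np.pi) ** (d / 2)
    base = np.exp(-0.5 * np.sum(x**2, axis=1)) / norm
    sq_dist = np.sum((x[:, None, :] - members[None, :, :]) ** 2, axis=2)
    bumps = np.exp(-0.5 * sq_dist / bump_scale**2) / (norm * bump_scale**d)
    return np.log((1 - bump_weight) * base + bump_weight * bumps.mean(axis=1))


def fluctuation_score(x, log_density, rng, n_neighbors=64, radius=0.05):
    """Log-density at x minus the mean log-density of perturbed neighbors.

    Neighbors lie at a fixed distance `radius` from x in random directions.
    A large positive score means x sits on a local peak of the density.
    """
    noise = rng.normal(size=(n_neighbors, x.shape[0]))
    noise *= radius / np.linalg.norm(noise, axis=1, keepdims=True)
    neighbors = x + noise
    return float(log_density(x[None, :])[0] - log_density(neighbors).mean())


rng = np.random.default_rng(0)
members = rng.normal(size=(10, 2))       # records the toy model "memorized"
non_members = rng.normal(size=(10, 2))   # fresh records from the same source


def log_p(points):
    return toy_log_density(points, members)


member_scores = np.array([fluctuation_score(x, log_p, rng) for x in members])
non_member_scores = np.array(
    [fluctuation_score(x, log_p, rng) for x in non_members]
)

print("mean member score:    ", member_scores.mean())
print("mean non-member score:", non_member_scores.mean())
```

In this toy setting, members and non-members are drawn from the same distribution, so the density value alone is a weaker cue, while the local drop in density around each member is what separates the two groups. A real attack would replace `toy_log_density` with a probability estimate obtained by querying the target model.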

Paper Structure (29 sections, 24 equations, 8 figures, 8 tables)

Introduction
Related Works
Generative Models
Membership Inference Attack
Preliminary
Generative Models
Threat Model
Methodology
Framework
Variational Probability Assessment
Neighbor Records Sampling
Probabilistic Fluctuation based Inference Function
Metric-based Inference Function
NNs-based Inference Function
Experiments
...and 14 more sections

Figures (8)

Figure 1: MIAs against generative models with overfitting and memorization. Identifying member records based on probability is feasible on overfitted models but fails on models that exhibit only memorization. Memorization appears as an increasing trend in probability density around member records, which can be captured by estimating the fluctuation of probability.
Figure 2: The overall framework of PFAMI and the three modules introduced to deploy it in practice.
Figure 3: ROC curves of PFAMI and the best baselines on two generative models trained on the Celeba-64 and Tiny-IN datasets.
Figure 4: The loss trajectory of DDPM@Celeba-64 on evaluation and training datasets.
Figure 5: The performance (AUC) of PFAMI and SecMI against DDPM under different memorization degrees.
...and 3 more figures
