Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models

Qiao Li; Xiaomeng Fu; Xi Wang; Jin Liu; Xingyu Gao; Jiao Dai; Jizhong Han

TL;DR

This work investigates privacy risks in text-to-image diffusion models and shows that memorization in such models is more pronounced at the structure level than at the pixel level. It introduces a structure-based Membership Inference Attack that exploits the early diffusion stages, where member images retain more structural information than nonmembers: the attack applies DDIM inversion to the input image and compares the original with the reconstruction using a structural similarity metric. The method, which can also rely on prompt-derived text, achieves state-of-the-art AUC and attack success on Latent Diffusion and Stable Diffusion, and is robust to image distortions and to variations in the textual prompt. The results highlight a practical privacy concern for training data and provide a forward-diffusion-based tool for detecting training-set membership.

Abstract

With the rapid advancement of large-scale text-to-image diffusion models, various practical applications have emerged, bringing significant convenience to society. However, model developers may use unauthorized data to train diffusion models. Such data are at risk of being memorized by the models, potentially violating citizens' privacy rights. To judge whether a specific image was used as a member of a model's training set, Membership Inference Attacks (MIAs) have been proposed as a tool for privacy protection. Current MIA methods predominantly use pixel-wise comparisons as distinguishing clues, relying on the pixel-level memorization characteristic of diffusion models. However, it is practically impossible for text-to-image models to memorize all the pixel-level information in massive training sets. We therefore turn to the more advanced structure-level memorization. Observations of the diffusion process show that the structures of members are better preserved than those of nonmembers, indicating that diffusion models can remember the structures of member images from their training sets. Drawing on these insights, we propose a simple yet effective MIA method tailored for text-to-image diffusion models. Extensive experimental results validate the efficacy of our approach. Compared to current pixel-level baselines, our approach not only achieves state-of-the-art performance but also demonstrates remarkable robustness against various distortions.

Paper Structure (17 sections, 10 equations, 7 figures, 10 tables)

Introduction
Related Work
Membership Inference Attack
Diffusion Models
Prior of Diffusion Generation Process
Method
Preliminaries
Structure Evolution in Diffusion Process
Structure-Based Membership Inference Attack
Experiments
Experimental Setup
Comparison to Baselines
Analysis of Total Timestep and Interval
Robustness Evaluation
Comparison to Backward Reconstruction
...and 2 more sections

Figures (7)

Figure 1: In the initial stage of the diffusion process, diffusion models tend to corrupt detailed features while the overall image structure is preserved. In the later stage, the models go on to corrupt the image structure as well.
Figure 2: (a) The decrease rate of structural similarity for the member set and the hold-out set. The structural similarity exhibits a steeper decline for images belonging to the hold-out set during the initial diffusion stage. (b) The average difference in the structural similarity between the member set and the hold-out set. The structural similarity for the member set surpasses that for the hold-out set during the first 800 diffusion steps, peaking at around step 100.
Figure 3: An overview of our proposed method. Given an input image, we first utilize the encoder of the text-to-image diffusion model to transform it to its latent representation $z_0$. Then we conduct DDIM inversion in the diffusion process, and get the noisy latent $z_t$. Next, we leverage the decoder of the diffusion model to transform $z_t$ back to the pixel space, thereby obtaining the output image. Finally, we compare the structural similarity between the input and the output to determine whether the input image belongs to the training set of the diffusion model.
Figure 4: ROC and log-scaled ROC curves on the Latent Diffusion Model at resolutions 512 and 256. Both indicate that our method is significantly more effective than the baselines on the Latent Diffusion Model.
Figure 5: ROC and log-scaled ROC curves on Stable Diffusion at resolutions 512 and 256. Both indicate that our method is significantly more effective than the baselines on Stable Diffusion.
...and 2 more figures
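
The pipeline described in the Figure 3 caption (encode, DDIM inversion, decode, structural comparison) can be illustrated with a minimal sketch. This is not the authors' implementation. Every name here (`global_ssim`, `ddim_invert`, `structural_score`, `make_alpha_bar`) is hypothetical, images and latents are flat lists of floats, the encoder and decoder are identity stand-ins, the noise predictor returns zeros, and the linear beta schedule, the timesteps `[0, 50, 100]`, and the threshold are assumptions chosen only to make the example run. The single-window SSIM below is a simplification of the usual windowed metric.

```python
import math

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two equal-length flat images (simplified)."""
    n = len(x)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta) for an assumed linear schedule.
    Index 0 is the clean latent, so alpha_bar[0] == 1."""
    alpha_bar = [1.0]
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        alpha_bar.append(alpha_bar[-1] * (1.0 - beta))
    return alpha_bar

def ddim_invert(z0, eps_model, alpha_bar, timesteps):
    """Deterministic DDIM inversion from z_0 to z_t along increasing timesteps."""
    z = list(z0)
    for t, t_next in zip(timesteps[:-1], timesteps[1:]):
        eps = eps_model(z, t)  # predicted noise at the current step
        a_t, a_next = alpha_bar[t], alpha_bar[t_next]
        # Estimate the clean latent, then re-noise it to the next timestep.
        z0_pred = [(zi - math.sqrt(1 - a_t) * ei) / math.sqrt(a_t)
                   for zi, ei in zip(z, eps)]
        z = [math.sqrt(a_next) * pi + math.sqrt(1 - a_next) * ei
             for pi, ei in zip(z0_pred, eps)]
    return z

def structural_score(image, encode, decode, eps_model, alpha_bar, timesteps):
    """Encode, invert to z_t, decode, and compare structure with the input."""
    z_t = ddim_invert(encode(image), eps_model, alpha_bar, timesteps)
    return global_ssim(image, decode(z_t))

def is_member(score, threshold):
    """Higher structural similarity is taken as evidence of membership."""
    return score >= threshold

# Toy stand-ins (assumptions): identity autoencoder and a zero noise predictor.
identity = lambda v: list(v)
zero_eps = lambda z, t: [0.0] * len(z)

alpha_bar = make_alpha_bar()
image = [0.1, 0.4, 0.7, 0.9]
score = structural_score(image, identity, identity, zero_eps, alpha_bar, [0, 50, 100])
print(score, is_member(score, threshold=0.9))
```

In a real attack the stand-ins would be replaced by the target model's encoder, decoder, and text-conditioned noise predictor, and the threshold would be calibrated on images with known membership.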
