Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

Zhangheng Li, Junyuan Hong, Bo Li, Zhangyang Wang

TL;DR

This work reveals Shake-to-Leak (S2L), a manipulation-based fine-tuning attack that amplifies privacy leakage in diffusion models by training on synthetic private-domain data generated from a pre-trained model. The authors demonstrate that S2L is effective across multiple fine-tuning paradigms (DreamBooth, Textual Inversion, LoRA, Hypernetwork, and combinations), increasing membership inference attack performance by up to $5.4\%$ in AUC and boosting target-domain data leakage from near zero to tens of samples on average. They show that leakage amplification depends on prior knowledge and model scale, with domain-aware priors and domain-transfer strategies dramatically raising extraction counts (e.g., up to $44.8$ samples under certain conditions). The findings highlight new privacy risks associated with fine-tuning services and suggest defensive directions such as private pretraining, secure fine-tuning APIs, and differential privacy considerations for large diffusion models.

Abstract

While diffusion models have recently demonstrated remarkable progress in generating realistic images, privacy risks also arise: published models or APIs could generate training images and thus leak privacy-sensitive training information. In this paper, we reveal a new risk, Shake-to-Leak (S2L): fine-tuning pre-trained models on manipulated data can amplify the existing privacy risks. We demonstrate that S2L can occur in various standard fine-tuning strategies for diffusion models, including concept-injection methods (DreamBooth and Textual Inversion) and parameter-efficient methods (LoRA and Hypernetwork), as well as their combinations. In the worst case, S2L can amplify the state-of-the-art membership inference attack (MIA) on diffusion models by $5.4\%$ AUC (absolute difference) and can increase the number of extracted private samples from almost $0$ to $15.8$ on average per target domain. This discovery underscores that the privacy risk with diffusion models is even more severe than previously recognized. Code is available at https://github.com/VITA-Group/Shake-to-Leak.
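To make the "absolute difference" in AUC concrete, the following is a minimal sketch of how such a gain could be computed from membership scores before and after fine-tuning. The function names, the toy scores, and the convention that a higher score means "more likely a training member" are all assumptions for illustration; this is not the paper's evaluation code.

```python
def roc_auc(member_scores, nonmember_scores):
    """AUC as the probability that a random member outscores a random non-member.

    Assumption: higher score = more likely to be a training member.
    Ties count as half a win.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def auc_gain_points(auc_before, auc_after):
    """Absolute AUC difference, expressed in percentage points."""
    return 100.0 * (auc_after - auc_before)


# Toy scores (hypothetical): the attack separates members better after fine-tuning.
nonmembers = [0.4, 0.5]
auc_before = roc_auc([0.2, 0.6], nonmembers)
auc_after = roc_auc([0.45, 0.6], nonmembers)
gain = auc_gain_points(auc_before, auc_after)
```

An absolute gain is a difference of two AUC values, so a reported $5.4\%$ means the AUC itself rises by 0.054, not that it grows by a factor of 1.054.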

Paper Structure (14 sections, 3 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Shake-to-Leak (S2L) can amplify the privacy leakage of a diffusion model through fine-tuning. When prompted with 'a photo of Joe Biden', the pre-trained diffusion model does not leak private images, but many images are leaked after S2L fine-tuning. The right side shows the main steps of S2L, which is generally applicable with various fine-tuning and attack methods. (1) S2L first generates a synthetic private set ${\mathcal{P}}$ using the pre-trained diffusion model. (2) Then, S2L fine-tunes the pre-trained diffusion model on ${\mathcal{P}}$ using existing fine-tuning methods. After S2L, the attacker can extract private information via existing attack methods.
  • Figure 2: Ablation study of S2L with different numbers of fine-tuned parameters. (Left) S2L with DreamBooth and varied LoRA rank. (Right) S2L with Textual Inversion and a varied number of extra fine-tuned tokens. A negative number of extra tokens means that the leading tokens of the original prompt $p_z$ are removed, while a positive number means that placeholder tokens with new random embeddings are prepended to the prompt $p_z$, similar to the way Textual Inversion creates new tokens.
  • Figure 3: Sample images of Taylor Swift from different sources. The Synthetic Private (SP) set contains samples that are generated from the pre-trained model and used to fine-tune the diffusion model. The S2L set contains samples that are generated after fine-tuning on the SP set. For each method, we include the nearest neighbors, which are the ground-truth private samples closest to the generated one (in the same column). The SP set does not directly leak private data, but fine-tuning on it can cause serious privacy leakage.
  • Figure 4: Data extraction (DE) results of S2L under a varying $L_2$-distance threshold $\delta$ and a varying number of similar samples $k$ in the Eidetic memorization criterion. Other experimental settings are the same as in the main results table (label tb_main).
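
The roles of the threshold $\delta$ and the sample number $k$ in Figure 4 can be illustrated with a small sketch. It assumes one common reading of Eidetic memorization: a private sample counts as extracted if some generated sample lies within $L_2$ distance $\delta$ of it, and at most $k$ private samples (itself included) lie within $\delta$ of it. The paper's exact criterion may differ; the function names and the toy vectors (standing in for flattened images) are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_extracted(generated, private, delta, k):
    """Count private samples extracted under the assumed (k, delta) criterion."""
    extracted = 0
    for target in private:
        # Private samples within delta of the target, the target itself included.
        similar = sum(1 for other in private if l2(target, other) <= delta)
        # Does any generated sample reproduce the target up to delta?
        hit = any(l2(g, target) <= delta for g in generated)
        if hit and similar <= k:
            extracted += 1
    return extracted


# Toy data (hypothetical): two near-duplicate private samples and one isolated sample.
private_set = [(0.0, 0.0), (0.0, 0.1), (5.0, 5.0)]
generated_set = [(0.05, 0.0), (9.0, 9.0)]

strict = count_extracted(generated_set, private_set, delta=0.2, k=1)
relaxed = count_extracted(generated_set, private_set, delta=0.2, k=2)
```

Under this reading, a larger $\delta$ accepts looser matches and a larger $k$ admits targets that have more near-duplicates in the private set, so both tend to raise the extraction count.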

Theorems & Definitions (1)

  • Definition 3.1