Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models

Zheng Li, Liangbin Xie, Jiantao Zhou, Xintao Wang, Haiwei Wu, Jinyu Tian

TL;DR

Anti-Diffusion addresses privacy protection against abuse of diffusion-based image generation by defending against both tuning and editing attacks. It introduces a prompt-tuning strategy to better preserve semantics and a semantic-disturbance loss to disrupt semantic signals during editing, all within a three-stage, min–max optimization framework. The approach is validated on DreamBooth/LoRA and editing methods (MasaCtrl/DiffEdit), and on a new Defense-Edit benchmark, demonstrating superior defense performance and robustness across scenarios. The work provides practical, deployable safeguards for diffusion-based generation and edits in real-world settings.

Abstract

Although diffusion-based techniques have shown remarkable success in image generation and editing tasks, their abuse can lead to severe negative social impacts. Several recent works defend against the abuse of diffusion-based methods. However, their protection may be limited in specific scenarios by manually defined prompts or by the Stable Diffusion (SD) version. Furthermore, these works focus solely on tuning methods and overlook editing methods, which could also pose a significant threat. In this work, we propose Anti-Diffusion, a privacy protection system designed for general diffusion-based methods and applicable to both tuning and editing techniques. To mitigate the limitations that manually defined prompts impose on defense performance, we introduce a prompt tuning (PT) strategy that enables precise expression of the original images. To defend against both tuning and editing methods, we propose the semantic disturbance loss (SDL), which disrupts the semantic information of protected images. Given the limited research on defenses against editing methods, we develop a dataset named Defense-Edit to assess the defense performance of various methods. Experiments demonstrate that Anti-Diffusion achieves superior defense performance across a wide range of diffusion-based techniques in different scenarios.

Paper Structure

This paper contains 27 sections, 9 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Our defense system, called Anti-Diffusion, can provide defense against both tuning and editing methods.
  • Figure 2: Overview of the Anti-Diffusion framework in the $j$-th epoch. Here $x_{j}$ is the image to be protected. In stage (1), the text embedding $f_{j}$ is fine-tuned with $\mathcal{L}_{\mathrm{LDM}}$. In stage (2), adversarial noise is optimized with PGD under the proposed loss functions $\mathcal{L}_{\mathrm{URL}}$ and $\mathcal{L}_{\mathrm{SDL}}$ and added to $x_{j}$ to obtain the adversarial sample $\hat{x}_{j}$. In stage (3), the UNet is updated with $\mathcal{L}_{\mathrm{UNet}}$ using the adversarial sample $\hat{x}_{j}$ and the text embedding $\hat{f}_{j}$ to simulate the tuning process of malicious users. The process then repeats, returning to stage (1) in the next epoch.
  • Figure 3: Visualization results of how $\mathcal{L}_{\mathrm{SDL}}$ works. The editing method is DiffEdit.
  • Figure 4: Qualitative defense results of different methods on the DreamBooth model. The specific prompt adopted in DreamBooth is "a photo of sks person". The instance is from VGGFace2.
  • Figure 5: Qualitative defense results of different defense methods on MasaCtrl and DiffEdit. The instance is from our proposed dataset Defense-Edit.
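
The three stages described in the Figure 2 caption can be read as an alternating loop. The sketch below is a toy illustration only, not the authors' implementation. A linear map `W` stands in for the UNet, a vector `f` stands in for the text embedding, and a single quadratic loss replaces $\mathcal{L}_{\mathrm{LDM}}$, $\mathcal{L}_{\mathrm{URL}}$, $\mathcal{L}_{\mathrm{SDL}}$ and $\mathcal{L}_{\mathrm{UNet}}$, whose definitions are not given on this page. The function name `protect`, the step sizes, and the perturbation budget `eps` are all assumptions made for the example.

```python
import numpy as np


def protect(x, epochs=5, eps=8 / 255, step=2 / 255, pgd_steps=10, lr=0.05, seed=0):
    """Toy three-stage loop. The loss is 0.5 * ||W x - f||^2 (an assumption)."""
    rng = np.random.default_rng(seed)
    d = x.size
    W = rng.normal(scale=0.1, size=(d, d))  # hypothetical surrogate for the UNet
    f = rng.normal(size=d)                  # hypothetical text embedding
    x_adv = x.copy()
    for _ in range(epochs):
        # Stage 1: tune the embedding f to reduce the loss on the current image.
        for _ in range(10):
            f -= lr * (f - W @ x_adv)
        # Stage 2: PGD, increase the loss w.r.t. the image inside an L-inf ball.
        for _ in range(pgd_steps):
            grad_x = W.T @ (W @ x_adv - f)
            x_adv = x_adv + step * np.sign(grad_x)
            x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the eps-ball
            x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
        # Stage 3: update the surrogate model on the adversarial sample,
        # imitating a malicious user's fine-tuning (normalized gradient step).
        for _ in range(10):
            residual = W @ x_adv - f
            W -= lr * np.outer(residual, x_adv) / (x_adv @ x_adv + 1e-8)
    return x_adv


# Usage: a random 16-value "image" with pixel values in [0, 1].
x = np.random.default_rng(1).uniform(size=16)
x_adv = protect(x)
```

The projection in stage 2 is what keeps the protected image visually close to the original: whatever the loss does, every value of `x_adv` stays within `eps` of the corresponding value of `x`.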