Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack

Xide Xu, Muhammad Atif Butt, Sandesh Kamath, Bogdan Raducanu

TL;DR

Concept Protection by Selective Attention Manipulation (CoPSAM) is a novel and efficient adversarial attack method that targets only the cross-attention layers of a T2I diffusion model. It protects content from unauthorized use, thereby protecting the individual's identity from potential misuse.

Abstract

The growing demand for customized visual content has led to the rise of personalized text-to-image (T2I) diffusion models. Despite their remarkable potential, they pose a significant privacy risk when misused for malicious purposes. In this paper, we propose a novel and efficient adversarial attack method, Concept Protection by Selective Attention Manipulation (CoPSAM), which targets only the cross-attention layers of a T2I diffusion model. To this end, we carefully construct imperceptible noise that is added to clean samples to obtain their adversarial counterparts. This noise is obtained during the fine-tuning process by maximizing the discrepancy between the cross-attention map of the user-specific token and that of the class-specific token. Experimental validation on a subset of the CelebA-HQ face image dataset demonstrates that our approach outperforms existing methods. In addition, the qualitative evaluation shows two important advantages of our method: (i) we obtain better protection results at lower noise levels than our competitors; and (ii) we protect the content from unauthorized use, thereby protecting the individual's identity from potential misuse.
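
The idea in the abstract, perturbing an image within a small budget so that the attention maps of the user-specific and class-specific tokens drift apart, can be illustrated with a toy sketch. This is not the paper's implementation: the one-query-per-pixel "attention layer", the squared-difference discrepancy, the sign-gradient ascent with projection (a PGD-style loop), the finite-difference gradient, and all names and constants below are assumptions chosen for illustration. Only the budget of 8/255 is taken from the Figure 3 caption.

```python
import math

ETA = 8 / 255    # L-infinity noise budget (value quoted in the Figure 3 caption)
STEP = 1 / 255   # hypothetical ascent step size
D = 4            # hypothetical embedding width of the toy attention layer

# Hypothetical fixed weights: query projection and the two token keys.
W_Q = [0.5, -1.0, 0.8, 0.3]
K_USER = [1.0, 0.2, -0.5, 0.7]     # stands in for the user-specific token
K_CLASS = [-0.3, 0.9, 0.4, -0.6]   # stands in for the class-specific token
CLEAN = [0.2, 0.5, 0.8, 0.35]      # a four-pixel toy "image" in [0, 1]


def attention_maps(image, w_q, k_user, k_class):
    """Toy cross-attention: one query per pixel, softmax over the two tokens."""
    user_map, class_map = [], []
    for pixel in image:
        query = [pixel * w for w in w_q]
        logit_u = sum(q * k for q, k in zip(query, k_user)) / math.sqrt(D)
        logit_c = sum(q * k for q, k in zip(query, k_class)) / math.sqrt(D)
        top = max(logit_u, logit_c)  # subtract the max for numerical stability
        e_u, e_c = math.exp(logit_u - top), math.exp(logit_c - top)
        user_map.append(e_u / (e_u + e_c))
        class_map.append(e_c / (e_u + e_c))
    return user_map, class_map


def discrepancy(image, params):
    """Assumed objective: squared difference between the two attention maps."""
    user_map, class_map = attention_maps(image, *params)
    return sum((u - c) ** 2 for u, c in zip(user_map, class_map))


def numerical_grad(image, params, h=1e-5):
    """Central finite differences; a real system would use autodiff instead."""
    grad = []
    for i in range(len(image)):
        up, down = list(image), list(image)
        up[i] += h
        down[i] -= h
        grad.append((discrepancy(up, params) - discrepancy(down, params)) / (2 * h))
    return grad


def protect(clean, params, steps=10):
    """Sign-gradient ascent on the discrepancy, projected onto the ETA ball."""
    adv = list(clean)
    for _ in range(steps):
        grad = numerical_grad(adv, params)
        for i, g in enumerate(grad):
            stepped = adv[i] + STEP * ((g > 0) - (g < 0))
            # Keep the perturbation within ETA of the clean pixel ...
            stepped = min(max(stepped, clean[i] - ETA), clean[i] + ETA)
            # ... and keep the pixel a valid intensity.
            adv[i] = min(max(stepped, 0.0), 1.0)
    return adv


params = (W_Q, K_USER, K_CLASS)
adversarial = protect(CLEAN, params)
```

In this toy setting the perturbed image stays within the budget of the clean one while the two token maps become more dissimilar. In the actual method the maps would come from the cross-attention layers of the diffusion model during fine-tuning, and the objective is the one defined in the paper rather than the stand-in used here.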

Paper Structure

This paper contains 13 sections, 5 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of our method.
  • Figure 2: Illustration of cross-attention maps from clean Custom Diffusion (a) and our method (b). The significant difference on the token "<v>" between our method and the unprotected result indicates that our method effectively prevents the model from focusing on the areas it should have been targeting.
  • Figure 3: Comparison of images generated using different attack methods with the same noise budget of $\eta = 8/255$. From top to bottom, each group shows the results of the different methods. Clean CD refers to clean Custom Diffusion.
  • Figure 4: Visualizations of identity images, attention maps, and generated images before and after protection using various attack methods. Clean CD refers to clean Custom Diffusion.
  • Figure 5: Visualizations across varying noise budgets ("*" indicates the default budget).