DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models

Jing Yang, Runping Xi, Yingxin Lai, Xun Lin, Zitong Yu

TL;DR

DDAP addresses privacy risks in text-to-image diffusion personalization by introducing a dual-domain defense that perturbs both spatial and frequency information. The Spatial Perturbation Learning (SPL) targets the fixed image encoder, while the Frequency Perturbation Learning (FPL) disrupts high-frequency details, and a Localization Module focuses perturbations on personalized concept regions. The combined DDPL framework and DAAM-based localization yield strong disruption of personalized model learning while preserving input and generated image quality. Experimental results on facial datasets show DDAP surpassing existing protections in key metrics, offering a practical approach to mitigate misuse of diffusion-based personalization.

Abstract

Diffusion-based personalized visual content generation has advanced significantly, making it possible to create specific objects after learning from just a few reference photos. However, when misused to fabricate fake news or unsettling content targeting individuals, these technologies can cause considerable societal harm. To address this problem, current methods generate adversarial samples by maximizing the training loss, thereby disrupting the output of any personalized generation model trained on these samples. However, existing methods fail to achieve both effective defense and stealthiness, as they overlook the intrinsic properties of diffusion models. In this paper, we introduce a novel Dual-Domain Anti-Personalization framework (DDAP). Specifically, we develop Spatial Perturbation Learning (SPL) by exploiting the fixed and perturbation-sensitive nature of the image encoder in personalized generation. We then design a Frequency Perturbation Learning (FPL) method that exploits the characteristics of diffusion models in the frequency domain. SPL disrupts the overall texture of the generated images, while FPL focuses on image details. By alternating between these two methods, we construct the DDAP framework, which harnesses the strengths of both domains. To further enhance the visual quality of the adversarial samples, we design a localization module that accurately captures attentive areas, preserving the effectiveness of the attack while avoiding unnecessary disturbances in the background. Extensive experiments on facial benchmarks show that DDAP strengthens the disruption of personalized generation models while maintaining high quality in the adversarial samples, making it more effective at protecting privacy in practical applications.
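
The alternating scheme in the abstract can be illustrated with a rough sketch. It is not the authors' implementation and rests on several assumptions: the update rule is a generic sign-gradient ascent step with projection onto an L-infinity ball (a PGD-style stand-in), the two gradient callables `grad_spatial` and `grad_frequency` are hypothetical placeholders for the SPL and FPL gradients, `mask` is a hypothetical stand-in for the localization module's output, and the step size and iteration count are arbitrary. The default budget of 12/255 matches the noise budget quoted for Figure 1.

```python
import numpy as np


def project_linf(x_adv, x_orig, eps):
    """Clip x_adv into the L-infinity ball of radius eps around x_orig, then into [0, 1]."""
    delta = np.clip(x_adv - x_orig, -eps, eps)
    return np.clip(x_orig + delta, 0.0, 1.0)


def alternating_attack(x, grad_spatial, grad_frequency, mask,
                       eps=12 / 255, step=1 / 255, n_iters=10):
    """Alternate sign-gradient ascent steps from two gradient sources.

    grad_spatial / grad_frequency: hypothetical callables returning the gradient
    of a loss to be maximized with respect to the image (placeholders only).
    mask: array in [0, 1]; zeros leave the corresponding pixels untouched.
    """
    x_adv = x.copy()
    grad_fns = [grad_spatial, grad_frequency]
    for _ in range(n_iters):
        for grad_fn in grad_fns:
            g = grad_fn(x_adv)
            # Masked ascent step, then projection back into the noise budget.
            x_adv = project_linf(x_adv + step * np.sign(g) * mask, x, eps)
        grad_fns.reverse()  # switch the order of the two domains
    return x_adv


# Toy usage: the gradients below come from simple quadratic losses,
# not from a diffusion model.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
region = np.zeros_like(image)
region[4:12, 4:12] = 1.0  # pretend the concept occupies the centre
protected = alternating_attack(image, lambda z: z - 0.5, lambda z: z - 0.25, region)
```

In a real pipeline the two callables would backpropagate through the image encoder and the diffusion model; the sketch only shows how masking, alternation, and the noise budget fit together.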

Paper Structure (17 sections, 11 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Illustration of adversarial examples generated by the state-of-the-art (SOTA) Anti-DB (Van Le et al., 2023) and by our method. The first row displays the original images, the second row features adversarial samples produced by Anti-DB, and the third row showcases adversarial samples from our method. Both methods operate within a noise budget of 12/255.
  • Figure 2: Overall architecture: To integrate the advantages of both perturbation learning methods, we sequentially calculate gradients from the two domains and update the perturbations after filtering through the localization module. We then switch the order of domains. After several iterations, the adversarial samples accumulate gradients from both domains, resulting in improved defense capabilities and maintained stealthiness.
  • Figure 3: The pipeline of Spatial Perturbation Learning (SPL). This method uses gradients derived from the latent loss of the image and the reconstruction loss of the diffusion model to progressively adjust the generated noise.
  • Figure 4: The Frequency Perturbation Learning (FPL) pipeline begins by splitting the input image into blocks and transforming each into the frequency domain using the DCT. Perturbations are added, and then the blocks are converted back to the spatial domain using the IDCT and merged to form the adversarial example. In each iteration, we calculate the adversarial loss and update the perturbations accordingly.
  • Figure 5: The results of the localization module. The first row displays the original image, while the second row visualizes the areas of focus for new concepts learned by the personalized model during the fine-tuning process.
  • ...and 1 more figures
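
The block-wise transform described in the Figure 4 caption (split into blocks, DCT, add perturbation, IDCT, merge) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the 8×8 block size, the orthonormal DCT-II normalization, and the function names are all chosen here for illustration, and the loss computation and perturbation update from the caption are omitted.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ v transforms v, and C.T @ c inverts it."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def to_blocks(img, b):
    """Split an (H, W) image into an (H/b, W/b, b, b) array of blocks."""
    h, w = img.shape
    if h % b or w % b:
        raise ValueError("image sides must be multiples of the block size")
    return img.reshape(h // b, b, w // b, b).transpose(0, 2, 1, 3)


def from_blocks(blocks):
    """Merge an (H/b, W/b, b, b) array of blocks back into an (H, W) image."""
    nh, nw, b, _ = blocks.shape
    return blocks.transpose(0, 2, 1, 3).reshape(nh * b, nw * b)


def perturb_in_dct(img, delta_freq, b=8):
    """Add delta_freq to the block-wise DCT coefficients and return the image."""
    c = dct_matrix(b)
    coeffs = c @ to_blocks(img, b) @ c.T  # 2-D DCT of every block
    coeffs = coeffs + delta_freq          # perturb in the frequency domain
    return from_blocks(c.T @ coeffs @ c)  # inverse DCT, then merge


# Toy usage with a random image and a small random coefficient perturbation.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
delta = 0.01 * rng.standard_normal((4, 4, 8, 8))
adversarial = perturb_in_dct(image, delta)
```

Because the transform is orthonormal, the L2 norm of the change in pixel space equals the L2 norm of `delta`, which makes it straightforward to reason about how a coefficient-space perturbation translates into a spatial one.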