Anti-Reference: Universal and Immediate Defense Against Reference-Based Generation

Yiren Song, Shengtao Lou, Xiaokang Liu, Hai Ci, Pei Yang, Jiaming Liu, Mike Zheng Shou

TL;DR

This work introduces Anti-Reference, a universal defense against reference-based diffusion generation that adds imperceptible adversarial noise to images. The noise is either predicted by a ViT-based Adversarial Noise Encoder (ANE) or optimized directly with PGD, under a unified loss that jointly attacks fine-tuning-based customization, tuning-free customization, and human-centric driving methods. Differentiable augmentation and white-box proxy models enable gray-box transfer, including to some commercial APIs, while keeping protection efficient. Experiments across seven customization tasks, evaluated with metrics such as ISM, Aesthetic Score, and CLIP-IQA, show stronger protection than the compared methods. Limitations include reliance on SD1.5 architectures; future work targets other diffusion variants and less perceptible adversarial noise for broader real-world applicability.

Abstract

Diffusion models have revolutionized generative modeling with their exceptional ability to produce high-fidelity images. However, misuse of such potent tools can lead to the creation of fake news or disturbing content targeting individuals, resulting in significant social harm. In this paper, we introduce Anti-Reference, a novel method that protects images from the threats posed by reference-based generation techniques by adding imperceptible adversarial noise to the images. We propose a unified loss function that enables joint attacks on fine-tuning-based customization methods, non-fine-tuning customization methods, and human-centric driving methods. Based on this loss, we either train an Adversarial Noise Encoder to predict the noise or directly optimize the noise using the PGD method. Our method also shows some transfer attack capability, effectively challenging both gray-box models and some commercial APIs. Extensive experiments validate the performance of Anti-Reference, establishing a new benchmark in image security.
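
The PGD option mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the unified loss is replaced by a toy quadratic feature-distance loss on a random linear map, and the names `pgd_perturb`, `toy_loss`, `toy_grad`, as well as the budget `eps = 8/255`, step size, and iteration count, are all assumptions chosen for illustration. It shows only the generic loop: a signed gradient ascent step followed by projection onto an L-infinity ball and the valid pixel range.

```python
import numpy as np


def pgd_perturb(image, grad_fn, rng, eps=8 / 255, step=2 / 255, iters=10):
    """Generic PGD: maximize a loss under an L-infinity budget eps (assumed value)."""
    # Random start inside the budget, so the first gradient is not zero.
    delta = rng.uniform(-eps, eps, size=image.shape)
    for _ in range(iters):
        grad = grad_fn(image + delta)
        delta = delta + step * np.sign(grad)               # signed ascent step
        delta = np.clip(delta, -eps, eps)                  # project onto the L-inf ball
        delta = np.clip(image + delta, 0.0, 1.0) - image   # keep pixels in [0, 1]
    return image + delta


# Toy stand-in for a proxy model: a fixed random linear "feature extractor".
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))
image = rng.uniform(0.0, 1.0, size=64)   # flattened toy "image"
clean_feat = W @ image


def toy_loss(x):
    # Squared distance between features of x and features of the clean image.
    return float(np.sum((W @ x - clean_feat) ** 2))


def toy_grad(x):
    # Analytic gradient of toy_loss with respect to x.
    return 2.0 * W.T @ (W @ x - clean_feat)


protected = pgd_perturb(image, toy_grad, rng)
print("max |delta|:", np.max(np.abs(protected - image)))
print("loss clean / protected:", toy_loss(image), toy_loss(protected))
```

In a real setting the analytic gradient would be replaced by backpropagation through the white-box proxy models, and the loss by the paper's unified objective; the projection steps are what keep the perturbation within the imperceptibility budget.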

Paper Structure

This paper contains 23 sections, 6 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Malicious attackers can collect users' images as reference images and use diffusion models for malicious purposes. Our system, called Anti-Reference, applies imperceptible perturbations to user-uploaded images before they are published, resulting in noticeable artifacts in images or videos generated by reference-based methods and fine-tuning approaches. This makes the outputs easy to recognize as AI-generated, thus protecting the images.
  • Figure 2: Illustration of Anti-Reference. We propose a loss function to protect images from the threats of customized generation methods, and we use this loss to train a noise encoder to predict adversarial noise.
  • Figure 3: Results of different image protection methods in safeguarding images from the threats of customized generation tasks.
  • Figure 4: Qualitative evaluation of method robustness. Our method is robust under prompt mismatch and image transformation.
  • Figure 5: Gray-box attack results on Tongyi APIs.
  • ...and 3 more figures