Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples

Chumeng Liang, Xiaoyu Wu, Yang Hua, Jiaru Zhang, Yiming Xue, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan

TL;DR

This work tackles the copyright risks of diffusion-model–driven AI art by introducing a formal framework for adversarial examples in diffusion models and proposing AdvDM, a Monte-Carlo–based attack that perturbs inputs to disrupt the conditioning features used for generation. By iteratively optimizing over latent trajectories, AdvDM degrades the quality of conditionally generated images, increasing Fréchet Inception Distance and reducing Precision across text-to-image, style transfer, and image-to-image tasks. The approach is validated on Latent Diffusion Models and shows robustness against several preprocessing defenses, offering a practical tool for artists to protect their works from unauthorized imitation. The study highlights ethical considerations and outlines future directions for strengthening copyright protection in AI-for-Art ecosystems.

Abstract

Recently, Diffusion Models (DMs) have driven a wave of AI-for-Art applications yet raise new copyright concerns: infringers can use unauthorized paintings to train DMs that generate novel paintings in a similar style. To address these emerging copyright violations, we are the first to explore and propose using adversarial examples for DMs to protect human-created artworks. Specifically, we first build a theoretical framework to define and evaluate adversarial examples for DMs. Based on this framework, we then design a novel algorithm, AdvDM, which exploits a Monte-Carlo estimation of adversarial examples for DMs by optimizing over different latent variables sampled from the reverse process of DMs. Extensive experiments show that the generated adversarial examples effectively hinder DMs from extracting their features. Our method can therefore be a powerful tool for human artists to protect their copyright against infringers equipped with DM-based AI-for-Art applications. The code of our method is available on GitHub: https://github.com/mist-project/mist.git.
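
The Monte-Carlo idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a hypothetical linear noise predictor `eps_theta(x_t) = W @ x_t` in place of a trained latent diffusion model, so the gradient can be written by hand, and it draws noised latents with the standard forward-noising formula. All function names, parameters, and hyper-parameter values below are illustrative assumptions. Each iteration estimates the gradient of the expected denoising loss from a batch of sampled timesteps and noises, takes a signed ascent step, and projects back into an L-infinity budget around the original image.

```python
import numpy as np

def make_toy_model(dim, seed=0):
    # Hypothetical stand-in for a trained diffusion model: a fixed linear noise predictor.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / np.sqrt(dim), size=(dim, dim))
    alpha_bar = np.linspace(0.99, 0.01, 50)  # assumed cumulative noise schedule
    return W, alpha_bar

def mc_loss_and_grad(x0, W, alpha_bar, rng, n_samples):
    """Monte-Carlo estimate of E_{t,eps} ||W x_t - eps||^2 and its gradient w.r.t. x0."""
    t = rng.integers(0, len(alpha_bar), size=n_samples)
    a = alpha_bar[t][:, None]                        # shape (n, 1)
    eps = rng.normal(size=(n_samples, x0.size))      # shape (n, d)
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps   # noised latents
    resid = x_t @ W.T - eps                          # predicted noise minus true noise
    loss = np.mean(np.sum(resid ** 2, axis=1))
    # Chain rule: d loss / d x_t = 2 W^T resid, and d x_t / d x0 = sqrt(a) * I.
    grad = np.mean(2.0 * np.sqrt(a) * (resid @ W), axis=0)
    return loss, grad

def advdm_sketch(x, W, alpha_bar, budget=0.1, step=0.01,
                 n_iters=100, n_samples=64, seed=1):
    """Signed gradient ascent on the estimated loss, projected to an L-inf ball."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for _ in range(n_iters):
        _, grad = mc_loss_and_grad(x_adv, W, alpha_bar, rng, n_samples)
        x_adv = x_adv + step * np.sign(grad)             # ascent: increase the loss
        x_adv = np.clip(x_adv, x - budget, x + budget)   # stay within the budget
        x_adv = np.clip(x_adv, -1.0, 1.0)                # stay in the valid data range
    return x_adv

W, alpha_bar = make_toy_model(dim=16)
x = np.random.default_rng(2).uniform(-0.5, 0.5, size=16)
x_adv = advdm_sketch(x, W, alpha_bar)

# Evaluate both inputs on the same random draws so the comparison is fair.
clean_loss, _ = mc_loss_and_grad(x, W, alpha_bar, np.random.default_rng(3), 4096)
adv_loss, _ = mc_loss_and_grad(x_adv, W, alpha_bar, np.random.default_rng(3), 4096)
print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

With a real model the hand-written gradient would be replaced by automatic differentiation through the denoising network, and the perturbation budget would be chosen so the change stays imperceptible to human viewers.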

Paper Structure (33 sections, 11 equations, 21 figures, 5 tables, 6 algorithms)

Figures (21)

  • Figure 1: Comparison of workflows for adversarial examples in classification models and diffusion models. Adversarial examples for diffusion models prevent the models from extracting image features as conditions by inducing out-of-distribution features. The feature extraction shown in the figure is textual inversion [gal2022image] in DMs, which has raised copyright concerns in several cases [kimjungginewsmimicnews].
  • Figure 2: Comparison of generated image quality in style transfer for categories of WikiArt [wikiart]. Images shown in each group share the same source image. We use textual inversion [gal2022image] to extract the style of training samples from WikiArt, shown in a separate column. For each group, the top row shows images generated from the style extracted from the clean examples, and the bottom row shows images generated from the style extracted from the adversarial examples. Strength is a hyper-parameter that indicates how much the style of the source image is covered by the target style. LDM fails to capture the style from adversarial examples, in contrast to clean images.
  • Figure 3: Comparison of conditionally generated images in image-to-image generation. With conditions extracted from our adversarial examples, LDM generates unrealistic images.
  • Figure 4: (a) The FID and sampling steps for AdvDM. (b) The Precision and sampling steps for AdvDM.
  • Figure 5: Visualization of conditionally generated images based on different training images. None of the defenses fully preserves image quality under AdvDM.
  • ...and 16 more figures

Theorems & Definitions (2)

  • Definition 3.1: Adversarial Example for Diffusion Models
  • Definition 3.1: Adversarial Example for Diffusion Models (with Embedding Attack)