Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models
Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
TL;DR
This paper proposes a framework that embeds a personal watermark into adversarial examples, so that diffusion models (DMs) attempting to imitate the protected images produce outputs carrying a visible watermark. A conditional GAN generates perturbations conditioned on the watermark and is optimized with three losses (adversarial, GAN, and weighted perturbation) so that the perturbations stay perceptually small while still attacking DMs. After short training on a few samples, the generator produces adversarial examples rapidly at inference time. The method demonstrates robustness and transferability across text-guided image-to-image tasks, textual inversion, and several DM variants, and compares favorably with prior approaches that rely on chaotic textures or model re-training. The work offers a practical, scalable approach to copyright protection in DM-based content creation, with applicability to DreamBooth, LoRA, and Custom Diffusion, while maintaining image quality.
Abstract
Diffusion Models (DMs) have shown remarkable capabilities in various image-generation tasks. However, there are growing concerns that DMs could be used to imitate creations without authorization and thus raise copyright issues. To address this issue, we propose a novel framework that embeds personal watermarks in the generation of adversarial examples. Such examples force DMs to generate images with visible watermarks and prevent DMs from imitating the protected images. We construct a generator based on conditional adversarial networks and design three losses (adversarial loss, GAN loss, and perturbation loss) to generate adversarial examples that have subtle perturbations but can effectively attack DMs to prevent copyright violations. Training a generator for a personal watermark with our method requires only 5-10 samples and 2-3 minutes, and once trained, the generator produces adversarial examples with that watermark quickly (0.2s per image). We conduct extensive experiments in various conditional image-generation scenarios. Compared to existing methods that cause DMs to generate images with chaotic textures, our method adds visible watermarks to the generated images, which is a more straightforward way to indicate copyright violations. We also observe that our adversarial examples exhibit good transferability across unknown generative models. Therefore, this work provides a simple yet powerful way to protect copyright from DM-based imitation.
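The abstract names three losses but does not give their formulas. As a rough illustration of how such an objective could be combined, the toy sketch below works on flat lists of pixel values. Everything specific in it is an assumption made for illustration, not the authors' formulation: the function names, the squared-distance form of the adversarial loss, the non-saturating form of the GAN loss, the per-pixel `weights`, and the coefficients `alpha` and `beta`.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared difference between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def adversarial_loss(encode: Callable[[Vector], Vector],
                     x_adv: Vector, watermark: Vector) -> float:
    # Assumed form: pull the model's representation of the adversarial
    # example toward its representation of the watermark image.
    return mse(encode(x_adv), encode(watermark))


def gan_loss(d_prob_adv: float, eps: float = 1e-8) -> float:
    # Assumed form: non-saturating generator loss. It is small when the
    # discriminator assigns a high "clean image" probability to x_adv.
    return -math.log(d_prob_adv + eps)


def perturbation_loss(delta: Vector, weights: Vector) -> float:
    # Assumed form: weighted mean squared perturbation. `weights` is a
    # hypothetical per-pixel map penalizing changes in sensitive regions.
    return sum(w * d * d for w, d in zip(weights, delta)) / len(delta)


def total_loss(encode: Callable[[Vector], Vector], x: Vector, delta: Vector,
               watermark: Vector, d_prob_adv: float, weights: Vector,
               alpha: float = 1.0, beta: float = 10.0) -> float:
    # The adversarial example is the clean image plus the perturbation.
    x_adv: List[float] = [xi + di for xi, di in zip(x, delta)]
    return (adversarial_loss(encode, x_adv, watermark)
            + alpha * gan_loss(d_prob_adv)
            + beta * perturbation_loss(delta, weights))
```

In a real setting `encode` would stand in for part of the target DM (for example its image encoder) and `delta` would be the output of the conditional generator. Here any callable on lists works, such as `lambda v: list(v)`, which is enough to see how the three terms trade off against each other.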
