A Watermark-Conditioned Diffusion Model for IP Protection
Rui Min, Sen Li, Hongyang Chen, Minhao Cheng
TL;DR
WaDiff introduces a watermark-conditioned diffusion model that embeds user-specific fingerprints directly into the generation process, enabling both detection of AI-generated content and identification of the generating user in a black-box API setting. The framework relies on two losses (message retrieval and consistency) and selective end-to-end fine-tuning of a small head to maintain image quality while injecting watermark information. Experimental results demonstrate strong detection and tracing performance across large user pools, with high SSIM, minimal FID impact, and robustness to common augmentations and adaptive attacks. The approach offers scalable, stealthy IP protection for diffusion-generated content and supports forensic attribution in real-world deployments.
Abstract
The ethical need to protect AI-generated content has been a significant concern in recent years. While existing watermarking strategies have demonstrated success in detecting synthetic content (detection), there has been limited exploration of identifying the users responsible for generating these outputs from a single model (owner identification). In this paper, we address both scenarios and propose a unified watermarking framework for content copyright protection in the context of diffusion models. Specifically, we consider two parties: the model provider, who grants public access to a diffusion model via an API, and the users, who can only query the model API and generate images in a black-box manner. Our task is to embed hidden information into the generated content to enable subsequent detection and owner identification. To tackle this challenge, we propose a Watermark-conditioned Diffusion model called WaDiff, which treats the watermark as a conditional input and incorporates fingerprinting into the generation process. Every output generated by WaDiff carries user-specific information, which can be recovered by an image extractor to support forensic identification. Extensive experiments are conducted on two popular diffusion models, and we demonstrate that our method is effective and robust in both the detection and owner identification tasks. Meanwhile, our watermarking framework has only a negligible impact on the original generation and is stealthier and more efficient than existing watermarking strategies.
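
To make the two tasks concrete, the following is a minimal sketch of how detection and owner identification could be performed once an extractor has recovered a bit string from an image. It is not the authors' implementation: the matching rule (highest bit accuracy against a database of user fingerprints), the 48-bit fingerprint length, the 0.9 decision threshold, and the names `bit_accuracy`, `trace`, and `user_db` are all assumptions made for illustration. The extractor itself is simulated by flipping a few bits of a known fingerprint.

```python
import random


def bit_accuracy(a, b):
    """Fraction of positions where two equal-length bit sequences agree."""
    if len(a) != len(b):
        raise ValueError("bit sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def trace(extracted, user_db, threshold=0.9):
    """Hypothetical matching rule: compare the extracted bits with every
    registered fingerprint and keep the best match.

    Returns (is_watermarked, user_id_or_None, best_score).
    Detection: best score reaches the (assumed) threshold.
    Identification: the user whose fingerprint achieved that score.
    """
    best_user, best_score = None, 0.0
    for user_id, fingerprint in user_db.items():
        score = bit_accuracy(extracted, fingerprint)
        if score > best_score:
            best_user, best_score = user_id, score
    is_watermarked = best_score >= threshold
    return is_watermarked, (best_user if is_watermarked else None), best_score


# Toy user pool: 1000 users, each with a random 48-bit fingerprint (assumed size).
rng = random.Random(0)
n_bits = 48
user_db = {
    f"user_{i}": [rng.randint(0, 1) for _ in range(n_bits)]
    for i in range(1000)
}

# Simulate an imperfect extractor: the fingerprint of user_42 with 3 bit errors.
noisy = list(user_db["user_42"])
for idx in rng.sample(range(n_bits), 3):
    noisy[idx] ^= 1

detected, owner, score = trace(noisy, user_db)
print(detected, owner, round(score, 4))
```

In this sketch a single threshold serves both tasks: an image is flagged as generated by the model if any registered fingerprint matches closely enough, and the best-matching user is reported as the owner. A real deployment would need to choose the threshold from a target false-positive rate over the whole user pool.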
