Copyright Infringement Detection in Text-to-Image Diffusion Models via Differential Privacy
Xiafeng Man, Zhipeng Wei, Jingjing Chen
TL;DR
The work tackles copyright infringement in text-to-image diffusion models by formalizing a conditional differential privacy framework and introducing a measurable conditional sensitivity metric. It operationalizes infringement detection via D-Plus-Minus (DPM), a dual-branch fine-tuning approach that learns and unlearns a target concept, then compares the two branches' outputs using CLIP-based embeddings to detect memorization. The authors validate the method on the CIDD benchmark across multiple models, reporting robust AUC scores and interpretable results, and provide ablations, robustness analyses, and implementation details. This DP-grounded, post-hoc detector offers a principled, practical tool for safeguarding IP in generative AI, with potential extension to LVLMs and LLMs.
Abstract
The widespread deployment of large vision models such as Stable Diffusion raises significant legal and ethical concerns, as these models can memorize and reproduce copyrighted content without authorization. Existing detection approaches often lack robustness and fail to provide rigorous theoretical underpinnings. To address these gaps, we formalize the concept of copyright infringement and its detection from the perspective of Differential Privacy (DP), and introduce the conditional sensitivity metric, a concept analogous to sensitivity in DP, that quantifies the deviation in a diffusion model's output caused by the inclusion or exclusion of a specific training data point. To operationalize this metric, we propose D-Plus-Minus (DPM), a novel post-hoc detection framework that identifies copyright infringement in text-to-image diffusion models. Specifically, DPM simulates the inclusion and exclusion processes by fine-tuning models in two opposing directions: learning and unlearning. In addition, to disentangle concept-specific influence from the global parameter shifts induced by fine-tuning, DPM computes confidence scores over orthogonal prompt distributions using statistical metrics. To facilitate standardized benchmarking, we also construct the Copyright Infringement Detection Dataset (CIDD), a comprehensive resource for evaluating detection across diverse categories. Our results demonstrate that DPM reliably detects infringing content without requiring access to the original training dataset or text prompts, offering an interpretable and practical solution for safeguarding intellectual property in the era of generative AI.
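The comparison of the two branches can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the exact statistic, so the sketch assumes mean cosine distance between paired image embeddings (for example, from a CLIP image encoder) as the deviation measure. The function names `cosine_distance`, `branch_deviation`, and `concept_specific_gap` are hypothetical. The idea shown is only that the deviation between the learned ("plus") and unlearned ("minus") branches on target-concept prompts is corrected by the same deviation measured on orthogonal prompts, which approximates the global shift caused by fine-tuning itself.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def branch_deviation(plus_embs, minus_embs):
    """Mean distance between paired outputs of the two fine-tuned branches.

    plus_embs[i] and minus_embs[i] are assumed to embed images generated
    from the same prompt by the learning and unlearning branch respectively.
    """
    if not plus_embs or len(plus_embs) != len(minus_embs):
        raise ValueError("need two non-empty lists of equal length")
    return mean(cosine_distance(p, m) for p, m in zip(plus_embs, minus_embs))


def concept_specific_gap(concept_plus, concept_minus, ortho_plus, ortho_minus):
    """Deviation on concept prompts minus deviation on orthogonal prompts.

    Assumption: subtracting the orthogonal-prompt deviation removes the
    part of the shift that fine-tuning causes regardless of the concept.
    """
    return (branch_deviation(concept_plus, concept_minus)
            - branch_deviation(ortho_plus, ortho_minus))


# Toy 2-D embeddings, purely illustrative (not real model outputs).
gap = concept_specific_gap(
    concept_plus=[[1.0, 0.0]], concept_minus=[[0.0, 1.0]],
    ortho_plus=[[1.0, 1.0]], ortho_minus=[[1.0, 1.0]],
)
```

How such a gap is turned into a confidence score or a detection threshold is not stated in the abstract, so the sketch stops at the raw quantity.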
