Copyright Infringement Detection in Text-to-Image Diffusion Models via Differential Privacy

Xiafeng Man, Zhipeng Wei, Jingjing Chen

TL;DR

The work tackles copyright infringement in text-to-image diffusion models by formalizing a conditional differential privacy framework and introducing a measurable conditional sensitivity metric. It operationalizes infringement detection via D-Plus-Minus (DPM), a dual-branch fine-tuning approach that learns and unlearns a target concept and compares outputs with CLIP-based embeddings to detect memorization. The authors validate the method on the CIDD benchmark across multiple models, demonstrating robust AUCs and interpretability, and provide ablations, robustness analyses, and implementation details. This DP-grounded, post-hoc detector offers a principled, practical tool for safeguarding IP in generative AI, with broad potential extension to LVLMs and LLMs.

Abstract

The widespread deployment of large vision models such as Stable Diffusion raises significant legal and ethical concerns, as these models can memorize and reproduce copyrighted content without authorization. Existing detection approaches often lack robustness and fail to provide rigorous theoretical underpinnings. To address these gaps, we formalize the concept of copyright infringement and its detection from the perspective of Differential Privacy (DP), and introduce the conditional sensitivity metric, a concept analogous to sensitivity in DP, that quantifies the deviation in a diffusion model's output caused by the inclusion or exclusion of a specific training data point. To operationalize this metric, we propose D-Plus-Minus (DPM), a novel post-hoc detection framework that identifies copyright infringement in text-to-image diffusion models. Specifically, DPM simulates the inclusion and exclusion processes by fine-tuning models in two opposing directions: learning and unlearning. In addition, to disentangle concept-specific influence from the global parameter shifts induced by fine-tuning, DPM computes confidence scores over orthogonal prompt distributions using statistical metrics. To facilitate standardized benchmarking, we also construct the Copyright Infringement Detection Dataset (CIDD), a comprehensive resource for evaluating detection across diverse categories. Our results demonstrate that DPM reliably detects infringing content without requiring access to the original training dataset or text prompts, offering an interpretable and practical solution for safeguarding intellectual property in the era of generative AI.
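
The two-branch idea can be illustrated with a toy calculation. The sketch below is not the paper's DPM score: it assumes image embeddings (for example from CLIP) have already been computed for images generated by the original model, the learning branch, and the unlearning branch, and it measures how far each branch moves the mean similarity to a target embedding. The function names, the toy vectors, and the simple sum used as a gap are all hypothetical, and the orthogonal-prompt correction described in the abstract is not modeled.

```python
import math
from statistics import mean


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_similarity(generated, target):
    """Average similarity of a batch of generated-image embeddings to the target."""
    return mean(cosine(g, target) for g in generated)


def branch_shifts(base, plus, minus, target):
    """Return (learn_shift, unlearn_shift) relative to the original model.

    base  : embeddings from the original model
    plus  : embeddings from the learning branch
    minus : embeddings from the unlearning branch
    """
    s_base = mean_similarity(base, target)
    learn_shift = mean_similarity(plus, target) - s_base
    unlearn_shift = s_base - mean_similarity(minus, target)
    return learn_shift, unlearn_shift


# Toy 2-D "embeddings" chosen by hand, not real model outputs.
target = [1.0, 0.0]
base = [[1.0, 1.0]]
plus = [[1.0, 0.0]]
minus = [[0.0, 1.0]]

learn_shift, unlearn_shift = branch_shifts(base, plus, minus, target)
gap = learn_shift + unlearn_shift  # hypothetical aggregate, for illustration only
```

A large gap means the two branches pull the outputs far apart with respect to the target, which is the kind of shift the conditional sensitivity metric is meant to capture.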

Paper Structure

This paper contains 68 sections, 1 theorem, 24 equations, 9 figures, 10 tables.

Key Result

Proposition 1

Let $x$ be a data point (e.g., a prompt-image pair) that does not appear in any subset of the training dataset $D$ or its neighbouring dataset $D'$. Then, under the definition of conditional differential privacy for generative models (Definition 6), the generative model $G$ satisfies $(0, 0)$-conditional differential privacy with respect to $x$ for any prompt $p$.
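
For intuition, it helps to recall what a $(0, 0)$ guarantee means in standard differential privacy. A randomized mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-differentially private if, for all neighbouring datasets $D$, $D'$ and all measurable output sets $S$,

$$\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .$$

Setting $\varepsilon = \delta = 0$ and applying the inequality in both directions (the neighbouring relation is symmetric) gives

$$\Pr[\mathcal{M}(D) \in S] = \Pr[\mathcal{M}(D') \in S] ,$$

so the two output distributions are identical. Assuming the conditional variant replaces $\mathcal{M}(D)$ with the output distribution of $G$ trained on $D$ and queried with prompt $p$ (written here as $G_D(p)$, a notation introduced only for this sketch), the proposition says that a data point absent from both $D$ and $D'$ leaves the distribution of $G_D(p)$ unchanged. The paper's own definition may differ in its details.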

Figures (9)

  • Figure 1: D-Plus-Minus Method. Given the neighbourhood images $U(x_i)$ of the target image $x_i$, i.e., several images that share the semantics extracted from the target image, as the training subset, we fine-tune the text-to-image model $G$ into two branches: a learning branch $G_{D^+}$ and an unlearning branch $G_{D^-}$. Experimental results show that infringed samples lead to a significant shift in the sensitivity metric, whereas non-infringed samples cause only minor changes.
  • Figure 2: Detection Procedure of Copyright Infringement. First, we extract a concept from the target image. Next, we collect several images associated with this concept to form a neighborhood subset, and construct a prompt using a unique identifier (e.g., a photo of [V] person). Finally, we feed these image-prompt pairs into the D-Plus-Minus platform to compute a DPM score with a statistical guarantee.
  • Figure 3: Qualitative visualization of the two branches across different timesteps. Models tend to learn and unlearn faster on infringed samples than on non-infringed ones, and cannot learn the exact elements of the target images.
  • Figure 4: ROC curves for four representative models. The proposed DPM confidence score consistently outperforms the individual branches in terms of AUC, demonstrating its superior capability in distinguishing infringed from non-infringed samples.
  • Figure 5: Conditional Sensitivity in SD1.4. The model's output behavior is more sensitive to infringed samples.
  • ...and 4 more figures
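
Figure 4 compares detectors by AUC. As a reminder of what that number measures, the sketch below computes ROC AUC as the probability that a randomly chosen positive (infringed) sample receives a higher score than a randomly chosen negative one, counting ties as one half. The function name and the scores are invented for illustration and are not results from the paper.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels use 1 for infringed and 0 for non-infringed samples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up detector scores: two infringed samples, then two non-infringed.
labels = [1, 1, 0, 0]
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], labels)
imperfect = roc_auc([0.9, 0.2, 0.8, 0.1], labels)
```

An AUC of 1.0 means every infringed sample outscores every non-infringed one, while 0.5 corresponds to chance-level ranking.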

Theorems & Definitions (10)

  • Definition 1: Differential Privacy
  • Definition 2: Conditional Publicity for Diffusion Models
  • Definition 3: Copyright Infringement
  • Definition 4: Copyright Non-Infringement
  • Definition 5: Detection of Copyright Infringement
  • Definition 6: Conditional Differential Privacy for Generative Models
  • Proposition 1
  • Proof
  • Definition 7: Approximate Relative Privacy
  • Definition 8: Approximate Copyright Non-Infringement