Table of Contents
Fetching ...

Image Copy Detection for Diffusion Models

Wenhao Wang, Yifan Sun, Zhentao Tan, Yi Yang

TL;DR

Diffusion-model content replication poses a challenge beyond traditional ICD approaches that rely on handcrafted image transformations. The authors present ICDiff, featuring the D-Rep dataset of $40{,}000$ image–replica pairs labeled across six replication levels $0$–$5$, and a PDF-Embedding method that converts replication levels into probability density functions and learns image representations via ViT-based subspaces. Training uses KL divergence between target PDFs and model predictions, with testing yielding a normalized replication level. Empirical results show PDF-Embedding outperforming protocol-driven baselines and revealing diffusion-model replication in the $10 ext{-}20\%$ range across six models, suggesting a practical tool for copyright and originality analyses in diffusion-generated content. The work provides a foundation for diffusion-specific ICD research and practical guidelines for evaluating replication in large image galleries.

Abstract

Images produced by diffusion models are increasingly popular in digital artwork and visual marketing. However, such generated images might replicate content from existing ones and pose the challenge of content originality. Existing Image Copy Detection (ICD) models, though accurate in detecting hand-crafted replicas, overlook the challenge from diffusion models. This motivates us to introduce ICDiff, the first ICD specialized for diffusion models. To this end, we construct a Diffusion-Replication (D-Rep) dataset and correspondingly propose a novel deep embedding method. D-Rep uses a state-of-the-art diffusion model (Stable Diffusion V1.5) to generate 40, 000 image-replica pairs, which are manually annotated into 6 replication levels ranging from 0 (no replication) to 5 (total replication). Our method, PDF-Embedding, transforms the replication level of each image-replica pair into a probability density function (PDF) as the supervision signal. The intuition is that the probability of neighboring replication levels should be continuous and smooth. Experimental results show that PDF-Embedding surpasses protocol-driven methods and non-PDF choices on the D-Rep test set. Moreover, by utilizing PDF-Embedding, we find that the replication ratios of well-known diffusion models against an open-source gallery range from 10% to 20%. The project is publicly available at https://icdiff.github.io/.

Image Copy Detection for Diffusion Models

TL;DR

Diffusion-model content replication poses a challenge beyond traditional ICD approaches that rely on handcrafted image transformations. The authors present ICDiff, featuring the D-Rep dataset of image–replica pairs labeled across six replication levels , and a PDF-Embedding method that converts replication levels into probability density functions and learns image representations via ViT-based subspaces. Training uses KL divergence between target PDFs and model predictions, with testing yielding a normalized replication level. Empirical results show PDF-Embedding outperforming protocol-driven baselines and revealing diffusion-model replication in the range across six models, suggesting a practical tool for copyright and originality analyses in diffusion-generated content. The work provides a foundation for diffusion-specific ICD research and practical guidelines for evaluating replication in large image galleries.

Abstract

Images produced by diffusion models are increasingly popular in digital artwork and visual marketing. However, such generated images might replicate content from existing ones and pose the challenge of content originality. Existing Image Copy Detection (ICD) models, though accurate in detecting hand-crafted replicas, overlook the challenge from diffusion models. This motivates us to introduce ICDiff, the first ICD specialized for diffusion models. To this end, we construct a Diffusion-Replication (D-Rep) dataset and correspondingly propose a novel deep embedding method. D-Rep uses a state-of-the-art diffusion model (Stable Diffusion V1.5) to generate 40, 000 image-replica pairs, which are manually annotated into 6 replication levels ranging from 0 (no replication) to 5 (total replication). Our method, PDF-Embedding, transforms the replication level of each image-replica pair into a probability density function (PDF) as the supervision signal. The intuition is that the probability of neighboring replication levels should be continuous and smooth. Experimental results show that PDF-Embedding surpasses protocol-driven methods and non-PDF choices on the D-Rep test set. Moreover, by utilizing PDF-Embedding, we find that the replication ratios of well-known diffusion models against an open-source gallery range from 10% to 20%. The project is publicly available at https://icdiff.github.io/.
Paper Structure (24 sections, 41 equations, 23 figures, 3 tables)

This paper contains 24 sections, 41 equations, 23 figures, 3 tables.

Figures (23)

  • Figure 1: Some generated images (top) from diffusion models replicates the contents of existing images (bottom). The existing (matched) images are from LAION-Aesthetics schuhmann2022laion. The diffusion models include both commercial and open-source ones.
  • Figure 2: The comparison between current ICD with the ICDiff. The current ICD focuses on detecting edited copies generated by transformations like horizontal flips, random rotations, and random crops. In contrast, the ICDiff aims to detect replication generated by diffusion models, such as Stable Diffusion rombach2021highresolution. (Source of the original image: https://www.theverge.com/2023/2/6/23587393/ai-art-copyright-lawsuit-getty-images-stable-diffusion)
  • Figure 3: The demonstration of the manual-labeled D-Rep dataset. The percentages on the left show the proportion of images with a particular level.
  • Figure 4: The demonstration of the proposed PDF-Embedding. Initially, PDF-Embedding converts manually-labeled replication levels into probability density functions (PDFs). To learn from these PDFs, we use a set of vectors as the representation of an image.
  • Figure 5: The comparison of different PDFs: Gaussian (left), linear (middle), and exponential (right). "$A$" is the amplitude in each PDF function (Eqn. \ref{['Eq: g']} to Eqn. \ref{['Eq: e']}).
  • ...and 18 more figures