Origin Identification for Text-Guided Image-to-Image Diffusion Models

Wenhao Wang; Yifan Sun; Zongxin Yang; Zhentao Tan; Zhengdong Hu; Yi Yang

TL;DR

The paper addresses misinformation and attribution risks posed by text-guided image-to-image diffusion models by defining the origin identification task (ID$^2$) and releasing the OriPID dataset. It proves that there exists a linear transformation $\mathbf{W}$ on VAE embeddings that minimizes the distance between a translated image and its origin, and shows that this transformation generalizes across diffusion models. Empirically, the approach surpasses similarity-based baselines in generalization to unseen models ($+31.6\%$ mAP) while remaining efficient and robust, and it performs well on real-world edited data. The work offers a practical, theoretically grounded way to trace the origins of AI-generated images, with implications for copyright enforcement and content verification.

Abstract

Text-guided image-to-image diffusion models excel at translating images based on textual prompts, allowing for precise and creative visual modifications. However, such a powerful technique can be misused for spreading misinformation, infringing on copyrights, and evading content tracing. This motivates us to introduce the task of origin IDentification for text-guided Image-to-image Diffusion models (ID$^2$), aiming to retrieve the original image of a given translated query. A straightforward solution to ID$^2$ involves training a specialized deep embedding model to extract and compare features from both query and reference images. However, due to the visual discrepancy across generations produced by different diffusion models, this similarity-based approach fails when trained on images from one model and tested on those from another, limiting its effectiveness in real-world applications. To address this challenge, we contribute the first dataset and a theoretically guaranteed method, both emphasizing generalizability. The curated dataset, OriPID, contains abundant Origins and guided Prompts, which can be used to train and test potential IDentification models across various diffusion models. In the method section, we first prove the existence of a linear transformation that minimizes the distance between the pre-trained Variational Autoencoder (VAE) embeddings of generated samples and their origins. We then demonstrate that such a simple linear transformation generalizes across different diffusion models. Experimental results show that the proposed method achieves satisfactory generalization performance, significantly surpassing similarity-based methods ($+31.6\%$ mAP), even those with generalization designs. The project is available at https://id2icml.github.io.
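The retrieval setup described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are random stand-ins for VAE embeddings, `W` is an identity placeholder for a learned matrix, the same `W` is assumed to be applied to both query and reference embeddings, and the helper names `retrieve` and `mean_average_precision` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def retrieve(query_emb, ref_emb, W):
    """Rank all references for each query by cosine similarity after applying W."""
    q = l2_normalize(query_emb @ W)
    r = l2_normalize(ref_emb @ W)
    sims = q @ r.T                      # sims[i, j]: query i vs. reference j
    return np.argsort(-sims, axis=1)    # best match first

def mean_average_precision(ranking, true_idx):
    """With exactly one origin per query, AP reduces to 1 / rank of that origin."""
    ranks = np.array([np.where(row == t)[0][0] + 1
                      for row, t in zip(ranking, true_idx)])
    return float(np.mean(1.0 / ranks))

# Toy data (assumption): each query is a slightly perturbed copy of its origin.
rng = np.random.default_rng(0)
ref = rng.normal(size=(50, 16))
query = ref + 0.05 * rng.normal(size=ref.shape)

W = np.eye(16)                          # placeholder for a learned transformation
ranking = retrieve(query, ref, W)
map_score = mean_average_precision(ranking, np.arange(50))
print(f"mAP on toy data: {map_score:.3f}")
```

In this toy setting query `i` is paired with reference `i`, so the ground-truth index list is simply `np.arange(50)`; with real data the pairing would come from the dataset's annotations.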

Paper Structure (21 sections, 22 equations, 13 figures, 10 tables)

Introduction
Related Works
Dataset
Method
Existence
Generalizability
Implementation
Experiments
Protocols and Details
The Challenge from ID$^2$
VAE differs between Seen and Unseen Models
The Effectiveness of our Method
Ablation Study
Conclusion
Proofs of Lemmas
...and 6 more sections

Figures (13)

Figure 1: Illustration of how text-guided image-to-image diffusion models can be misused in several scenarios: misinformation, copyright infringement, and evading content tracing. Specifically: (a) an image originally showing Donald Trump after the assassination attempt is edited to depict Joe Biden instead; (b) a watermark is removed from a copyrighted beach image, which is then modified and could thereby escape copyright checks; (c) an image of a Norwegian government building after an explosion is altered to bypass restrictions that limit the spread of disturbing images.
Figure 2: Visual discrepancy between images generated by different diffusion models. The images generated by various models exhibit distinctive visual features such as realistic textures, complex architectures, life-like details, vibrant colors, abstract expression, magical ambiance, and photorealistic elements.
Figure 3: Images from our dataset, which is diverse and comprehensive. Specifically, it encompasses a variety of subjects commonly found in real-world scenarios where issues such as misinformation, copyright infringement, and content tracing evasion occur. For instance, our dataset includes images of nature, architecture, animals, planes, art, and indoor scenes. Note that for simplicity, we omit the prompts here. Please refer to the prompt section of the Appendix for examples of prompts and generations.
Figure 4: Implementation of learning the theoretically expected matrix $\mathbf{W}$. In practice, we use gradient descent to optimize a metric loss in order to learn $\mathbf{W}$.
Figure 5: Examples of failure cases for each kind of model.
...and 8 more figures
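
The Figure 4 caption describes learning $\mathbf{W}$ by gradient descent on a metric loss. The sketch below illustrates that idea under several assumptions: the listing does not specify the metric loss, so a softmax contrastive loss over dot-product similarities is used as a stand-in; the same `W` is assumed to act on both translated and origin embeddings; and the data, the toy distortion `A`, the learning rate, and the step count are all hypothetical.

```python
import numpy as np

def contrastive_loss_and_grad(W, G, O):
    """Softmax contrastive loss over dot-product similarities, plus dLoss/dW.

    Row i of G (translated) and row i of O (origin) form the positive pair;
    all other rows of O act as negatives for query i.
    """
    n = G.shape[0]
    Q, R = G @ W, O @ W                     # apply the same W to both sides
    S = Q @ R.T                             # S[i, j] = <Q[i], R[j]>
    S = S - S.max(axis=1, keepdims=True)    # shift rows for a stable softmax
    P = np.exp(S)
    P /= P.sum(axis=1, keepdims=True)
    idx = np.arange(n)
    loss = -np.mean(np.log(P[idx, idx] + 1e-12))
    dS = (P - np.eye(n)) / n                # gradient of the loss w.r.t. S
    grad = G.T @ (dS @ R) + O.T @ (dS.T @ Q)
    return loss, grad

# Synthetic stand-ins for VAE embeddings (hypothetical data, not OriPID).
rng = np.random.default_rng(0)
n, d = 64, 8
O = rng.normal(size=(n, d))                     # "origin" embeddings
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))   # toy "translation" distortion
G = O @ A + 0.05 * rng.normal(size=(n, d))      # "translated" embeddings

W = 0.3 * np.eye(d)                         # small initial transformation
initial_loss, _ = contrastive_loss_and_grad(W, G, O)
for _ in range(500):
    loss, grad = contrastive_loss_and_grad(W, G, O)
    W -= 0.01 * grad                        # plain gradient-descent step
final_loss, _ = contrastive_loss_and_grad(W, G, O)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

A contrastive loss is chosen here because a plain squared distance between `G @ W` and `O @ W` is trivially minimized by `W = 0`; pushing non-matching pairs apart removes that degenerate solution.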

Theorems & Definitions (6)

proof (6 untitled entries)
