Detecting Origin Attribution for Text-to-Image Diffusion Models

Katherine Xu; Lingzhi Zhang; Jianbo Shi

TL;DR

This work studies origin attribution for modern text-to-image diffusion models by training image attributors to identify the source generator among 12 contemporary T2I models (plus real images) and by probing how inference-time hyperparameters, post-editing, and visual granularity affect attribution. A dataset of nearly 450K images from diverse generators and prompts is built, and RGB-based attributors achieve over $90\%$ accuracy on a $13$-way task, indicating strong generator fingerprints. The study reveals that initialization seeds are almost perfectly detectable (near $99\%$), with other hyperparameters also leaving detectable traces, while post-editing degrades but does not erase attribution capability. Beyond RGB, style-based (Gram matrix) and mid-level representations yield robust signals (e.g., $92.80\%$ accuracy for style features), suggesting that texture, structure, and layout encode generator fingerprints. These findings advance fake-image forensics and copyright protection, and the authors provide a framework to extend attribution to new open-source and proprietary generators.
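To make the style-based (Gram matrix) representation concrete, here is a minimal sketch of how a Gram matrix summarizes a feature map. It assumes a toy feature map with made-up values, given as one flattened list of activations per channel; the function name `gram_matrix` and the normalization by the number of spatial positions are illustrative choices, not details taken from the paper.

```python
from typing import List


def gram_matrix(features: List[List[float]]) -> List[List[float]]:
    """Return the C x C matrix of channel-wise inner products.

    `features` holds C channels, each flattened to N = H * W activations.
    Dividing by N is one common normalization (an assumption here).
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


# Hypothetical 2-channel feature map over 4 spatial positions.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
]
style = gram_matrix(features)

# Applying the same spatial permutation to every channel leaves the
# Gram matrix unchanged: it records which channels co-activate, not where.
shuffled = [
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
]
style_shuffled = gram_matrix(shuffled)
```

Because the sum runs over all spatial positions, the result discards layout and keeps only channel co-activation statistics, which is why Gram matrices are commonly read as a texture or style descriptor.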

Abstract

Modern text-to-image (T2I) diffusion models can generate images with remarkable realism and creativity. These advancements have sparked research in fake image detection and attribution, yet prior studies have not fully explored the practical and scientific dimensions of this task. In addition to attributing images to 12 state-of-the-art T2I generators, we provide extensive analyses of which inference-stage hyperparameters and image modifications are discernible. Our experiments reveal that initialization seeds are highly detectable, and that other subtle variations in the image generation process are detectable to some extent. We further investigate what visual traces are leveraged in image attribution by perturbing high-frequency details and employing mid-level representations of image style and structure. Notably, altering high-frequency information causes only slight reductions in accuracy, and training an attributor on style representations outperforms training on RGB images. Our analyses underscore that fake images are detectable and attributable at various levels of visual granularity.
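As an illustration of what perturbing high-frequency details can mean, the sketch below applies a simple box blur, which acts as a low-pass filter. This is an assumed stand-in: the abstract does not say which perturbations the authors used, and the names `box_blur` and `pixel_range` are hypothetical helpers for this example only.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def box_blur(img: Image, radius: int = 1) -> Image:
    """Average each pixel over a (2r+1) x (2r+1) window clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out


def pixel_range(img: Image) -> float:
    """Difference between the brightest and darkest pixel."""
    return max(map(max, img)) - min(map(min, img))


# A checkerboard is almost pure high-frequency content.
checker = [[float((x + y) % 2) for x in range(6)] for y in range(6)]
blurred = box_blur(checker)

# A constant image has no high-frequency content and should be unchanged.
flat = [[2.0] * 3 for _ in range(3)]
flat_blurred = box_blur(flat)
```

The blur nearly flattens the checkerboard while leaving the constant image untouched, so an attributor that still succeeds on inputs filtered this way must be relying on coarser cues than pixel-level detail.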

Paper Structure (22 sections, 21 figures, 7 tables)

This paper contains 22 sections, 21 figures, 7 tables.

Introduction
Related Work
Dataset Generation
Images from Diverse Generators and Prompts
Images from Varying Hyperparameters During Inference Stage
Detecting Image Attribution in RGB
Training Image Attributors
Detectability of Hyperparameter Variations
Detectability of Post-Editing Enhancements
Detecting Image Attribution Beyond RGB
Conclusion
Human Performance
Data and Implementation Details
Additional Experiments
Color Analysis
...and 7 more sections

Figures (21)

Figure 1: A depiction of images generated for our dataset, showcasing two types of prompts: captions derived from MS-COCO (Lin et al., 2014) on the left, and creative prompts generated by GPT-4 on the right. For both categories, images were produced using 12 different T2I generators.
Figure 2: An illustration showcasing the diversity in generated images influenced by varying hyperparameters: different model checkpoints (within the same architecture), diverse scheduling algorithms, varied initialization seeds, and a range of inference steps.
Figure 3: Left/Middle: Accuracy and confusion matrix of EfficientFormer trained with text prompts, which achieved the highest accuracy in Table \ref{tab:image_attributor_metric}. Right: Accuracy of EfficientFormer as we vary the number of training images.
Figure 4: Confusion matrices for hyperparameter variations, including Stable Diffusion version, scheduler, and number of inference steps. We observe that images generated with fewer SDXL sampling steps are more detectable, likely due to visible degradation in image quality.
Figure 5: Left: Original image generated by Midjourney 6. Middle: Local modifications utilizing SDXL inpainting and Photoshop Generative Fill across three masks with small, medium, and large holes. Right: The image upscaled 4X by Magnific AI.
...and 16 more figures
