Training-free Source Attribution of AI-generated Images via Resynthesis
Pietro Bongini, Valentina Molinari, Andrea Costanzo, Benedetta Tondi, Mauro Barni
TL;DR
The paper tackles synthetic image attribution (SIA), the task of identifying which generator produced an AI-generated image, under data scarcity. It introduces a training-free, one-shot approach based on image resynthesis. The method describes the image content with a caption, resynthesizes the image with every candidate generator, and measures distances in CLIP feature space; the image is attributed to the generator whose resynthesis is closest, formalized as $j^* = \arg\min_j d\big(E(s_j(p(I))), E(I)\big)$, where $I$ is the image under analysis, $p$ the captioner, $s_j$ the $j$-th candidate generator, $E$ the CLIP encoder, and $d$ the distance. A new dataset covering 14 generators (7 of them commercial), together with their resyntheses, is proposed to benchmark few-shot and zero-shot SIA, enabling direct comparison with baselines such as CLIP-based classifiers, De-Fake, CLIP-LoRA, EfficientNetB4, and Tiny Autoencoders. Results show that the resynthesis method excels in low-shot regimes (1–10 shots), while standard fine-tuned approaches dominate when more data is available, a trade-off that matters in practical attribution scenarios. The dataset and findings offer a benchmark for developing robust, data-efficient SIA methods and motivate exploring richer distance metrics and alternative image-description strategies.
Abstract
Synthetic image source attribution is a challenging task, especially under data scarcity, where few-shot or zero-shot classification capabilities are required. We present a new training-free one-shot attribution method based on image resynthesis. A prompt describing the image under analysis is generated and then used to resynthesize the image with all the candidate sources. The image is attributed to the model that produced the resynthesis closest to the original image in a suitable feature space. We also introduce a new dataset for synthetic image attribution consisting of face images from commercial and open-source text-to-image generators. The dataset provides a challenging attribution framework, useful for developing new attribution models and testing their capabilities on different generative architectures. The dataset structure makes it possible to test approaches based on resynthesis and to compare them with few-shot methods. Results from state-of-the-art few-shot approaches and other baselines show that the proposed resynthesis method outperforms existing techniques when only a few samples are available for training or fine-tuning. The experiments also demonstrate that the new dataset is challenging and represents a valuable benchmark for developing and evaluating future few-shot and zero-shot methods.
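The attribution rule described above (caption the image, resynthesize it with each candidate source, pick the closest resynthesis in feature space) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `captioner`, `generators`, and `encoder` are hypothetical callables standing in for the captioning model, the candidate text-to-image generators, and the feature extractor, and cosine distance is an assumed choice for the distance $d$, which the text does not specify.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (assumed choice for d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def attribute(image, captioner, generators, encoder, distance=cosine_distance):
    """Return the index j* of the generator whose resynthesis is closest.

    captioner:  hypothetical callable, image -> text prompt      (p)
    generators: list of hypothetical callables, prompt -> image  (s_j)
    encoder:    hypothetical callable, image -> feature vector   (E)
    """
    prompt = captioner(image)   # p(I)
    target = encoder(image)     # E(I)
    # d(E(s_j(p(I))), E(I)) for every candidate generator j
    distances = [distance(encoder(g(prompt)), target) for g in generators]
    return int(np.argmin(distances)), distances


if __name__ == "__main__":
    # Toy stand-ins: "images" are already 2-D feature vectors and the
    # encoder is the identity, so only the argmin logic is exercised.
    image = np.array([1.0, 0.0])
    toy_generators = [
        lambda prompt: np.array([0.0, 1.0]),
        lambda prompt: np.array([0.9, 0.1]),
        lambda prompt: np.array([-1.0, 0.0]),
    ]
    best, dists = attribute(image, lambda im: "a face", toy_generators, lambda x: x)
    print(best, dists)
```

In a real setting the three callables would wrap actual models, and each generator would be queried once per test image, which is why the approach needs no training or fine-tuning samples.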
