PRADA: Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images

Simon Damm; Jonas Ricker; Henning Petzka; Asja Fischer

TL;DR

PRADA leverages the explicit likelihoods of autoregressive (AR) image generators to detect AR-generated images and attribute them to their source model. It learns a small, model-specific score function that balances conditional and unconditional token probabilities and aggregates token-wise scores across scales. With this, PRADA reaches near state-of-the-art detection on powerful text-to-image models and competitive results on class-to-image models. The method is highly interpretable, requires limited training data, and adapts to new AR generators without retraining a large classifier. In practice, it provides a lightweight, reliable forensic tool for provenance and authenticity checks in the era of high-fidelity AR imagery.

Abstract

Autoregressive (AR) image generation has recently emerged as a powerful paradigm for image synthesis. Building on the generation principle of large language models, AR image generators efficiently produce deceptively real-looking images, which further increases the need for reliable detection methods. However, to date there is a lack of work specifically targeting the detection of images generated by AR image generators. In this work, we present PRADA (Probability-Ratio-Based Attribution and Detection of Autoregressive-Generated Images), a simple and interpretable approach that can reliably detect AR-generated images and attribute them to their respective source model. The key idea is to inspect the ratio of a model's conditional and unconditional probability for the autoregressive token sequence representing a given image. Whenever an image is generated by a particular model, its probability ratio shows unique characteristics that are not present for images generated by other models or for real images. We exploit these characteristics for threshold-based attribution and detection by calibrating a simple, model-specific score function. Our experimental evaluation shows that PRADA is highly effective against eight class-to-image and four text-to-image models.
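To make the probability-ratio idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes we already have, for one tokenized image, the per-position logits of a single AR generator evaluated once with conditioning and once without. The function names (`token_log_ratios`, `ratio_score`, `is_attributed`), the plain mean used for aggregation, and the "higher score means this model" decision rule are all illustrative assumptions; the paper instead calibrates a model-specific score function.

```python
import numpy as np


def _log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def token_log_ratios(cond_logits, uncond_logits, tokens):
    """Per-token log-ratio: log p(x_t | x_<t, c) - log p(x_t | x_<t).

    cond_logits, uncond_logits: arrays of shape (seq_len, vocab_size).
    tokens: integer token ids of shape (seq_len,) representing the image.
    """
    tokens = np.asarray(tokens)
    positions = np.arange(len(tokens))
    log_p_cond = _log_softmax(np.asarray(cond_logits, dtype=float))[positions, tokens]
    log_p_uncond = _log_softmax(np.asarray(uncond_logits, dtype=float))[positions, tokens]
    return log_p_cond - log_p_uncond


def ratio_score(cond_logits, uncond_logits, tokens):
    """Assumed aggregation: mean of the per-token log-ratios."""
    return float(token_log_ratios(cond_logits, uncond_logits, tokens).mean())


def is_attributed(score, threshold):
    """Assumed decision rule: attribute the image to this model if score > threshold."""
    return score > threshold


# Toy example: the conditional model strongly prefers the observed tokens,
# while the unconditional model is uniform over a 4-token vocabulary.
tokens = np.array([0, 1, 2])
cond_logits = 5.0 * np.eye(4)[tokens]
uncond_logits = np.zeros((3, 4))
score = ratio_score(cond_logits, uncond_logits, tokens)
```

In this toy case the score is positive because the conditional pass assigns the observed tokens more probability than the uniform unconditional pass. In a real setting the threshold would have to be calibrated per model on held-out data, as the abstract indicates.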
