Any-Resolution AI-Generated Image Detection by Spectral Learning

Dimitrios Karageorgiou, Symeon Papadopoulos, Ioannis Kompatsiaris, Efstratios Gavves

TL;DR

This work addresses generalizable AI-generated image detection by modeling the spectral distribution of real images through masked frequency reconstruction in a self-supervised setup. The proposed approach, SPAI, detects generated images as out-of-distribution samples via spectral reconstruction similarity, and introduces two spectral context mechanisms, the spectral context vector (SCV) and spectral context attention (SCA), to preserve and exploit spectral information at any resolution. It improves AUC by 5.5 percentage points over the previous state of the art across 13 generators and is robust to common online perturbations. Because detection relies on the invariant spectral patterns of real images rather than on the artifacts of individual generators, SPAI generalizes to generators not seen during training, which makes it practical for content authenticity and safety applications.

Abstract

Recent works have established that AI models introduce spectral artifacts into generated images and propose approaches for learning to capture them using labeled data. However, the significant differences in such artifacts among different generative models hinder these approaches from generalizing to generators not seen during training. In this work, we build upon the key idea that the spectral distribution of real images constitutes both an invariant and highly discriminative pattern for AI-generated image detection. To model this under a self-supervised setup, we employ masked spectral learning using the pretext task of frequency reconstruction. Since generated images constitute out-of-distribution samples for this model, we propose spectral reconstruction similarity to capture this divergence. Moreover, we introduce spectral context attention, which enables our approach to efficiently capture subtle spectral inconsistencies in images of any resolution. Our spectral AI-generated image detection approach (SPAI) achieves a 5.5% absolute improvement in AUC over the previous state-of-the-art across 13 recent generative approaches, while exhibiting robustness against common online perturbations. Code is available on https://mever-team.github.io/spai.
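
To make the idea of comparing spectral components more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It splits a grayscale image into low- and high-frequency components with a radial mask in the Fourier domain and scores a reconstruction against the original with cosine similarity per component. The function names, the radial mask, the `radius` parameter, and the choice of cosine similarity are all assumptions made for illustration. In SPAI the reconstruction would come from the trained masked spectral learning model, which is not reproduced here.

```python
import numpy as np


def radial_mask(height, width, radius):
    """True for frequencies within `radius` of the DC term.

    Assumes a spectrum that has been centred with np.fft.fftshift.
    """
    ys, xs = np.ogrid[:height, :width]
    dist = np.hypot(ys - height // 2, xs - width // 2)
    return dist <= radius


def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    The two parts use complementary masks, so they sum back to the image.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    low_pass = radial_mask(image.shape[0], image.shape[1], radius)
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_pass)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_pass)).real
    return low, high


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two arrays, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def reconstruction_similarity(original, reconstructed, radius):
    """Hypothetical helper: per-band similarity of a reconstruction.

    `reconstructed` stands in for the output of a trained model.
    """
    low_o, high_o = split_frequencies(original, radius)
    low_r, high_r = split_frequencies(reconstructed, radius)
    return {
        "low": cosine_similarity(low_o, low_r),
        "high": cosine_similarity(high_o, high_r),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    # Placeholder "reconstruction": the image plus noise, purely for illustration.
    noisy = image + 0.1 * rng.standard_normal(image.shape)
    print(reconstruction_similarity(image, noisy, radius=16))
```

Under the paper's premise, a model trained only on real images should reconstruct the frequencies of real images more faithfully than those of generated ones, so lower similarity values would point toward an out-of-distribution (generated) input. How those values are aggregated into a final decision is specific to SPAI and is not covered by this sketch.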

Paper Structure

This paper contains 24 sections, 10 equations, 25 figures, 9 tables.

Figures (25)

  • Figure 1: SPAI employs spectral learning to learn the spectral distribution of real images under a self-supervised setup. Then, using the spectral reconstruction similarity it detects AI-generated images as out-of-distribution samples of this learned model.
  • Figure 2: Overview of the SPAI approach. We learn a model of the spectral distribution of real images under a self-supervised setup using masked spectral learning. Then, we use the spectral reconstruction similarity to measure the divergence from this learned distribution and detect AI-generated images as out-of-distribution samples of this model. Spectral context vector captures the spectral context under which the spectral reconstruction similarity values are computed, while spectral context attention enables the processing of any-resolution images for capturing subtle spectral inconsistencies.
  • Figure 3: Robustness evaluation on common perturbations. Average AUC is presented over the perturbed versions of two sources of real images from smartphones and DSLR cameras respectively and 13 generative models.
  • Figure 4: Qualitative analysis of spectral context attention. A cool-warm overlay has been applied on each patch. Red color indicates significant patches for deciding whether the image is AI-generated (high attention values), while blue color indicates irrelevant patches (low attention values). The attention values have been normalized in $[0, 1]$.
  • Figure 5: Failures in detecting derivative AI-generated images. Similar to Figure 4, spectral context attention is depicted in overlay.
  • ...and 20 more figures