When Detectors Forget Forensics: Blocking Semantic Shortcuts for Generalizable AI-Generated Image Detection

Chao Shuai; Zhenguang Liu; Shaojing Fan; Bin Gong; Weichen Lian; Xiuli Bi; Zhongjie Ba; Kui Ren

TL;DR

The paper proposes Geometric Semantic Decoupling (GSD), a parameter-free module that explicitly removes semantic components from learned representations. It pairs a frozen VFM, used as a semantic guide, with a trainable VFM, used as an artifact detector, which forces the detector to rely on semantic-invariant forensic evidence.

Abstract

AI-generated image detection has become increasingly important with the rapid advancement of generative AI. However, detectors built on Vision Foundation Models (VFMs, \emph{e.g.}, CLIP) often struggle to generalize to images created using unseen generation pipelines. We identify, for the first time, a key failure mechanism, termed \emph{semantic fallback}, where VFM-based detectors rely on dominant pre-trained semantic priors (such as identity) rather than forgery-specific traces under distribution shifts. To address this issue, we propose \textbf{Geometric Semantic Decoupling (GSD)}, a parameter-free module that explicitly removes semantic components from learned representations by leveraging a frozen VFM as a semantic guide with a trainable VFM as an artifact detector. GSD estimates semantic directions from batch-wise statistics and projects them out via a geometric constraint, forcing the artifact detector to rely on semantic-invariant forensic evidence. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art approaches, achieving 94.4\% video-level AUC (+\textbf{1.2\%}) in cross-dataset evaluation, improving robustness to unseen manipulations (+\textbf{3.0\%} on DF40), and generalizing beyond faces to the detection of synthetic images of general scenes, including UniversalFakeDetect (+\textbf{0.9\%}) and GenImage (+\textbf{1.7\%}).
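The core operation described in the abstract, estimating semantic directions from a batch and projecting them out, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `semantic_basis` and `project_out`, the subspace size `k`, and the row-wise feature layout are assumptions made for illustration. The basis comes from NumPy's LAPACK-backed QR factorization, which is Householder-based.

```python
import numpy as np


def semantic_basis(sem_feats: np.ndarray, k: int) -> np.ndarray:
    """Return an orthonormal basis U (d x k) from frozen-encoder features (n x d).

    Assumption: the first k batch samples are used to span the semantic
    subspace; the paper's exact batch statistic is not reproduced here.
    """
    n, d = sem_feats.shape
    if k > min(n, d):
        raise ValueError("k must not exceed min(batch size, feature dim)")
    # Reduced QR of the d x n matrix; columns of q are orthonormal.
    q, _ = np.linalg.qr(sem_feats.T)
    return q[:, :k]


def project_out(feats: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of feats (n x d) lying in span(basis)."""
    parallel = feats @ basis @ basis.T  # semantic component
    return feats - parallel             # de-semanticized features


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frozen = rng.standard_normal((8, 32))     # stand-in for frozen VFM features
    trainable = rng.standard_normal((8, 32))  # stand-in for detector features
    U = semantic_basis(frozen, k=4)
    cleaned = project_out(trainable, U)
    # Residual along the semantic basis should be numerically zero.
    print(np.abs(cleaned @ U).max())
```

Because the projection has no learnable weights, it is consistent with the "parameter-free" description. In a real training loop the same operation would need to be differentiable with respect to the detector features.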

Paper Structure (34 sections, 6 equations, 13 figures, 9 tables)

Introduction
Related work
Methodology
Dynamic Semantic Basis Construction
Geometric Semantic Decoupling
Training Objective
Experiments
Implementation Details
Face Forgery Detection
Synthetic Image Detection
Feature Analysis
Ablation Study
Conclusions
Implementation Details
Datasets and Evaluation Details
...and 19 more sections

Figures (13)

Figure 1: t-SNE visualization of feature distributions extracted by the fine-tuned CLIP encoder on in-domain (FaceForensics++) and cross-domain (CelebDF-v2) datasets. Points are colored by forgery labels (a, c) and face identities (b, d); only 20 identities are shown for clarity. On FaceForensics++ (a, b), real samples form tight identity-centric clusters, while fake samples exhibit clear identity-separated clusters, suggesting that learned forgery artifacts act as a repulsive forensic signal. When transferring to CelebDF-v2 (c, d), semantic fallback becomes pervasive: due to the poor cross-domain transferability of learned forensic cues, a substantial fraction of fake samples almost re-aggregate by identity (e.g., green dashed circles), leading to increased overlap with real samples and reduced real/fake separability. Hard-to-separate samples (e.g., red dashed circles) likewise concentrate within cohesive identity clusters.
Figure 2: Analysis of Semantic Consistency. The distribution of cosine similarities between random samples and the global semantic anchor.
Figure 3: t-SNE visualization of features extracted by the fine-tuned CLIP augmented with the Geometric Semantic Decoupling (GSD) module. Points are colored by forgery labels (a) and face identities (b). Notably, the features exhibit a clear real/fake separation and preserve pronounced identity-separated clusters, indicating that the model primarily relies on forgery-specific features.
Figure 4: Overview of the proposed Geometric Semantic Decoupling (GSD) framework. GSD adopts an asymmetric dual-stream architecture consisting of a frozen semantic basis extractor (bottom) and a trainable artifact detector (top). Unlike prior parameter-efficient adaptations, we estimate a dynamic semantic basis $\boldsymbol{U}$ directly from batch-wise statistics via Householder-based QR decomposition, where $\operatorname{span}(\boldsymbol{U})$ characterizes the dominant semantic manifold encoded by the frozen backbone. The GSD module then enforces an explicit geometric constraint by projecting learnable intermediate detector features $\boldsymbol{F}$ onto the orthogonal complement of the estimated semantic subspace. This parameter-free semantic subtraction removes the semantic component $\boldsymbol{F}^{\parallel}$ and yields de-semanticized features $\boldsymbol{F}'$, compelling the detector to rely solely on generalizable forensic artifacts rather than semantic shortcuts.
Figure 5: Visualization of self-attention maps. Pretrained and naively fine-tuned CLIP exhibit attention collapse with sparse hotspot patterns, and the fine-tuned model produces attention maps that are nearly identical to the pretrained one, suggesting a semantic fallback to strong foundation priors. In contrast, integrating GSD suppresses the dominance of semantic regions and shifts attention toward forensic-relevant cues: for real images, attention concentrates on blending edges and texture-rich regions, while for face-forgery images, it highlights manipulated regions; for synthetic images, attention becomes markedly less localized and spreads across the image.
...and 8 more figures
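
The projection named in the Figure 4 caption can be written compactly. The following is a sketch under assumed conventions that the listing above does not state: features are stacked as rows of $\boldsymbol{F} \in \mathbb{R}^{n \times d}$, and $\boldsymbol{U} \in \mathbb{R}^{d \times k}$ has orthonormal columns, so $\boldsymbol{U}^{\top}\boldsymbol{U} = \boldsymbol{I}_k$. The dimensions $n$, $d$, and $k$ are illustrative symbols, not taken from the paper.

\[
\boldsymbol{F}^{\parallel} = \boldsymbol{F}\,\boldsymbol{U}\boldsymbol{U}^{\top},
\qquad
\boldsymbol{F}' = \boldsymbol{F} - \boldsymbol{F}^{\parallel} = \boldsymbol{F}\left(\boldsymbol{I}_d - \boldsymbol{U}\boldsymbol{U}^{\top}\right)
\]

Under these assumptions, $\boldsymbol{F}'\boldsymbol{U} = \boldsymbol{F}\boldsymbol{U} - \boldsymbol{F}\boldsymbol{U}\left(\boldsymbol{U}^{\top}\boldsymbol{U}\right) = \boldsymbol{0}$, so the de-semanticized features have no component in $\operatorname{span}(\boldsymbol{U})$.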
