Can We Build a Monolithic Model for Fake Image Detection? SICA: Semantic-Induced Constrained Adaptation for Unified-Yet-Discriminative Artifact Feature Space Reconstruction

Bo Du, Xiaochen Ma, Xuekang Zhu, Zhe Yang, Chaogun Niu, Jian Liu, Ji-Zhe Zhou

TL;DR

This work identifies the "heterogeneous phenomenon", the intrinsic distinctness of artifacts across forensic subdomains, hypothesizes that high-level semantics can serve as a structural prior for reconstructing the artifact feature space, and proposes Semantic-Induced Constrained Adaptation (SICA), the first monolithic FID paradigm.

Abstract

Fake Image Detection (FID), which aims at unified detection across four image forensic subdomains, is critical in real-world forensic scenarios. Compared with ensemble approaches, monolithic FID models are theoretically more promising, but to date they have consistently yielded inferior performance in practice. In this work, by discovering the "heterogeneous phenomenon", the intrinsic distinctness of artifacts across subdomains, we diagnose the cause of this underperformance for the first time: the collapse of the artifact feature space driven by this phenomenon. The core challenge in developing a practical monolithic FID model thus boils down to the "unified-yet-discriminative" reconstruction of the artifact feature space. To address this paradoxical challenge, we hypothesize that high-level semantics can serve as a structural prior for the reconstruction, and further propose Semantic-Induced Constrained Adaptation (SICA), the first monolithic FID paradigm. Extensive experiments on our OpenMMSec dataset demonstrate that SICA outperforms 15 state-of-the-art methods and reconstructs the target unified-yet-discriminative artifact feature space in a near-orthogonal manner, thus validating our hypothesis. The code and dataset are available at: https://github.com/scu-zjz/SICA_OpenMMSec.

Paper Structure (40 sections, 8 equations, 11 figures, 11 tables)

Figures (11)

  • Figure 1: t-SNE visualization of semantic feature space and SICA (ours) reconstructed space. SICA leverages semantics to reconstruct unified-yet-discriminative artifact feature space.
  • Figure 2: Faking type overview of OpenMMSec. Zoom in for better visualization of faking types.
  • Figure 3: Examples in OpenMMSec.
  • Figure 4: Weight adaptation illustration of FFT, Effort, and the proposed SICA. (a) Full fine-tuning (FFT) updates the entire parameter space, risking semantic overfitting. (b) Effort explicitly decomposes weights via SVD into principal components (semantics) and residual components (artifacts) and updates only the latter, yielding a rigid and suboptimal inductive bias. (c) Our SICA freezes the pre-trained weights and introduces a low-rank update to co-adapt semantics and artifacts.
  • Figure 5: Spectral analysis of the left and right subspaces via SVD. SICA exhibits a higher outside energy ratio and lower cosine similarity with respect to the dominant semantic subspace, indicating that it learns less semantic content, which reduces the risk of semantic overfitting and enables better artifact learning. Please refer to the main text for a more detailed description.
  • ...and 6 more figures
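
The three weight-adaptation schemes contrasted in the Figure 4 caption can be sketched on a toy matrix. This is a minimal NumPy illustration, not the authors' implementation: the matrix sizes, the split index `k`, the rank `r`, and the zero initialization of `B` are all assumptions chosen for the example, and `outside_energy_ratio` is one plausible reading of the "outside energy ratio" named in the Figure 5 caption (the share of an update's Frobenius energy that lies outside the top-`k` left singular subspace), not a definition taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, k, r = 8, 6, 2, 2  # hypothetical sizes, split index, and rank

# Stand-in for one pre-trained weight matrix.
W0 = rng.standard_normal((d_out, d_in))

# (a) Full fine-tuning: every entry of W0 is trainable.
fft_trainable = W0.size

# (b) Effort-style split: SVD separates the top-k principal part
#     (treated as semantics, frozen) from the residual part (trainable).
U, S, Vt = np.linalg.svd(W0, full_matrices=False)
W_principal = (U[:, :k] * S[:k]) @ Vt[:k]
W_residual = (U[:, k:] * S[k:]) @ Vt[k:]

# (c) Low-rank update: W0 stays frozen and only A and B are trained.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # assumed zero init, so the update starts at zero
W_adapted = W0 + B @ A
lowrank_trainable = A.size + B.size


def outside_energy_ratio(delta, U_k):
    """Fraction of delta's squared Frobenius norm outside span(U_k).

    Assumed definition for illustration; U_k must have orthonormal columns.
    """
    inside = np.linalg.norm(U_k.T @ delta) ** 2
    total = np.linalg.norm(delta) ** 2
    return 1.0 - inside / total


# An update confined to the principal subspace has ratio ~0;
# a generic random update has a ratio strictly between 0 and 1.
delta_inside = U[:, :k] @ rng.standard_normal((k, d_in))
delta_random = rng.standard_normal((d_out, d_in))
ratio_inside = outside_energy_ratio(delta_inside, U[:, :k])
ratio_random = outside_energy_ratio(delta_random, U[:, :k])
```

With these toy sizes the low-rank update trains 28 values against 48 for full fine-tuning, and the principal and residual parts sum back to `W0` exactly, which is what makes the SVD split a hard partition of the weight rather than an additive correction.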