Universal Anti-forensics Attack against Image Forgery Detection via Multi-modal Guidance
Haipeng Li, Rongxuan Peng, Anwei Luo, Shunquan Tan, Changsheng Chen, Anastasia Antsiferova
TL;DR
This work identifies a systemic vulnerability in AIGC forensics: the widespread use of shared upstream backbones such as CLIP enables universal anti-forensics attacks. It introduces ForgeryEraser, a universal framework whose multi-modal guidance loss $L_{MMG}$ steers forged image embeddings toward text-derived authentic anchors while repelling them from forgery anchors, using a source-aware strategy that distinguishes global synthesis from local editing. The method substantially degrades six detectors on both global synthesis and local editing benchmarks, remains effective under common distortions, and can even steer detectors' explanations toward those expected for authentic images. The results highlight the need to rethink reliance on upstream semantic representations and to develop defenses that withstand semantic-level manipulation and threats to interpretability in digital media forensics.
Abstract
The rapid advancement of AI-Generated Content (AIGC) technologies poses significant challenges for authenticity assessment. However, existing evaluation protocols largely overlook anti-forensics attacks and therefore fail to ensure the robustness of state-of-the-art AIGC detectors in real-world applications. To bridge this gap, we propose ForgeryEraser, a framework designed to execute universal anti-forensics attacks without access to the target AIGC detectors. We reveal an adversarial vulnerability stemming from the systemic reliance on Vision-Language Models (VLMs) such as CLIP as shared backbones, whereby downstream AIGC detectors inherit the feature space of these publicly accessible models. Instead of traditional logit-based optimization, we design a multi-modal guidance loss that drives forged image embeddings within the VLM feature space toward text-derived authentic anchors to erase forgery traces, while repelling them from forgery anchors. Extensive experiments demonstrate that ForgeryEraser causes substantial performance degradation in advanced AIGC detectors on both global synthesis and local editing benchmarks. Moreover, ForgeryEraser induces explainable forensic models to generate, for forged images, explanations consistent with authentic images. Our code will be made publicly available.
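
To make the attract/repel idea concrete, the following is a minimal sketch of one plausible form such a guidance objective could take. It is not the paper's definition of $L_{MMG}$, which is not given in this abstract. The function names (`cosine`, `guidance_loss`), the use of cosine similarity, and the simple averaging over anchors are all assumptions made for illustration. In an actual attack the embeddings would come from the shared VLM's image and text encoders, and the loss would be minimized with respect to a perturbation of the forged image; neither step is shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def guidance_loss(image_emb, authentic_anchors, forgery_anchors):
    """Hypothetical attract/repel loss (assumed form, not the paper's).

    Lower values mean the image embedding is closer, on average, to the
    text-derived authentic anchors and farther from the forgery anchors.
    """
    attract = sum(cosine(image_emb, a) for a in authentic_anchors) / len(authentic_anchors)
    repel = sum(cosine(image_emb, f) for f in forgery_anchors) / len(forgery_anchors)
    return repel - attract


# Toy 2-D anchors standing in for text embeddings of "authentic" and
# "forged" prompts (illustrative values only).
authentic = [[1.0, 0.0]]
forged = [[0.0, 1.0]]

loss_near_authentic = guidance_loss([1.0, 0.0], authentic, forged)
loss_near_forged = guidance_loss([0.0, 1.0], authentic, forged)
```

In this toy setting, an embedding aligned with the authentic anchor yields a lower loss than one aligned with the forgery anchor, which is the direction an optimizer would push the forged image's embedding.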
