Transferable Dual-Domain Feature Importance Attack against AI-Generated Image Detector
Weiheng Zhu, Gang Cao, Jing Liu, Lifang Yu, Shaowei Weng
TL;DR
AI-generated image detectors are vulnerable to adversarial manipulation, and measuring how well attacks transfer across models is essential for assessing their security. DuFIA is a dual-domain feature importance attack that jointly exploits spatial and frequency perturbations to guide a mid-layer feature loss, improving cross-detector transferability without excessive perceptual distortion. In extensive experiments on a wide range of detectors and generators, DuFIA achieves stronger transferability and robustness to common post-processing than state-of-the-art attacks. The work provides a practical framework for antiforensics evaluation and offers code for reproducibility.
Abstract
Recent AI-generated image (AIGI) detectors achieve impressive accuracy under clean conditions. From an antiforensics perspective, it is important to develop advanced adversarial attacks for evaluating the security of such detectors, a topic that remains insufficiently explored. This letter proposes a Dual-domain Feature Importance Attack (DuFIA) scheme to invalidate AIGI detectors to some extent. Forensically important features are captured by the spatially interpolated gradient and frequency-aware perturbation. Adversarial transferability is enhanced by jointly modeling spatial-domain and frequency-domain feature importance, which are fused to guide the optimization-based generation of adversarial examples. Extensive experiments across various AIGI detectors verify the cross-model transferability, transparency and robustness of DuFIA.
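The pipeline outlined in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the detector is a hypothetical stand-in (a random linear feature map followed by a tanh-based logit), and the interpolation path, the multiplicative spectral noise, the equal-weight fusion, and the sign-gradient update are all assumptions chosen only to make the idea concrete. All function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy detector (assumption): mid-layer features z = W x,
# "AI-generated" logit = v . tanh(z). Images are flattened 8x8 arrays in [0, 1].
H, D, F = 8, 64, 16
W = rng.normal(size=(F, D)) / np.sqrt(D)
v = rng.normal(size=F)

def features(x):
    return W @ x

def grad_logit_wrt_features(x):
    # d logit / d z for logit = v . tanh(z)
    return v * (1.0 - np.tanh(features(x)) ** 2)

def spatial_importance(x, steps=8):
    # Assumed form of "spatially interpolated gradient": average the feature
    # gradients over images interpolated between a black image and x.
    grads = [grad_logit_wrt_features(x * k / steps) for k in range(1, steps + 1)]
    return np.mean(grads, axis=0)

def frequency_importance(x, samples=8, sigma=0.1):
    # Assumed form of "frequency-aware perturbation": average the feature
    # gradients over copies of x whose 2-D spectrum is randomly rescaled.
    spectrum = np.fft.fft2(x.reshape(H, H))
    grads = []
    for _ in range(samples):
        noisy = spectrum * (1.0 + sigma * rng.normal(size=spectrum.shape))
        x_f = np.real(np.fft.ifft2(noisy)).reshape(-1)
        grads.append(grad_logit_wrt_features(x_f))
    return np.mean(grads, axis=0)

def fused_importance(x):
    # Equal-weight fusion followed by L2 normalisation (an assumption).
    imp = 0.5 * (spatial_importance(x) + frequency_importance(x))
    return imp / (np.linalg.norm(imp) + 1e-12)

def feature_loss(x_adv, imp):
    # Importance-weighted mid-layer feature loss to be minimised.
    return float(np.sum(imp * features(x_adv)))

def attack(x, eps=8 / 255, alpha=2 / 255, iters=10):
    imp = fused_importance(x)
    x_adv = x.copy()
    for _ in range(iters):
        grad_x = W.T @ imp                        # exact gradient: features are linear in x
        x_adv = x_adv - alpha * np.sign(grad_x)   # sign-gradient descent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the L-infinity budget
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv, imp

x = rng.uniform(0.2, 0.8, size=D)
x_adv, imp = attack(x)
```

In this sketch the fused importance map is computed once from the clean image and then held fixed while the perturbation is optimised. With a real detector the gradient with respect to the input would come from automatic differentiation rather than from the closed form `W.T @ imp`.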
