Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection

Caiyun Xie; Dengpan Ye; Yunming Zhang; Long Tang; Yunna Lv; Jiacheng Deng; Jiawei Song

TL;DR

The paper tackles the vulnerability of GAN- and diffusion-based AIGC detectors to adversarial attacks in real-world settings. It introduces R$^2$BA, a realistic-like robust black-box attack that fuses Gaussian blur, JPEG compression, Gaussian noise, and light spots, optimized via stochastic PSO with inertia decay to cross the detector boundary at a fake probability of $0.5$ while preserving image quality. The authors demonstrate substantial improvements in anti-detection performance (up to 38–41% ASR gains) and image invisibility (BRISQUE/SSIM) across multiple detectors and datasets, including a commercial API. The work highlights the practical security risks in AIGC detection and provides a benchmark for evaluating detector robustness under realistic post-processing conditions.

Abstract

The security of AI-generated content (AIGC) detection is crucial for ensuring the credibility of multimedia content. To enhance detector security, research on adversarial attacks has become essential. However, most existing adversarial attacks focus only on the detection of GAN-generated facial images, struggle to be effective on multi-class natural images and diffusion-based detectors, and exhibit poor invisibility. To fill this gap, we first conduct an in-depth analysis of the vulnerability of AIGC detectors and find that detectors vary in their vulnerability to different post-processing operations. Then, given this finding and considering that the detector is unknown in real-world scenarios, we propose a Realistic-like Robust Black-box Adversarial attack (R$^2$BA) with post-processing fusion optimization. Unlike typical perturbations, R$^2$BA uses real-world post-processing, i.e., Gaussian blur, JPEG compression, Gaussian noise, and light spots, to generate adversarial examples. Specifically, we use a stochastic particle swarm algorithm with inertia decay to optimize the post-processing fusion intensity and explore the detector's decision boundary. Guided by the detector's fake probability, R$^2$BA strengthens the post-processing operations to which the detector is vulnerable and weakens those to which it is robust, striking a balance between adversariality and invisibility. Extensive experiments on popular and commercial AIGC detectors and datasets demonstrate that R$^2$BA exhibits impressive anti-detection performance, excellent invisibility, and strong robustness in both GAN-based and diffusion-based cases. Compared to state-of-the-art white-box and black-box attacks, R$^2$BA shows significant improvements of 15\%--72\% and 21\%--47\% in anti-detection performance under the original and robust scenarios, respectively, offering valuable insights for the security of AIGC detection in real-world applications.
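To make the optimization idea in the abstract concrete, the following is a minimal sketch of a particle swarm search with linearly decaying inertia over four post-processing intensities (blur, JPEG, noise, light spot), each normalized to $[0, 1]$. It is not the authors' implementation: the function names, the toy detector `toy_fake_prob` with its sensitivity weights, the two-level fitness, and all hyperparameters are assumptions chosen for illustration, and a real attack would query an actual detector on the post-processed image instead.

```python
import math
import random


def toy_fake_prob(intensities):
    """Hypothetical stand-in for a black-box detector's fake probability.

    The weights are made-up sensitivities to (blur, JPEG, noise, light spot);
    a larger weight means the detector is more vulnerable to that operation.
    """
    weights = [3.0, 0.5, 2.0, 1.0]
    z = 2.0 - sum(w * v for w, v in zip(weights, intensities))
    return 1.0 / (1.0 + math.exp(-z))


def pso_inertia_decay(fake_prob, dim=4, n_particles=20, n_iters=60,
                      w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Search intensities in [0, 1]^dim that cross the 0.5 boundary cheaply."""
    rng = random.Random(seed)

    def fitness(x):
        p = fake_prob(x)
        if p >= 0.5:
            # Still classified as fake: always worse than any evading point.
            return 1.0 + p
        # Evading: prefer a low mean intensity as a crude invisibility proxy.
        return sum(x) / dim

    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(n_iters):
        # Inertia decays linearly from w_start to w_end over the run.
        w = w_start - (w_start - w_end) * t / max(n_iters - 1, 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each intensity to its valid range.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

Calling `pso_inertia_decay(toy_fake_prob)` returns the best intensity vector found and its fitness. Under the toy weights, the search tends to favor the operations with large weights (the ones this stand-in detector is most vulnerable to) and to keep the others weak, which mirrors the strengthen/weaken behavior described above.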
