Exploring the Adversarial Robustness of Face Forgery Detection with Decision-based Black-box Attacks

Zhaoyu Chen, Bo Li, Kaixun Jiang, Shuang Wu, Shouhong Ding, Wenqiang Zhang

TL;DR

Face forgery detectors remain vulnerable to decision-based adversarial examples under label-only access. The authors introduce Cross-task Perturbation Initialization (CPI) and a Frequency Decision-based Attack (FDA) to improve attack efficacy while preserving image quality, operating under the constraint $||\delta||_\infty \le \epsilon$. Evaluations on FaceForensics++ and CelebDF, including industrial APIs, demonstrate state-of-the-art attack performance and strong transferability, revealing practical security gaps in both spatial- and frequency-domain detectors. The work underscores the risk of concurrent attacks on face recognition and forgery detection and points to the need for robust defenses against frequency-domain perturbations in real-world deployments.

Abstract

Face forgery generation technologies produce vivid faces, which has raised public concerns about security and privacy. Many intelligent systems, such as electronic payment and identity verification, rely on face forgery detection. Although face forgery detection has successfully distinguished fake faces, recent studies have demonstrated that face forgery detectors are highly vulnerable to adversarial examples. Meanwhile, existing attacks rely on network architectures or training datasets instead of the predicted labels, which leaves a gap when attacking deployed applications. To narrow this gap, we first explore decision-based attacks on face forgery detection. We identify challenges in directly applying existing decision-based attacks, such as perturbation initialization failure and reduced image quality. To overcome these issues, we propose cross-task perturbation, which handles initialization failures by exploiting the high correlation of face features across different tasks. Additionally, inspired by the use of frequency cues in face forgery detection, we introduce the frequency decision-based attack. This attack adds perturbations in the frequency domain while constraining visual quality in the spatial domain. Finally, extensive experiments demonstrate that our method achieves state-of-the-art attack performance on FaceForensics++, CelebDF, and industrial APIs, with high query efficiency and guaranteed image quality. Further, the fake faces generated by our method can pass both face forgery detection and face recognition, which exposes the security problems of face forgery detectors.
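
The abstract describes adding perturbations in the frequency domain while constraining the change in the spatial domain. The following is a minimal, dependency-free sketch of that idea only, not the authors' implementation. It assumes a square grayscale image with pixels in [0, 1], an orthonormal 2D DCT as the frequency transform, and Gaussian noise on the coefficients; the transform choice, the noise model, and every name used here (`dct2`, `idct2`, `frequency_perturb`, `sigma`) are assumptions made for illustration.

```python
import math
import random


def dct_matrix(n):
    """Return the orthonormal DCT-II basis as an n x n nested list."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(img):
    """2D DCT of a square image: D @ X @ D^T."""
    d = dct_matrix(len(img))
    return matmul(matmul(d, img), transpose(d))


def idct2(coef):
    """Inverse 2D DCT: D^T @ C @ D (valid because D is orthonormal)."""
    d = dct_matrix(len(coef))
    return matmul(matmul(transpose(d), coef), d)


def frequency_perturb(img, eps, sigma, rng):
    """Add noise in the DCT domain, then enforce ||delta||_inf <= eps in pixel space.

    Hypothetical helper: the Gaussian noise model is an assumption, not the paper's.
    """
    coef = dct2(img)
    noisy = [[c + rng.gauss(0.0, sigma) for c in row] for row in coef]
    candidate = idct2(noisy)
    out = []
    for row_x, row_c in zip(img, candidate):
        new_row = []
        for x, c in zip(row_x, row_c):
            delta = min(eps, max(-eps, c - x))              # project onto the L_inf ball
            new_row.append(min(1.0, max(0.0, x + delta)))   # keep a valid pixel value
        out.append(new_row)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[rng.random() for _ in range(8)] for _ in range(8)]
    adv = frequency_perturb(image, eps=8 / 255, sigma=0.5, rng=rng)
    worst = max(abs(a - x) for ra, rx in zip(adv, image) for a, x in zip(ra, rx))
    print("max pixel change:", worst)
```

In a decision-based setting, each candidate produced this way would be submitted to the detector, and only the returned label would decide whether the candidate is kept.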

Paper Structure

This paper contains 19 sections, 5 equations, 6 figures, 12 tables, 3 algorithms.

Figures (6)

  • Figure 1: Directly applying existing decision-based attacks to face forgery detection is prone to (a) perturbation initialization failure and (b) low image quality, which degrades attack performance and limits applications that involve faces (i.e., simultaneously attacking face recognition and face forgery detection). ✔ indicates that the face is recognized as the original identity or detected as real.
  • Figure 2: Empirical analysis of cosine similarity (%) on intermediate features between forgery detectors and face recognition models under clean examples and adversarial examples (FGSM, PGD, and MIM). Here, the face recognition models are FaceNet, CosFace, and ArcFace. The face forgery detection models are ResNet50, Xception, and EfficientNet-b4 (Eb4). We find that face recognition models and face forgery detectors have a high correlation between the intermediate features of the same face. This means that a perturbation crafted to attack face recognition can also perturb face forgery detection to a certain extent.
  • Figure 3: (a) Overview of cross-task perturbation initialization. Exploiting the high correlation of intermediate features between face-related tasks, we iteratively update the real face, keeping it adversarial while improving efficiency for subsequent attacks. (b) The pipeline of the frequency decision-based attack. We first obtain the initial perturbation via perturbation initialization. Then, we iterate frequency noise projection and random perturbation flip until $c(x_{adv})=0$ and $||\delta||_\infty \leq \epsilon$.
  • Figure 4: Ablation study on intermediate feature layers. We find that middle-layer features tend to have better attack performance. In addition, FaceNet has a better initialization effect than CPI's face recognition model.
  • Figure 5: Visual qualitative evaluation on SignFlip, RayS, and ours on FaceForensics++. From top to bottom, the forged faces are from Deepfake, Face2Face, FaceSwap, and NeuralTextures.
  • ...and 1 more figure
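
Figure 2 reports cosine similarity, in percent, between intermediate features of forgery detectors and face recognition models. As a rough illustration of that metric, the sketch below computes it for two feature vectors. It assumes the features have already been flattened to vectors of equal length; how the paper aligns feature shapes across different networks is not stated here, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_percent(u, v):
    """Cosine similarity scaled to percent, the unit used in Figure 2."""
    return 100.0 * cosine_similarity(u, v)


if __name__ == "__main__":
    # Toy vectors standing in for flattened intermediate features (illustrative only).
    detector_features = [0.2, 0.5, 0.1, 0.9]
    recognizer_features = [0.25, 0.45, 0.05, 1.0]
    print(similarity_percent(detector_features, recognizer_features))
```

A value near 100 means the two feature vectors point in almost the same direction, which is the sense in which the caption speaks of a high correlation between tasks.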