Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection

Jiawei Liang, Siyuan Liang, Aishan Liu, Xiaojun Jia, Junhao Kuang, Xiaochun Cao

TL;DR

The paper exposes a backdoor threat in face forgery detection and proposes the Poisoned Forgery Face framework, which combines translation-sensitive triggers with landmark-guided embedding to implant backdoors in detectors through clean-label poisoning. It introduces a scalable trigger generator that maximizes trigger discrepancy under translation and a fast, stealthy embedding scheme, enabling effective backdoors against both deepfake and blending detectors. Experiments on FF++, Celeb-DF-2, and DFD show a +16.39% BD-AUC gain over prior backdoor baselines, strong cross-dataset transfer, and resilience to defenses, while benign detection performance is preserved. The work highlights an important security risk and provides a foundation for developing robust defenses against backdoor attacks in face forgery detection.

Abstract

The proliferation of face forgery techniques has raised significant societal concerns and motivated the development of face forgery detection methods. These methods aim to distinguish forged faces from genuine ones and have proven effective in practical applications. However, this paper introduces a novel and previously unrecognized threat in face forgery detection scenarios: backdoor attacks. By embedding backdoors into models and incorporating specific trigger patterns into the input, attackers can deceive detectors into producing erroneous predictions for forged faces. To this end, this paper proposes the Poisoned Forgery Face framework, which enables clean-label backdoor attacks on face forgery detectors. Our approach constructs a scalable trigger generator and uses a novel convolving process to generate translation-sensitive trigger patterns. Moreover, we employ a relative embedding method based on landmark-based regions to enhance the stealthiness of the poisoned samples. Consequently, detectors trained on our poisoned samples are embedded with backdoors. Notably, our approach surpasses state-of-the-art (SoTA) backdoor baselines with a significant improvement in attack success rate (+16.39% BD-AUC) and a reduction in visibility (-12.65% $L_\infty$). Furthermore, our attack exhibits promising performance against backdoor defenses. We anticipate that this paper will draw greater attention to the potential threats posed by backdoor attacks in face forgery detection scenarios. Our code will be made available at https://github.com/JWLiang007/PFF
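
The abstract relies on two measurable notions: how visible a perturbation is (reported as $L_\infty$) and how strongly a trigger pattern changes when it is translated. The NumPy sketch below illustrates both on toy data. It is not the authors' implementation: the function names, the checkerboard pattern, the square mask standing in for a landmark-based region, and the strength value are all assumptions made for illustration.

```python
import numpy as np


def linf_visibility(clean, poisoned):
    """Largest absolute per-pixel change between two images (L-infinity norm)."""
    diff = poisoned.astype(np.float64) - clean.astype(np.float64)
    return float(np.max(np.abs(diff)))


def translation_discrepancy(trigger, dx, dy):
    """Mean absolute difference between a pattern and a circularly shifted copy.

    A larger value means the pattern is more sensitive to translation.
    """
    shifted = np.roll(trigger, shift=(dy, dx), axis=(0, 1))
    return float(np.mean(np.abs(trigger - shifted)))


def embed_in_region(image, trigger, mask, strength):
    """Add a scaled pattern only where mask == 1, keeping pixels in [0, 1]."""
    return np.clip(image + strength * mask * trigger, 0.0, 1.0)


# Toy data (all hypothetical): an 8x8 grayscale image with values in [0.2, 0.8].
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8))

# A constant pattern is unchanged by shifting; a +1/-1 checkerboard is not.
flat = np.ones((8, 8))
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 2.0 - 1.0

# A square mask stands in for a region derived from facial landmarks.
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0

poisoned = embed_in_region(image, checker, mask, strength=0.05)

flat_shift = translation_discrepancy(flat, dx=1, dy=0)
checker_shift = translation_discrepancy(checker, dx=1, dy=0)
visibility = linf_visibility(image, poisoned)
```

In this toy setup the constant pattern has zero discrepancy under a one-pixel shift while the checkerboard has the maximum possible discrepancy, and the visibility of the embedded pattern is bounded by the chosen strength. Restricting the pattern to a mask leaves every pixel outside the region untouched.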

Paper Structure

This paper contains 18 sections, 17 equations, 4 figures, and 7 tables.

Figures (4)

  • Figure 1: This paper reveals a potential hazard in face forgery detection: an attacker can embed a backdoor into a face forgery detector by maliciously manipulating samples in the training dataset. The attacker can then use the specific backdoor trigger to make the infected detector classify fake images as real.
  • Figure 2: The pipeline of our proposed Poisoned Forgery Face backdoor attack framework.
  • Figure 3: Visualization of poisoned samples generated using different backdoor attack methods.
  • Figure A.1: The network architecture of the generator.