Can We Leave Deepfake Data Behind in Training Deepfake Detector?

Jikang Cheng; Zhiyuan Yan; Ying Zhang; Yuhao Luo; Zhongyuan Wang; Chen Li

TL;DR

This work investigates whether deepfake data must be included when training deepfake detectors, arguing that poor latent-space organization in vanilla hybrid training hinders cross-dataset generalization. It introduces the Oriented Progressive Regularizer (OPR), which organizes four aligned anchors (real, SBI, CBI, and deepfake) into a progressive latent-space structure. OPR is complemented by feature bridging, which simulates a continuous transition between anchors, and by a transition loss that reinforces the progression. Empirical results on FF++ and several external datasets show improved cross-dataset AUC and robustness, with ablations validating the necessity of each component and the effectiveness of a triplet-binary attribute strategy. The approach generalizes better by leveraging both blendfake and deepfake information, and the work introduces metrics such as $mPD$ to quantify latent-space regularity.

Abstract

The generalization ability of deepfake detectors is vital for their application in real-world scenarios. One effective way to enhance this ability is to train models on manually blended data, which we term "blendfake", encouraging models to learn generic forgery artifacts such as blending boundaries. Interestingly, current SoTA methods use blendfake without incorporating any deepfake data in training. This is likely because previous empirical observations suggest that vanilla hybrid training (VHT), which combines deepfake and blendfake data, performs worse than training on blendfake data alone (the so-called "1+1<2" effect). Therefore, a critical question arises: can we leave deepfake behind and rely solely on blendfake data to train an effective deepfake detector? Since deepfakes also contain additional informative forgery clues (e.g., deep generative artifacts), excluding all deepfake data when training deepfake detectors seems counter-intuitive. In this paper, we rethink the role of blendfake in detecting deepfakes and formulate the process from "real to blendfake to deepfake" as a progressive transition. Specifically, blendfake and deepfake can be explicitly delineated as oriented pivot anchors in the "real-to-fake" transition. The accumulation of forgery information should be oriented and progressively increasing along this transition. To this end, we propose an Oriented Progressive Regularizer (OPR) that imposes constraints compelling the anchor distributions to be discretely arranged. Furthermore, we introduce feature bridging to facilitate smooth transitions between adjacent anchors. Extensive experiments confirm that our design effectively and comprehensively leverages forgery information from both blendfake and deepfake.
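The progressive "real to blendfake to deepfake" ordering and the idea of bridging adjacent anchors can be illustrated with a small sketch. This is not the authors' implementation: the cumulative binary label encoding, the linear interpolation used for bridging, and the helper names `anchor_label`, `bridge`, and `forgery_amount` are all assumptions made here for illustration. Only the anchor order and the four anchor names come from the text above.

```python
# Anchors in the progressive order described in the text.
ANCHORS = ["real", "SBI", "CBI", "deepfake"]


def anchor_label(index, n_attributes=3):
    """Assumed encoding: the k-th anchor has its first k attributes switched on.

    This makes the amount of forgery information grow monotonically
    along the anchor order (hypothetical, not taken from the paper).
    """
    return [1.0 if j < index else 0.0 for j in range(n_attributes)]


def bridge(feat_a, feat_b, label_a, label_b, alpha):
    """Assumed bridging: linearly interpolate features and labels
    of two adjacent anchors with mixing coefficient alpha in [0, 1]."""
    mixed_feat = [(1 - alpha) * a + alpha * b for a, b in zip(feat_a, feat_b)]
    mixed_label = [(1 - alpha) * a + alpha * b for a, b in zip(label_a, label_b)]
    return mixed_feat, mixed_label


def forgery_amount(label):
    """Scalar proxy for accumulated forgery information (sum of attributes)."""
    return sum(label)


labels = {name: anchor_label(i) for i, name in enumerate(ANCHORS)}

# Toy 2-D features standing in for encoder outputs of one SBI and one CBI sample.
sbi_feat, cbi_feat = [1.0, 0.0], [2.0, 1.0]
mid_feat, mid_label = bridge(sbi_feat, cbi_feat, labels["SBI"], labels["CBI"], alpha=0.5)

print(mid_feat)   # [1.5, 0.5]
print(mid_label)  # [1.0, 0.5, 0.0]
```

Under these assumptions, a bridged sample halfway between SBI and CBI carries a forgery amount strictly between those of its two anchors, which is the kind of continuous, oriented progression the abstract describes.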

Paper Structure (26 sections, 10 equations, 8 figures, 7 tables, 1 algorithm)


Introduction
Related Works
Deepfake Detection Toward Generalization Ability
Deepfake Detectors with Blendfake Faces
Methodology
Anchoring Oriented Distributions in Latent Space
Simulating Continuous Transition via Feature Bridging
Loss Function
Experiments
Experimental Setting
Overall Performance on Comprehensive Datasets
Ablation Study
Analysis of Learned Feature in Latent Space
Evaluation on Alternative Organized Distribution
Conclusion and Discussions
...and 11 more sections

Figures (8)

Figure 1: (a) Detection performance declines abnormally when deepfake and blendfake are naively combined as the negative samples for training, even though this enriches the forgery information. (b) Illustrative example of latent-space organization. With a progressively organized latent space (ours), information in both deepfake and blendfake is effectively leveraged, and deepfake samples become easier to distinguish from real ones. See Fig. 4(a) and the supplementary t-SNE figure for the actual experimental latent-space distributions.
Figure 2: The progressive transition from real to fake, where blendfake and deepfake are explicitly delineated as the oriented pivot anchors according to their inherent forgery attributes.
Figure 3: Overall pipeline of our method.
Figure 4: (a) Illustration of feature organization: our method organizes the anchors progressively, whereas VHT is unorganized and fails to separate blendfake from deepfake. (b) Illustration of feature regularity. The heatmap values represent the PD at each point, and mPD is the mean PD over the distribution; a smaller mPD implies better feature regularity. The results show that our method achieves a smaller mPD and thus superior regularity.
Figure 5: Examples of the real-to-fake progressive transition.
...and 3 more figures
