Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing

Xinghe Fu, Zhiyuan Yan, Taiping Yao, Shen Chen, Xi Li

TL;DR

This work addresses the generalization gap in deepfake detection by identifying position bias and content bias as key spurious factors. It introduces UDD, a plug-and-play framework that applies token-level shuffling and mixing in vision transformers, coupled with feature-level contrastive loss and logit-level alignment to learn unbiased representations. A causal analysis supports the interventions as blocking backdoor paths between biases and labels, yielding detectors that generalize better across unseen datasets. Extensive experiments across FF++-based and external datasets demonstrate improved cross-dataset AUC and robustness, with ablations confirming the contributions of both branches and the alignment losses. The approach offers a practical, scalable path to more reliable deepfake detection in real-world deployments, especially when encountering distribution shifts.

Abstract

The generalization problem is broadly recognized as a critical challenge in detecting deepfakes. Most previous work attributes the generalization gap to differences among forgery methods. However, our investigation reveals that the generalization issue can still occur when forgery-irrelevant factors shift. In this work, we identify two biases that detectors are also prone to overfit: position bias and content bias, as depicted in Fig. 1. For position bias, we observe that detectors tend to lazily depend on specific positions within an image (e.g., central regions, even when no forgery is present). For content bias, we argue that detectors may mistakenly use forgery-unrelated information for detection (e.g., background and hair). To intervene on these biases, we propose two branches that shuffle and mix tokens in the latent space of transformers. In the shuffling branch, we rearrange the tokens and their corresponding position embeddings for each image while maintaining local correlation. In the mixing branch, we randomly select and mix tokens in the latent space between two images with the same label within the mini-batch to recombine the content information. During training, we align the detector outputs from the different branches in both feature space and logit space. Contrastive losses on features and divergence losses on logits are applied to obtain unbiased feature representations and classifiers. We verify the effectiveness of our method through extensive experiments on widely used evaluation datasets.
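The two token-level operations described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the window-based permutation (used here as one possible way to keep local correlation), the replacement-style mixing, the `ratio` parameter, and all function names are assumptions made for illustration. Tokens are plain Python lists so the example runs with the standard library only.

```python
import random

def block_permutation(grid_h, grid_w, block, rng):
    """Permutation of token indices that moves whole block x block windows
    while keeping the token layout inside each window (assumed scheme)."""
    assert grid_h % block == 0 and grid_w % block == 0
    bw = grid_w // block
    windows = list(range((grid_h // block) * bw))
    rng.shuffle(windows)  # source window assigned to each destination window
    perm = [0] * (grid_h * grid_w)
    for dst, src in enumerate(windows):
        dr, dc = divmod(dst, bw)
        sr, sc = divmod(src, bw)
        for i in range(block):
            for j in range(block):
                dst_idx = (dr * block + i) * grid_w + (dc * block + j)
                src_idx = (sr * block + i) * grid_w + (sc * block + j)
                perm[dst_idx] = src_idx
    return perm

def shuffle_tokens(tokens, pos_embed, perm):
    """Apply one permutation to tokens and their position embeddings alike."""
    return [tokens[i] for i in perm], [pos_embed[i] for i in perm]

def mix_tokens(tokens_a, tokens_b, ratio, rng):
    """Replace a random subset of positions in A with the tokens of B.
    A and B are assumed to come from two images with the same label."""
    n = len(tokens_a)
    chosen = set(rng.sample(range(n), int(round(ratio * n))))
    mixed = [tokens_b[i] if i in chosen else tokens_a[i] for i in range(n)]
    return mixed, chosen

# Toy usage: a 4x4 token grid with one-dimensional "embeddings".
rng = random.Random(0)
perm = block_permutation(4, 4, 2, rng)
tokens = [[float(i)] for i in range(16)]
pos_embed = [[100.0 + i] for i in range(16)]
s_tok, s_pos = shuffle_tokens(tokens, pos_embed, perm)
other = [[-float(i)] for i in range(16)]
mixed, chosen = mix_tokens(tokens, other, 0.25, rng)
```

In this sketch each token stays paired with its own position embedding after shuffling, and neighbouring tokens inside a window remain neighbours; whether the paper uses windows, and how it combines mixed tokens, is not stated in the text above.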

Paper Structure (31 sections, 6 equations, 11 figures, 12 tables)

Figures (11)

  • Figure 1: We present two identified biases in deepfake detection: (1) position bias and (2) content bias. (a) For position bias, we find that the detector may focus more on the image center, regardless of whether a forgery is present. For content bias, we observe that detectors mistakenly concentrate on forgery-irrelevant features (e.g., background and hair). These biases can cause spurious correlations and lead to a biased detector. (b) Our method intervenes on the biases and helps establish a more robust model.
  • Figure 2: The overall pipeline of the proposed framework. During training, the input image is sent to the original, shuffling, and mixing branches. The shuffling branch (S-Branch) applies a random intervention on position information with the shuffle module at the embedding layer. The mixing branch (M-Branch) applies an intervention on content information with the mix module between randomly selected blocks. Both operations act on token-level representations in the latent space. All branches share the parameters of the network. The contrastive loss $\mathcal{L}_{con}$ and the alignment loss $\mathcal{L}_{align}$ are applied across branches to obtain an unbiased forgery representation and classifier.
  • Figure 3: Causal graph illustrating the proposed framework (right) versus the baseline (left). Within the graph, $X$ and $Y$ are observed. The unobserved confounder $U$ creates a backdoor path from the position ($Z_p$) and content ($Z_c$) variables to the label $Y$. Unlike the baseline, our proposed unbiased learning method performs an intervention that blocks the backdoor paths, so that an unbiased detector can be trained.
  • Figure 4: Robustness evaluation results on the test set of FF++ (c23). Video-level AUC (%) is reported under five types of perturbations, following DeeperForensics-1.0 (Jiang et al., 2020). Our method is more robust than previous methods across corruptions.
  • Figure 5: Visualization of multi-head attention. We select attention maps with clear activation in the last layer of the ViT models. Squares and circles indicate the position bias and content bias in the baseline model.
  • ...and 6 more figures
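
The two training objectives named in the Figure 2 caption, a contrastive loss on features and an alignment (divergence) loss on logits, can be sketched as follows. The exact formulations are not given in the text above, so this is only an illustration under stated assumptions: an InfoNCE-style contrastive term with cosine similarity and a hypothetical `temperature` parameter, and a symmetric KL divergence between the softmax outputs of two branches. All function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(logits_a, logits_b):
    """Assumed form: symmetric KL between the predictions of two branches."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Assumed InfoNCE-style form: pull the anchor toward the positive
    feature and push it away from the negative features."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy usage: identical branch outputs give zero alignment loss, and a
# well-separated positive gives a smaller contrastive loss than a swapped one.
same = alignment_loss([2.0, -1.0], [2.0, -1.0])
diff = alignment_loss([2.0, -1.0], [-1.0, 2.0])
easy = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
hard = contrastive_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
```

One plausible pairing, again an assumption, is to treat features of the same image from different branches as positives and features with the opposite label as negatives.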