Redundant Semantic Environment Filling via Misleading-Learning for Fair Deepfake Detection

Xinan He, Yue Zhou, Shu Hu, Bin Li, Jiwu Huang, Feng Ding

TL;DR

This work tackles demographic fairness in deepfake detection by addressing dual overfitting to both forgery fingerprints and high-level demographic attributes. It introduces Misleading-Learning, a four-stage framework that fills the latent space with a redundant semantic environment: Redundant Sample Selection, a fixed Redundant Semantic Feature Extractor, Prior Forensics Knowledge acquired via a Forgery Feature Discriminator, and Misleading Semantic Augmentation using Astray-SRM and SCAM with a dedicated auxiliary path. The learning objective combines the Misleading Classification, Regularization Contrastive, and Final Classification losses as $L_{misleading} = L^{mis}_{cls} + \alpha L^{mis}_{con} + \beta L^{final}_{cls}$, so that the model retains forensic capability while reducing demographic bias. Experiments on FF++, CelebDF, DFD, and DFDC report improved fairness metrics such as $F_{MAG}$ and $F_{FPR}$, together with cross-domain generalization, consistent behavior across multiple backbones, and robustness to various preprocessing operations and disturbances, which supports the method's practical value for trustworthy AI in multimedia forensics.
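The weighted objective above can be illustrated with a minimal sketch. It only shows how the three terms are combined; the function name, the default weights `alpha = beta = 1.0`, and the numeric loss values are hypothetical and are not taken from the paper.

```python
def misleading_objective(l_mis_cls, l_mis_con, l_final_cls, alpha=1.0, beta=1.0):
    """Combine the three loss terms as L_mis_cls + alpha * L_mis_con + beta * L_final_cls.

    The default weights are placeholders, not values reported by the authors.
    """
    return l_mis_cls + alpha * l_mis_con + beta * l_final_cls


# Hypothetical per-batch loss values, for illustration only.
total = misleading_objective(0.70, 0.20, 0.50, alpha=0.5, beta=2.0)
```

In this form, $\alpha$ scales the contrastive regularizer and $\beta$ scales the final classification term relative to the misleading classification loss.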

Abstract

Detecting falsified faces generated by Deepfake technology is essential for safeguarding trust in digital communication and protecting individuals. However, current detectors often suffer from dual overfitting: they become overly specialized in both specific forgery fingerprints and particular demographic attributes. Critically, most existing methods overlook the latter issue, which results in poor fairness: faces from certain demographic groups, such as particular genders or ethnicities, are harder to detect reliably. To address this challenge, we propose a novel strategy called misleading-learning, which populates the latent space with a multitude of redundant environments. By exposing the detector to a sufficiently rich and balanced variety of high-level information, our approach mitigates demographic bias while maintaining high detection performance. We conduct extensive evaluations on fairness, intra-domain detection, cross-domain generalization, and robustness. Experimental results demonstrate that our framework achieves superior fairness and generalization compared to state-of-the-art approaches.

Paper Structure (24 sections, 1 theorem, 16 equations, 8 figures, 6 tables)

Key Result

Theorem 1

(locatello2019fairness) If $X$ is entangled with $I$ and $Y$, the use of a perfect classifier for $\hat{Y}$, i.e., $P(\hat{Y}|X) = P(Y|X)$, does not imply demographic parity, i.e., $P(\hat{Y}=y|I=i_1)=P(\hat{Y} = y|I = i_2)$, $\forall y, i_1, i_2$.
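A small numeric illustration may help make the theorem concrete. The toy population below is hypothetical: the true label $Y$ has a different base rate in each group $I$, so a classifier that reproduces $Y$ exactly still yields different positive rates per group.

```python
from fractions import Fraction

# Hypothetical toy population: (group i, true label y, number of samples).
population = [
    (0, 1, 20), (0, 0, 80),   # group 0: 20% of samples have y = 1
    (1, 1, 60), (1, 0, 40),   # group 1: 60% of samples have y = 1
]


def perfect_classifier(y):
    """A classifier whose prediction always equals the true label."""
    return y


def predicted_positive_rate(group):
    """P(Y_hat = 1 | I = group) under the perfect classifier."""
    total = sum(n for i, _, n in population if i == group)
    positives = sum(n for i, y, n in population
                    if i == group and perfect_classifier(y) == 1)
    return Fraction(positives, total)


rate_0 = predicted_positive_rate(0)
rate_1 = predicted_positive_rate(1)
parity_gap = abs(rate_0 - rate_1)
print(rate_0, rate_1, parity_gap)  # prints: 1/5 3/5 2/5
```

The classifier makes no errors, yet $P(\hat{Y}=1|I=0) \neq P(\hat{Y}=1|I=1)$, so demographic parity does not hold in this example.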

Figures (8)

  • Figure 1: Comparison of our method with existing fairness approaches. Disentanglement-based fairness methods aim to separate forgery fingerprints from demographic features through disentanglement learning; our approach, termed Misleading-Learning, enriches the redundant environment within the feature latent space.
  • Figure 2: The left of the figure illustrates dataset bias, which stems from the imbalanced demographic distribution of the FF++ dataset. The lower right depicts model bias, resulting from the network's inherent architecture and initialization parameters. The upper right then shows how the combination of these two types of bias leads to the emergence of demographic unfairness.
  • Figure 3: The first three stages of the Misleading-Learning methodology. (A) Redundant Sample Selection, where a redundant sample with differing demographic labels is chosen from the Shared Redundant Semantic Library based on the input fake image's group. (B) Establishing the Redundancy Sentinel, where the frozen Redundant Semantic Feature Extractor extracts high-level semantic features from redundant samples using zero-shot pretrained weights. (C) Prior Forensics Knowledge Acquisition, where the Forgery Feature Discriminator is independently trained as a standard binary classifier with the Binary Cross-Entropy loss to acquire basic forgery feature extraction capability.
  • Figure 4: The pipeline for the fourth stage of Misleading-Learning. It includes Astray-SRM for coarse-grained filtering of high-level semantic information, the construction of the Semantic Constraint and Augmentation Module, and the design of the Auxiliary Discriminator.
  • Figure 5: Detection performance of our method and three other fairness methods across demographic subgroups in four datasets.
  • ...and 3 more figures
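
The Redundant Sample Selection stage described in the Figure 3 caption can be sketched roughly as follows. This is only an illustration under stated assumptions: the library layout, the `group` tuples, the function name, and the uniform random choice among candidates are all hypothetical, and "differing demographic labels" is read here as "the group tuple is not identical to the input's".

```python
import random


def select_redundant_sample(library, input_group, rng):
    """Pick a library entry whose demographic group differs from input_group.

    Uniform random choice among candidates is an assumption of this sketch.
    """
    candidates = [sample for sample in library if sample["group"] != input_group]
    if not candidates:
        raise ValueError("library has no sample from a different group")
    return rng.choice(candidates)


# Hypothetical stand-in for the Shared Redundant Semantic Library.
library = [
    {"id": "r1", "group": ("male", "asian")},
    {"id": "r2", "group": ("female", "asian")},
    {"id": "r3", "group": ("male", "white")},
    {"id": "r4", "group": ("female", "black")},
]

rng = random.Random(0)  # seeded so repeated runs pick the same sample
picked = select_redundant_sample(library, ("male", "asian"), rng)
```

The selected sample would then be passed to the frozen Redundant Semantic Feature Extractor; that step is not shown here.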

Theorems & Definitions (1)

  • Theorem 1