Redundant Semantic Environment Filling via Misleading-Learning for Fair Deepfake Detection

Xinan He; Yue Zhou; Shu Hu; Bin Li; Jiwu Huang; Feng Ding

TL;DR

This work tackles demographic fairness in deepfake detection by addressing dual overfitting to forgery fingerprints and to high-level demographic attributes. It introduces Misleading-Learning, a four-stage framework that fills the latent space with a redundant semantic environment: Redundant Sample Selection, a frozen Redundant Semantic Feature Extractor, Prior Forensics Knowledge Acquisition via a Forgery Feature Discriminator, and Misleading Semantic Augmentation using Astray-SRM and the Semantic Constraint and Augmentation Module (SCAM) with a dedicated auxiliary path. The learning objective combines the Misleading Classification, Regularization Contrastive, and Final Classification losses as $L_{misleading} = L^{mis}_{cls} + \alpha L^{mis}_{con} + \beta L^{final}_{cls}$, so that the model retains forensic capability while reducing demographic bias. Experiments on FF++, CelebDF, DFD, and DFDC report improved fairness metrics such as $F_{MAG}$ and $F_{FPR}$, together with cross-domain generalization, compatibility with multiple backbones, and robustness to various preprocessing operations and disturbances, which supports the method's practical value for trustworthy multimedia forensics.
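The weighted objective above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `misleading_objective` and the numeric values are hypothetical, and the three loss terms are passed in as already-computed scalars because this summary does not specify how each one is calculated.

```python
def misleading_objective(l_mis_cls: float, l_mis_con: float,
                         l_final_cls: float, alpha: float, beta: float) -> float:
    """Return L_misleading = L^mis_cls + alpha * L^mis_con + beta * L^final_cls.

    The three inputs stand for the Misleading Classification,
    Regularization Contrastive, and Final Classification losses.
    alpha and beta are the weighting hyperparameters from the formula;
    their values are not given in this summary.
    """
    return l_mis_cls + alpha * l_mis_con + beta * l_final_cls


# Hypothetical loss values and weights, chosen only to show the arithmetic.
total = misleading_objective(0.5, 0.25, 0.75, alpha=2.0, beta=4.0)
```

With these made-up numbers the sum is 0.5 + 2.0 * 0.25 + 4.0 * 0.75. Larger values of $\alpha$ and $\beta$ shift weight toward the contrastive and final classification terms respectively.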

Abstract

Detecting falsified faces generated by Deepfake technology is essential for safeguarding trust in digital communication and protecting individuals. However, current detectors often suffer from dual overfitting: they become overly specialized in both specific forgery fingerprints and particular demographic attributes. Most existing methods overlook the latter issue, which results in poor fairness: faces from certain demographic groups, such as particular genders or ethnicities, are harder to detect reliably. To address this challenge, we propose a novel strategy called misleading-learning, which populates the latent space with a multitude of redundant environments. By exposing the detector to a sufficiently rich and balanced variety of high-level information, our approach mitigates demographic bias while maintaining high detection performance. We conduct extensive evaluations on fairness, intra-domain detection, cross-domain generalization, and robustness. Experimental results demonstrate that our framework achieves superior fairness and generalization compared to state-of-the-art approaches.

Paper Structure (24 sections, 1 theorem, 16 equations, 8 figures, 6 tables)

Introduction
Related Work
Deepfake Detection
Fairness in Deepfake Detection
Demographic Semantic Redundancy Analysis and Motivation
The Root Cause of Demographic Unfairness
Limitations of Existing Fairness Decoupling Methods
Design Philosophy of Misleading Learning
Misleading-Learning
Redundant Sample Selection
Establishing the Redundancy Sentinel
Prior Forensics Knowledge Acquisition
Misleading Semantic Augmentation
Experiments
Settings
...and 9 more sections

Key Result

Theorem 1

(locatello2019fairness) If $X$ is entangled with $I$ and $Y$, the use of a perfect classifier for $\hat{Y}$, i.e., $P(\hat{Y}|X) = P(Y|X)$, does not imply demographic parity, i.e., $P(\hat{Y}=y|I=i_1)=P(\hat{Y} = y|I = i_2)$, $\forall y, i_1, i_2$.
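A small numeric illustration of Theorem 1 may help. The sketch below assumes a toy population in which the true label $Y$ has different base rates in two groups $i_1$ and $i_2$; the counts and the names `population`, `perfect_classifier`, and `positive_rate` are hypothetical and not taken from the paper. Even a classifier that reproduces $Y$ exactly yields unequal positive rates across the groups, so demographic parity fails.

```python
from fractions import Fraction

# Hypothetical toy population of (group, true label) pairs.
# Group i1 has 5 positives out of 10; group i2 has 2 positives out of 10.
population = ([("i1", 1)] * 5 + [("i1", 0)] * 5
              + [("i2", 1)] * 2 + [("i2", 0)] * 8)


def perfect_classifier(y_true: int) -> int:
    """Stand-in for a perfect classifier: the prediction equals the true label."""
    return y_true


def positive_rate(records, group: str) -> Fraction:
    """Estimate P(Y_hat = 1 | I = group) from the records."""
    preds = [perfect_classifier(y) for g, y in records if g == group]
    return Fraction(sum(preds), len(preds))


rate_i1 = positive_rate(population, "i1")
rate_i2 = positive_rate(population, "i2")
parity_gap = abs(rate_i1 - rate_i2)  # nonzero means demographic parity is violated
```

Here the classifier makes no errors, yet the two conditional positive rates differ because the label itself is statistically tied to the group.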

Figures (8)

Figure 1: Comparison of our method with existing fairness approaches. Disentanglement-based fairness methods aim to separate forgery fingerprints from demographic features through disentanglement learning, whereas our approach, termed Misleading-Learning, enriches the redundant environment within the feature latent space.
Figure 2: The left of the figure illustrates dataset bias, which stems from the imbalanced demographic distribution of the FF++ dataset. The lower right depicts model bias, resulting from the network's inherent architecture and initialization parameters. The upper right then shows how the combination of these two types of bias leads to the emergence of demographic unfairness.
Figure 3: The first three stages of the Misleading-Learning methodology. (A) Redundant Sample Selection, where a redundant sample with differing demographic labels is chosen from the Shared Redundant Semantic Library based on the input fake image's group. (B) Establishing the Redundancy Sentinel, where the frozen Redundant Semantic Feature Extractor extracts high-level semantic features from redundant samples using zero-shot pretrained weights. (C) Prior Forensics Knowledge Acquisition, where the Forgery Feature Discriminator is independently trained as a standard binary classifier with the Binary Cross-Entropy loss to acquire basic forgery feature extraction capability.
Figure 4: The pipeline of the fourth stage of Misleading-Learning. It includes Astray-SRM for coarse-grained filtering of high-level semantic information, the construction of the Semantic Constraint and Augmentation Module, and the design of the Auxiliary Discriminator.
Figure 5: Detection performance of our method and three other fairness methods across different demographic subgroups within four datasets.
...and 3 more figures
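
The Redundant Sample Selection step described in the Figure 3 caption (stage A) can be sketched roughly as below. This is an illustration under stated assumptions, not the paper's procedure: the library layout, the group names, the function `select_redundant_sample`, and the uniform random choice among eligible samples are all hypothetical. The caption only states that the selected sample has demographic labels differing from the input fake image's group.

```python
import random


def select_redundant_sample(library, input_group: str, rng: random.Random):
    """Pick one library entry whose demographic group differs from input_group.

    Assumption: selection is uniform at random among eligible entries;
    the actual selection rule is not specified in this summary.
    """
    candidates = [item for item in library if item["group"] != input_group]
    if not candidates:
        raise ValueError("library has no sample from a different group")
    return rng.choice(candidates)


# Hypothetical stand-in for the Shared Redundant Semantic Library.
library = [
    {"id": "s1", "group": "group_a"},
    {"id": "s2", "group": "group_b"},
    {"id": "s3", "group": "group_c"},
]

rng = random.Random(0)  # seeded only to make the sketch repeatable
picked = select_redundant_sample(library, "group_a", rng)
```

Filtering before sampling guarantees that the chosen redundant sample never shares the input's demographic group, which is the only property the caption requires.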
