Redundant Semantic Environment Filling via Misleading-Learning for Fair Deepfake Detection
Xinan He, Yue Zhou, Shu Hu, Bin Li, Jiwu Huang, Feng Ding
TL;DR
This work tackles demographic fairness in deepfake detection by addressing dual-overfitting to both forgery fingerprints and high-level demographic attributes. It introduces Misleading-Learning, a four-stage framework that fills the latent space with a redundant semantic environment through Redundant Sample Selection, a fixed Redundant Semantic Feature Extractor, Prior Forensics Knowledge via a Forgery Feature Discriminator, and Misleading Semantic Augmentation using Astray-SRM and SCAM with a dedicated auxiliary path. The learning objective combines the Misleading Classification, Regularization Contrastive, and Final Classification losses as $L_{misleading} = L^{mis}_{cls} + \alpha L^{mis}_{con} + \beta L^{final}_{cls}$, where $\alpha$ and $\beta$ weight the contrastive and final classification terms, so that the model retains forensic capability while reducing demographic bias. Comprehensive experiments on FF++, CelebDF, DFD, and DFDC show improved fairness metrics such as $F_{MAG}$ and $F_{FPR}$, along with robust cross-domain generalization, applicability to multiple backbones, and resilience to various preprocessing operations and disturbances, highlighting the method's practical relevance for trustworthy AI in multimedia forensics.
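The weighted objective above can be illustrated with a minimal sketch. It assumes the three loss terms have already been computed as scalars; the function name `misleading_loss`, the default values of `alpha` and `beta`, and the numbers in the usage line are hypothetical and are not taken from the paper.

```python
def misleading_loss(l_mis_cls, l_mis_con, l_final_cls, alpha=1.0, beta=1.0):
    """Combine the three loss terms as L_mis_cls + alpha * L_mis_con + beta * L_final_cls.

    alpha and beta are weighting hyperparameters; the defaults here are
    placeholders (assumption), not values reported by the authors.
    """
    return l_mis_cls + alpha * l_mis_con + beta * l_final_cls


# Illustrative values only: 0.5 + 2.0 * 0.25 + 0.5 * 1.0
total = misleading_loss(0.5, 0.25, 1.0, alpha=2.0, beta=0.5)
print(total)  # 1.5
```

In a training loop the same expression would typically be applied to differentiable tensors rather than Python floats, with $\alpha$ and $\beta$ tuned as hyperparameters.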
Abstract
Detecting falsified faces generated by Deepfake technology is essential for safeguarding trust in digital communication and protecting individuals. However, current detectors often suffer from dual-overfitting: they become overly specialized in both specific forgery fingerprints and particular demographic attributes. Critically, most existing methods overlook the latter issue, which results in poor fairness: faces from certain demographic groups, for example those defined by gender or ethnicity, are harder to detect reliably. To address this challenge, we propose a novel strategy called misleading-learning, which populates the latent space with a multitude of redundant environments. By exposing the detector to a sufficiently rich and balanced variety of high-level information, our approach mitigates demographic bias while maintaining high detection performance. We conduct extensive evaluations of fairness, intra-domain detection, cross-domain generalization, and robustness. Experimental results demonstrate that our framework achieves superior fairness and generalization compared to state-of-the-art approaches.
