Fairness-Aware Deepfake Detection: Leveraging Dual-Mechanism Optimization
Feng Ding, Wenhui Yi, Yunpeng Zhou, Xinan He, Hong Rao, Shu Hu
TL;DR
This work tackles fairness in deepfake detection by addressing demographic bias across gender and race without sacrificing accuracy. It introduces a dual-mechanism framework combining Structural Fairness Decoupling, which removes demographic leakage by decoupling channels correlated with sensitive attributes, and Global Distribution Alignment, which uses optimal transport with a mutual information constraint to align subgroup distributions with global ones. Experiments on FF++ and cross-domain datasets show improved inter-group and intra-group fairness while maintaining or enhancing detection performance, with ablations confirming the complementary value of the two modules. Visualizations and robustness tests further support that the approach focuses on facial cues and generalizes across backbones, indicating practical potential for fairer deepfake detection systems.
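To make the decoupling idea concrete, here is a minimal sketch of one way to find and mask channels that correlate with a sensitive attribute. It is not the authors' Structural Fairness Decoupling module. The function name `sensitive_channel_mask`, the one-vs-rest Pearson correlation score, and the `drop_ratio` parameter are all assumptions introduced for illustration only.

```python
import numpy as np

def sensitive_channel_mask(features, group_labels, drop_ratio=0.25):
    """Hypothetical helper: flag channels most correlated with group membership.

    features:     (n_samples, n_channels) pooled activations.
    group_labels: (n_samples,) integer demographic group ids.
    Returns (keep, scores): a boolean keep-mask and a per-channel score.
    """
    features = np.asarray(features, dtype=float)
    group_labels = np.asarray(group_labels)
    centered = features - features.mean(axis=0)
    feat_std = centered.std(axis=0) + 1e-12

    scores = np.zeros(features.shape[1])
    for g in np.unique(group_labels):
        # One-vs-rest indicator for group g, centered.
        indicator = (group_labels == g).astype(float)
        ind_centered = indicator - indicator.mean()
        ind_std = ind_centered.std() + 1e-12
        # Absolute Pearson correlation between each channel and the indicator.
        corr = (centered * ind_centered[:, None]).mean(axis=0) / (feat_std * ind_std)
        scores = np.maximum(scores, np.abs(corr))

    # Assumption: drop a fixed fraction of the highest-scoring channels.
    n_drop = int(round(drop_ratio * features.shape[1]))
    keep = np.ones(features.shape[1], dtype=bool)
    if n_drop > 0:
        keep[np.argsort(scores)[-n_drop:]] = False
    return keep, scores
```

Multiplying the activations by `keep` would zero out the flagged channels. A fixed drop fraction is the simplest choice here; the selection rule used in the paper may differ.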
Abstract
Fairness is a core requirement for the trustworthy deployment of deepfake detection models, especially in digital identity security. Detection models that are biased toward particular demographic groups, such as those defined by gender or race, may produce systematic misjudgments that exacerbate the digital divide and social inequities. However, current fairness-enhanced detectors often improve fairness at the cost of detection accuracy. To address this challenge, we propose a dual-mechanism collaborative optimization framework. The method integrates structural fairness decoupling with global distribution alignment: it first decouples channels that are sensitive to demographic groups at the architectural level, and then reduces the distance between the overall sample distribution and the distribution of each demographic group at the feature level. Experimental results demonstrate that, compared with other methods, our framework improves both inter-group and intra-group fairness while maintaining overall detection accuracy across domains.
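The feature-level alignment step can be illustrated with an entropy-regularized optimal transport (Sinkhorn) cost between each demographic group's features and the full feature set. This is a hedged sketch, not the paper's loss. The names `sinkhorn_cost` and `alignment_loss`, the squared Euclidean ground cost, uniform sample weights, and the `epsilon` scaling are assumptions, and the mutual information constraint mentioned in the TL;DR is omitted.

```python
import numpy as np

def sinkhorn_cost(x, y, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT cost between two point clouds (uniform weights).

    x: (n, d) array, y: (m, d) array. Returns (transport_cost, plan).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared Euclidean ground cost (an assumption for this sketch).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    # Regularization is expressed relative to the largest cost for stability.
    kernel = np.exp(-cost / (epsilon * (cost.max() + 1e-12)))
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan

def alignment_loss(features, group_labels):
    """Sum of Sinkhorn costs from each group's features to all features."""
    features = np.asarray(features, dtype=float)
    group_labels = np.asarray(group_labels)
    total = 0.0
    for g in np.unique(group_labels):
        group_cost, _ = sinkhorn_cost(features[group_labels == g], features)
        total += group_cost
    return total
```

Under these assumptions, `alignment_loss` grows when one group's features drift away from the pooled distribution and shrinks as the groups overlap. In training, a differentiable version of this quantity would be minimized alongside the detection loss.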
