Analyzing Fairness in Deepfake Detection With Massively Annotated Databases
Ying Xu, Philipp Terhörst, Kiran Raja, Marius Pedersen
TL;DR
This work tackles the fairness of Deepfake detectors by highlighting how biased training data propagate to detection outcomes. It introduces a large-scale annotation pipeline that transfers 47 attributes to five popular Deepfake datasets, yielding about $65.3\text{M}$ labels, and it analyzes bias across three backbone models on four datasets using 31 attributes. The authors propose a corrected relative performance metric $CRP(a)$ to separate testing-data imbalance from attribute-driven bias, enabling robust bias quantification. Their findings reveal strong attribute-driven biases and limited diversity in the datasets, underscoring the need for balanced data and bias-aware detectors to improve generalizability and security across real-world populations. Public release of the annotated datasets and code provides a valuable resource for future bias mitigation and fair Deepfake-detection research.
Abstract
In recent years, image and video manipulation with Deepfakes has become a severe concern for security and society. Many detection models and datasets have been proposed to detect Deepfake data reliably. However, there is increasing concern that these models and training databases might be biased and thus cause Deepfake detectors to fail. In this work, we investigate factors causing biased detection in public Deepfake datasets by (a) creating large-scale demographic and non-demographic attribute annotations with 47 different attributes for five popular Deepfake datasets and (b) comprehensively analysing the attributes that result in AI bias in three state-of-the-art Deepfake detection backbone models on these datasets. The analysis shows how a large variety of distinctive attributes (from over 65M labels) influence detection performance, including demographic (age, gender, ethnicity) and non-demographic (hair, skin, accessories, etc.) attributes. The results show that the examined datasets have limited diversity and, more importantly, that the utilised Deepfake detection backbone models are strongly affected by the investigated attributes, making them unfair across attributes. Deepfake detection backbone methods trained on such imbalanced or biased datasets produce incorrect detection results, leading to generalisability, fairness, and security issues. Our findings and annotated datasets will guide future research to evaluate and mitigate bias in Deepfake detection techniques. The annotated datasets and the corresponding code are publicly available.
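To make the idea of attribute-wise bias analysis concrete, the following is a minimal Python sketch of how one might compare a detector's error rate on samples carrying an attribute against samples without it, alongside a size-matched random control group that hints at how much of a gap could come from sample size alone. It is an illustration only and is not the definition of $CRP(a)$, which is not given in this section. The record layout (`label`, `pred`, `attrs`), the function names, and the toy data are all hypothetical.

```python
import random


def error_rate(records):
    """Fraction of records whose predicted label differs from the true label."""
    if not records:
        raise ValueError("cannot compute an error rate on an empty group")
    return sum(r["pred"] != r["label"] for r in records) / len(records)


def attribute_gap(records, attribute, seed=0):
    """Error rates for samples with/without `attribute`, plus a random control.

    Assumption: each record is a dict with keys "label", "pred" and "attrs"
    (a set of attribute names). This layout is hypothetical.
    """
    with_attr = [r for r in records if attribute in r["attrs"]]
    without_attr = [r for r in records if attribute not in r["attrs"]]
    # Control group: a random subset with the same size as the attribute group,
    # drawn from all records regardless of attribute.
    rng = random.Random(seed)
    control = rng.sample(records, len(with_attr))
    return {
        "err_with": error_rate(with_attr),
        "err_without": error_rate(without_attr),
        "err_control": error_rate(control),
    }


# Toy data (fabricated purely for illustration): label 1 = fake, 0 = real.
records = [
    {"label": 1, "pred": 1, "attrs": {"eyeglasses"}},
    {"label": 1, "pred": 0, "attrs": {"eyeglasses"}},
    {"label": 0, "pred": 0, "attrs": set()},
    {"label": 0, "pred": 0, "attrs": set()},
    {"label": 1, "pred": 1, "attrs": set()},
    {"label": 0, "pred": 1, "attrs": set()},
]

result = attribute_gap(records, "eyeglasses")
print(result["err_with"], result["err_without"])  # 0.5 0.25
```

In this toy case the detector errs twice as often on the `eyeglasses` group as on the rest. With only two samples in that group, the control error rate would vary widely across seeds, which is why a gap of this kind should be read against a control before being attributed to the attribute itself.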
