Performance Decay in Deepfake Detection: The Limitations of Training on Outdated Data
Jack Richings, Margaux Leblanc, Ian Groves, Victoria Nockles
TL;DR
This work addresses the rapid obsolescence of deepfake detectors as generative techniques improve. It presents a two-stage CNN+RNN detector trained on the DeepSpeak dataset and evaluates cross-version generalization, including fine-tuning with limited new data. The findings show that AUROC remains high on current data but drops significantly on deepfakes generated six months later, with recall for deepfakes decreasing by over 30%, indicating strong concept drift. The study highlights that robust detection relies primarily on frame-level features, not temporal cues, and underscores the importance of rapid, diverse data collection and evaluation to sustain detector effectiveness in practice.
Abstract
The continually advancing quality of deepfake technology exacerbates the threats of disinformation, fraud, and harassment by making maliciously generated synthetic content increasingly difficult to distinguish from reality. We introduce a simple yet effective two-stage detection method that achieves an AUROC of over 99.8% on contemporary deepfakes. However, this high performance is short-lived. We show that models trained on this data suffer a recall drop of over 30% when evaluated on deepfakes created with generation techniques from just six months later, demonstrating significant decay as threats evolve. Our analysis reveals two key insights for robust detection. First, continued performance requires the ongoing curation of large, diverse datasets. Second, predictive power comes primarily from static, frame-level artifacts, not temporal inconsistencies. The future of effective deepfake detection therefore depends on rapid data collection and the development of advanced frame-level feature detectors.
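
To make the two headline metrics concrete, the following is a minimal sketch of how AUROC and deepfake recall might be compared between an older and a newer test set. It is not the evaluation code used in this work: the function names, the 0.5 decision threshold, and the toy scores are all illustrative assumptions, and the numbers are made up rather than taken from the reported results.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive (deepfake, label 1)
    scores higher than a random negative (real, label 0); ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Fraction of deepfakes (label 1) scored at or above the threshold.
    The 0.5 default is an assumption, not a value from the paper."""
    pos_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not pos_scores:
        raise ValueError("need at least one positive example")
    return sum(s >= threshold for s in pos_scores) / len(pos_scores)


# Toy detector scores (hypothetical): 1 = deepfake, 0 = real.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
old_scores = [0.95, 0.90, 0.80, 0.70, 0.30, 0.20, 0.10, 0.05]  # contemporary fakes
new_scores = [0.90, 0.60, 0.25, 0.40, 0.30, 0.20, 0.10, 0.05]  # newer fakes

for name, scores in [("old", old_scores), ("new", new_scores)]:
    print(name, "AUROC:", auroc(labels, scores), "recall:", recall(labels, scores))
```

AUROC is threshold-free, whereas recall depends on the chosen operating point, so the two metrics can degrade by different amounts when the score distribution of newer deepfakes shifts toward that of real videos.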
