Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection
Yuzhen Lin, Wentang Song, Bin Li, Yuezun Li, Jiangqun Ni, Han Chen, Qiushi Li
TL;DR
This work tackles the challenge of deepfake detectors failing to generalize to unseen datasets and manipulation methods. It introduces Curricular Dynamic Forgery Augmentation (CDFA), a joint detector-policy framework that progressively shifts training from original to pseudo-fake samples using a Monotonic Curriculum and adapts augmentation choices through a Dynamic Forgery Search, complemented by the Self-shifted Blending Image (SSBI) to simulate temporal artifacts. Through a bi-level optimization, CDFA jointly trains the detector and a lightweight policy network to maximize generalization, with three forgery-augmentation operations (including SSBI) guiding pseudo-fake generation. Experiments on FF++ and external datasets show substantial improvements in cross-dataset and cross-manipulation scenarios, with results that surpass several state-of-the-art methods and remain robust across backbones. The approach offers a practical, plug-and-play enhancement for deepfake detection systems, and it provides insight into how curriculum and augmentation policy dynamics influence model generalization.
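As a rough illustration of the Monotonic Curriculum idea, the sketch below shows a non-decreasing schedule for the share of pseudo-fake samples per epoch. The linear form, the function name `pseudo_fake_ratio`, and the `max_ratio` parameter are assumptions made for illustration only; the summary above does not specify the schedule the authors actually use.

```python
def pseudo_fake_ratio(epoch: int, total_epochs: int, max_ratio: float = 1.0) -> float:
    """Hypothetical linear schedule (an assumption, not the paper's formula).

    Returns the fraction of training samples that would be turned into
    pseudo-fakes at a given epoch. The value never decreases as the
    epoch index grows, which is the only property taken from the text.
    """
    if total_epochs <= 1:
        return max_ratio
    # Clamp the epoch into [0, total_epochs - 1] so the ratio stays in range.
    clamped = min(max(epoch, 0), total_epochs - 1)
    progress = clamped / (total_epochs - 1)
    return max_ratio * progress


# Usage: print the schedule for a short 5-epoch run.
if __name__ == "__main__":
    for e in range(5):
        print(e, pseudo_fake_ratio(e, total_epochs=5))
```

Under this assumed schedule, early epochs see mostly original samples and later epochs see mostly pseudo-fakes; any other non-decreasing function would fit the description equally well.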
Abstract
Previous studies in deepfake detection have shown promising results when the test face forgeries come from the same dataset as the training data. However, the problem remains challenging when one tries to generalize the detector to forgeries from unseen datasets and created by unseen methods. In this work, we present a novel general deepfake detection method, called Curricular Dynamic Forgery Augmentation (CDFA), which jointly trains a deepfake detector with a forgery augmentation policy network. Unlike previous works, we propose to progressively apply forgery augmentations following a monotonic curriculum during training. We further propose a dynamic forgery searching strategy that selects one suitable forgery augmentation operation for each image, varying between training stages, producing a forgery augmentation policy optimized for better generalization. In addition, we propose a novel forgery augmentation named self-shifted blending image that imitates the temporal inconsistency of deepfake generation in a simple way. Comprehensive experiments show that CDFA can significantly improve both the cross-dataset and cross-manipulation performance of various naive deepfake detectors in a plug-and-play way, and make them attain superior performance over existing methods on several benchmark datasets.
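To make the self-shifted blending idea more concrete, here is a minimal sketch that blends an image with a spatially shifted copy of itself inside a mask region. The abstract only states that the augmentation imitates temporal inconsistency; the circular shift via `np.roll`, the pixel offsets `dx` and `dy`, the mask format, and the function name `self_shifted_blend` are all assumptions for illustration and may differ from the authors' implementation.

```python
import numpy as np


def self_shifted_blend(image: np.ndarray, mask: np.ndarray, dx: int = 2, dy: int = 2) -> np.ndarray:
    """Blend an H x W x C uint8 image with a shifted copy of itself.

    mask is an H x W array in [0, 1]; 1 means "take the shifted copy",
    0 means "keep the original pixel". This is an illustrative sketch,
    not the paper's exact procedure.
    """
    img = image.astype(np.float32)
    # Circularly shift rows by dy and columns by dx (an assumed shift model).
    shifted = np.roll(img, shift=(dy, dx), axis=(0, 1))
    # Add a channel axis so the mask broadcasts over the color channels.
    m = mask.astype(np.float32)[..., None]
    blended = m * shifted + (1.0 - m) * img
    return np.clip(blended, 0, 255).astype(np.uint8)


# Usage: blend only a central square region of a random image.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    region = np.zeros((64, 64), dtype=np.float32)
    region[16:48, 16:48] = 1.0
    pseudo_fake = self_shifted_blend(face, region, dx=3, dy=1)
    print(pseudo_fake.shape, pseudo_fake.dtype)
```

In practice the mask would likely follow the face region and have soft edges; the hard square here just keeps the example short.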
