A Quality-Centric Framework for Generic Deepfake Detection

Wentang Song, Zhiyuan Yan, Yuzhen Lin, Taiping Yao, Changsheng Chen, Shen Chen, Yandan Zhao, Shouhong Ding, Bin Li

TL;DR

The paper addresses the generalization gap in deepfake detection under unseen manipulation methods by introducing a quality-centric framework that integrates Forgery Quality Score (FQS), Frequency Data Augmentation (FreDA), and a curriculum-inspired learning pacing. FQS blends static (ArcFace-based cosine similarity) and dynamic (loss-based) hardness into $FQS_t^i = d_t^i + \alpha_f q^i$, guiding sample selection, while FreDA produces more realistic low-quality fakes via frequency-domain fusion $F_a = F_r \otimes M + F_f \otimes (1-M)$ and $x_a = \text{iFFT}(F_a)$. A pacing function organizes training by gradually exposing harder samples through $\mathcal{X}'_t = H_t \cup FreDA(E_t)$, enabling progressive learning. Across cross-dataset and cross-manipulation benchmarks, the method yields substantial gains (roughly 4–10% in AUC) and consistently boosts generalization when plugged into existing detectors, demonstrating practical impact for robust deepfake detection.
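The frequency-domain fusion above can be illustrated with a short NumPy sketch. It is only a minimal illustration of the formula $F_a = F_r \otimes M + F_f \otimes (1-M)$ followed by an inverse FFT, not the authors' implementation. The summary does not say how the mask $M$ is built, so the helper `low_frequency_mask`, its `cutoff` parameter, the function name `freda_fuse`, and the use of single-channel images are all assumptions made for illustration.

```python
import numpy as np


def low_frequency_mask(shape, cutoff):
    """Hypothetical binary mask M: 1 for spatial frequencies at or below `cutoff`.

    The source does not specify M; this radial low-pass shape is an assumption.
    The mask is laid out to match the unshifted output of np.fft.fft2.
    """
    fy = np.fft.fftfreq(shape[0])[:, None]  # vertical frequencies (cycles/pixel)
    fx = np.fft.fftfreq(shape[1])[None, :]  # horizontal frequencies (cycles/pixel)
    return (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(float)


def freda_fuse(x_real, x_fake, mask):
    """Compute x_a = iFFT(F_r * M + F_f * (1 - M)) for same-shaped 2-D images."""
    F_r = np.fft.fft2(x_real)  # spectrum of the real image
    F_f = np.fft.fft2(x_fake)  # spectrum of the fake image
    F_a = F_r * mask + F_f * (1.0 - mask)  # element-wise fusion
    # With a symmetric mask the result is real up to rounding error,
    # so the tiny imaginary part is discarded.
    return np.real(np.fft.ifft2(F_a))
```

Under these assumptions, `freda_fuse(real_img, fake_img, low_frequency_mask(real_img.shape, 0.1))` would take the components selected by the mask from the real image and everything else from the fake one. A mask of all ones returns the real image and a mask of all zeros returns the fake image, which is a quick sanity check on the formula.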

Abstract

Detecting AI-generated images, particularly deepfakes, has become increasingly crucial, with the primary challenge being the generalization to previously unseen manipulation methods. This paper tackles this issue by leveraging the forgery quality of training data to improve the generalization performance of existing deepfake detectors. Generally, the forgery quality of different deepfakes varies: some have easily recognizable forgery clues, while others are highly realistic. Existing works often train detectors on a mix of deepfakes with varying forgery qualities, potentially leading detectors to short-cut the easy-to-spot artifacts from low-quality forgery samples, thereby hurting generalization performance. To tackle this issue, we propose a novel quality-centric framework for generic deepfake detection, which is composed of a Quality Evaluator, a low-quality data enhancement module, and a learning pacing strategy that explicitly incorporates forgery quality into the training process. Our framework is inspired by curriculum learning, which is designed to gradually enable the detector to learn more challenging deepfake samples, starting with easier samples and progressing to more realistic ones. We employ both static and dynamic assessments to assess the forgery quality, combining their scores to produce a final rating for each training sample. The rating score guides the selection of deepfake samples for training, with higher-rated samples having a higher probability of being chosen. Furthermore, we propose a novel frequency data augmentation method specifically designed for low-quality forgery samples, which helps to reduce obvious forgery traces and improve their overall realism. Extensive experiments demonstrate that our proposed framework can be applied plug-and-play to existing detection models and significantly enhance their generalization performance in detection.
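The scoring and selection steps described above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not the paper's procedure: it combines a dynamic score $d_t^i$ and a static score $q^i$ as $FQS_t^i = d_t^i + \alpha_f q^i$, draws samples with probability proportional to that score, and forms a training pool of the form $H_t \cup FreDA(E_t)$. The function names, the `hard_fraction` split rule, the assumption that scores are non-negative, and the `augment` callback standing in for FreDA are all hypothetical.

```python
import random


def forgery_quality_score(dynamic, static, alpha_f):
    """FQS_t^i = d_t^i + alpha_f * q^i, computed per sample."""
    return [d + alpha_f * q for d, q in zip(dynamic, static)]


def sample_by_quality(samples, fqs, k, seed=0):
    """Draw k samples with probability proportional to their score.

    Assumes every score is non-negative and at least one is positive.
    """
    rng = random.Random(seed)
    return rng.choices(samples, weights=fqs, k=k)


def build_training_pool(samples, fqs, hard_fraction, augment):
    """Form H_t union augment(E_t).

    Assumption: the top `hard_fraction` of samples by score form H_t and the
    rest form E_t; the source does not state how the two sets are chosen.
    """
    order = sorted(range(len(samples)), key=lambda i: fqs[i], reverse=True)
    n_hard = int(hard_fraction * len(samples))
    hard = [samples[i] for i in order[:n_hard]]            # kept unchanged
    easy = [augment(samples[i]) for i in order[n_hard:]]   # passed through the augmentation
    return hard + easy
```

In a curriculum-style loop, `hard_fraction` could be increased over epochs by a pacing function so that more high-scoring samples enter training unchanged as it progresses. That schedule is likewise an assumption here rather than something the summary specifies.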

Paper Structure

This paper contains 17 sections, 4 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Impact of forgery quality on deepfake detection. (a) Qualitative assessment of forgery quality from two perspectives: swapping pairs (static) and model feedback (dynamic); (b) Quantitative analysis of how different forgery quality levels affect model generalization performance.
  • Figure 2: Overall pipeline of the proposed method. To improve the existing data, we classify the samples through the Quality Evaluator module. For low-quality data, we use the FreDA module, as shown in Figure 3, to improve the quality of those samples.
  • Figure 3: For low-quality data, we employ the FreDA module to augment their forgery quality by reducing the easily recognizable artifacts and enhancing their realism.
  • Figure 4: Performance comparison of cross-manipulation with other SOTA detectors under the latest DF40 dataset.
  • Figure 5: Ablation studies on the proposed FreDA and Pacing Function. (a) Exploration of FreDA's extensibility; (b) The impact of different training strategies under the cross-dataset setting.