Penny-Wise and Pound-Foolish in Deepfake Detection

Yabin Wang, Zhiwu Huang, Su Zhou, Adam Prugel-Bennett, Xiaopeng Hong

TL;DR

This work addresses the limited generalization of deepfake detectors that fine-tune pre-trained models on a single dataset. It introduces PoundNet, a CLIP-based prompt-tuning framework with learnable real/fake prompts and a balanced objective comprising $\mathcal{L}_{bce}$, $\mathcal{L}_{spm}$, and $\mathcal{L}_{cab}$ to preserve upstream knowledge while improving downstream detection. The method demonstrates a ~19% improvement in deepfake detection across 10 benchmarks and maintains ~63% accuracy on zero-shot object classification, validating the proposed balance between generalization and knowledge retention. These results are achieved with efficient prompt-tuning on a large vision-language model, and the authors provide open-source code and data to facilitate reproducibility and broader application beyond deepfake detection.

Abstract

The diffusion of deepfake technologies has sparked serious concerns about their potential misuse across various domains, prompting an urgent need for robust detection methods. Despite recent advances, many current approaches prioritize short-term gains at the expense of long-term effectiveness. This paper critiques the overly specialized approach of fine-tuning pre-trained models solely with a penny-wise objective on a single deepfake dataset, while disregarding the pound-wise balance between generalization and knowledge retention. To address this "Penny-Wise and Pound-Foolish" issue, we propose a novel learning framework (PoundNet) for generalizable deepfake detection built on a pre-trained vision-language model. PoundNet incorporates a learnable prompt design and a balanced objective to preserve broad knowledge from upstream tasks (object classification) while enhancing generalization for downstream tasks (deepfake detection). We train PoundNet on a standard single deepfake dataset, following common practice in the literature. We then evaluate its performance across 10 public large-scale deepfake datasets with 5 main evaluation metrics, forming, to our knowledge, the largest benchmark test set for assessing the generalization ability of deepfake detection models. The comprehensive benchmark evaluation demonstrates that the proposed PoundNet is significantly less "Penny-Wise and Pound-Foolish", achieving a remarkable improvement of 19% in deepfake detection performance compared to state-of-the-art methods, while maintaining a strong performance of 63% on object classification tasks, where other deepfake detection models tend to be ineffective. Code and data are open-sourced at https://github.com/iamwangyabin/PoundNet.

Paper Structure (10 sections, 8 equations, 15 figures, 31 tables)

Figures (15)

  • Figure 1: "Penny" and "Pound" in previous methods and the proposed approach (PoundNet) for deepfake detection and beyond. PoundNet improves deepfake detection (downstream) by 19% across 10 datasets and maintains a strong 63% performance in zero-shot object classification (upstream) across 5 datasets where many current detectors struggle.
  • Figure 2: Proposed framework (PoundNet) on a pre-trained vision-language model with a learnable prompt pair (Top Left) and a balanced objective (Bottom Left) with three basic loss components: (a) class-agnostic binary, (b) semantic-preserving, and (c) class-aware binary terms.
  • Figure 3: Feature spaces for in-domain deepfakes (produced by ProGAN) and three unseen deepfakes (generated by LDM, DALL-E2, and Deepfake). Reals: [CLASS]/LSUN, [CLASS]/CelebA.
  • Figure 4: Precision-Recall Curves of deepfake detection methods on the DIF dataset. The numbers on each curve represent the decision thresholds that define the boundary between positive and negative predictions. The numbers typically fall within a narrow range.
  • Figure 5: F1 Curves with different thresholds of logits of deepfake detection methods on various deepfakes on the DIF dataset. The horizontal axis denotes the decision thresholds that determine the boundary between positive and negative predictions.
  • ...and 10 more figures
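
The threshold sweep behind an F1 curve such as the one in Figure 5 can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names, the toy scores, and the convention that a sample is predicted fake when its logit is at or above the threshold are all assumptions made here.

```python
from typing import List, Sequence, Tuple


def f1_at_threshold(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 of the positive (fake = 1) class when predicting fake iff score >= threshold.

    Assumption: higher scores mean "more likely fake"; labels are 0 (real) or 1 (fake).
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def f1_curve(
    scores: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (threshold, F1) pairs, i.e. the points of an F1-vs-threshold curve."""
    return [(t, f1_at_threshold(scores, labels, t)) for t in thresholds]


# Hypothetical toy logits and labels, for illustration only.
toy_scores = [-2.0, -0.5, 0.3, 1.2, 2.5, -1.0]
toy_labels = [0, 0, 1, 1, 1, 1]
curve = f1_curve(toy_scores, toy_labels, [-1.5, 0.0, 1.0])

for threshold, f1 in curve:
    print(f"threshold={threshold:+.1f}  F1={f1:.3f}")
```

A detector whose F1 stays high over a wide band of thresholds is less sensitive to calibration on unseen generators, which is the property a curve like this is meant to expose.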