Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture

Chenqi Kong, Anwei Luo, Peijun Bao, Haoliang Li, Renjie Wan, Zengwei Zheng, Anderson Rocha, Alex C. Kot

TL;DR

A parameter-efficient ViT-based detection model that includes lightweight forgery feature extraction modules and enables the model to extract global and local forgery clues simultaneously, representing an important step toward open-set Deepfake detection in the wild.

Abstract

Face forgery poses significant security threats, and detecting it in open-set scenarios presents substantial challenges for existing detection models. These detectors have two primary limitations: they cannot generalize across unknown forgery domains, and they adapt inefficiently to new data. To address these issues, we introduce an approach to face forgery detection that is both general and parameter-efficient. It builds on the assumption that different forgery source domains exhibit distinct style statistics. Accordingly, we design a forgery-style mixture formulation that augments the diversity of forgery source domains, enhancing the model's generalizability across unseen domains. Previous methods typically require fully fine-tuning pre-trained networks, which consumes substantial time and computational resources. Instead, drawing on recent advancements in vision transformers (ViT) for face forgery detection, we develop a parameter-efficient ViT-based detection model that includes lightweight forgery feature extraction modules and enables the model to extract global and local forgery clues simultaneously. During training, we optimize only the inserted lightweight modules, keeping the original ViT structure and its pre-trained ImageNet weights unchanged. This training strategy preserves the informative pre-trained knowledge while flexibly adapting the model to the task of Deepfake detection. Extensive experimental results demonstrate that the designed model achieves state-of-the-art generalizability with significantly fewer trainable parameters, representing an important step toward open-set Deepfake detection in the wild.
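The abstract does not state the forgery-style mixture formula, so the following is only a minimal sketch of the general idea under a loud assumption: that "style statistics" means the per-channel mean and standard deviation of token features, and that mixing is a convex interpolation of those statistics between two fake samples from different forgery source domains (in the spirit of MixStyle-type augmentations). The names `channel_stats`, `mix_forgery_style`, and `lam` are hypothetical and do not come from the paper.

```python
import math


def channel_stats(tokens, eps=1e-6):
    """Per-channel mean and std over the token axis.

    `tokens` is a list of rows (one row per token), each row a list of
    channel values. `eps` keeps the std strictly positive.
    """
    n = len(tokens)
    dims = len(tokens[0])
    means = [sum(row[c] for row in tokens) / n for c in range(dims)]
    stds = [
        math.sqrt(sum((row[c] - means[c]) ** 2 for row in tokens) / n + eps)
        for c in range(dims)
    ]
    return means, stds


def mix_forgery_style(tokens_a, tokens_b, lam):
    """Re-style sample A with a mix of the statistics of A and B.

    Assumption (not from the paper): A and B are fake samples from two
    different forgery source domains, and `lam` in [0, 1] is the weight
    given to A's own statistics.
    """
    mu_a, sd_a = channel_stats(tokens_a)
    mu_b, sd_b = channel_stats(tokens_b)
    # Convex combination of the two domains' style statistics.
    mu_mix = [lam * a + (1 - lam) * b for a, b in zip(mu_a, mu_b)]
    sd_mix = [lam * a + (1 - lam) * b for a, b in zip(sd_a, sd_b)]
    # Normalize A with its own statistics, then apply the mixed style.
    return [
        [sd_mix[c] * (row[c] - mu_a[c]) / sd_a[c] + mu_mix[c] for c in range(len(row))]
        for row in tokens_a
    ]
```

With `lam = 1.0` the sample is returned essentially unchanged, and with `lam = 0.0` it takes on the other domain's statistics. In a training loop, `lam` might be sampled per pair, for example with `random.betavariate`; that sampling choice is likewise an assumption rather than something the abstract specifies.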

Paper Structure (31 sections, 13 equations, 12 figures, 13 tables)

Figures (12)

  • Figure 1: False Positive Rates (FPR) and False Negative Rates (FNR) of Xception, EfficientNet-B4, and ViT-B on four unforeseen Deepfake datasets: DFDC, DFR, WDF, and FFIW.
  • Figure 2: Prediction score distributions of real and fake faces on four unseen Deepfake datasets: DFDC, DFR, WDF, and FFIW. The three rows correspond to three Deepfake detectors: Xception, EfficientNet, and ViT-B. The domain gaps between Deepfake datasets primarily affect the detection of forged faces rather than real faces.
  • Figure 3: Average cross-dataset AUC score across six unseen datasets vs. the number of activated parameters. Our OSDFD method achieves the best generalizability with the fewest trainable parameters.
  • Figure 4: Overview of the designed face forgery detection framework. (a) Overall model structure; (b) Structure of the designed transformer block; (c) Details of the designed adapter layer; (d) Details of the designed LoRA layer.
  • Figure 5: Illustration of the Central-Difference Convolution (CDC) pipeline.
  • ...and 7 more figures