WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection

Bojia Zi, Minghao Chang, Jingjing Chen, Xingjun Ma, Yu-Gang Jiang

TL;DR

WildDeepfake addresses the gap between laboratory-created deepfakes and real-world online footage by providing a diverse, web-sourced benchmark. The authors propose Attention-based Deepfake Detection Networks (ADDNets) that use multi-layer attention masks to reweight features for both image- and sequence-level detection. Through systematic experiments on six datasets, the study shows existing detectors underperform on WildDeepfake while ADDNets recover a substantial portion of this gap, especially in image-level detection. The dataset and methods offer a practical path toward robust deepfake defenses in real-world settings.

Abstract

In recent years, the abuse of a face swap technique called deepfake has raised enormous public concern. A large number of deepfake videos (known as "deepfakes") have been crafted and uploaded to the internet, calling for effective countermeasures. One promising countermeasure is deepfake detection. Several deepfake datasets, such as DeepfakeDetection and FaceForensics++, have been released to support the training and testing of deepfake detectors. While this has greatly advanced deepfake detection, most of the real videos in these datasets are filmed with a few volunteer actors in limited scenes, and the fake videos are crafted by researchers using a few popular deepfake software tools. Detectors developed on these datasets may become less effective against real-world deepfakes on the internet. To better support detection against real-world deepfakes, we introduce a new dataset, WildDeepfake, which consists of 7,314 face sequences extracted from 707 deepfake videos collected entirely from the internet. WildDeepfake is a small dataset that can be used, in addition to existing datasets, to develop and test the effectiveness of deepfake detectors against real-world deepfakes. We conduct a systematic evaluation of a set of baseline detection networks on both existing datasets and WildDeepfake, and show that WildDeepfake is indeed a more challenging dataset, on which detection performance can decrease drastically. We also propose two Attention-based Deepfake Detection Networks (ADDNets), one 2D and one 3D, that leverage attention masks on real/fake faces for improved detection. We empirically verify the effectiveness of ADDNets on both existing datasets and WildDeepfake. The dataset is available at: https://github.com/OpenTAI/wild-deepfake.
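To make the idea of reweighting features with an attention mask concrete, here is a minimal NumPy sketch. It is not the ADDNet implementation: the function names `reweight_features` and `resize_mask`, the element-wise multiplication, and the nearest-neighbour resizing are all assumptions chosen for illustration. It shows only how one spatial mask could scale feature maps of different resolutions.

```python
import numpy as np


def reweight_features(features: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Scale every spatial location of a feature map by an attention weight.

    features: array of shape (H, W, C) taken from some network layer.
    mask:     array of shape (H, W) with weights, e.g. in [0, 1].
    NOTE: plain element-wise multiplication is an assumption made for this
    sketch, not a formula taken from the paper.
    """
    if features.shape[:2] != mask.shape:
        raise ValueError("mask must match the spatial size of features")
    # Add a trailing axis so the mask broadcasts over the channel dimension.
    return features * mask[..., np.newaxis]


def resize_mask(mask: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize so one mask can be reused at several layers."""
    rows = np.arange(out_h) * mask.shape[0] // out_h
    cols = np.arange(out_w) * mask.shape[1] // out_w
    return mask[np.ix_(rows, cols)]


# Hypothetical example: a 4x4 mask that keeps only the central 2x2 region.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

features = np.ones((4, 4, 3))
weighted = reweight_features(features, mask)

# The same mask, downsampled for a coarser 2x2 feature map.
coarse_mask = resize_mask(mask, 2, 2)
coarse_weighted = reweight_features(np.ones((2, 2, 5)), coarse_mask)
```

Resizing the mask per layer is one simple way to apply it at multiple depths; a learned or landmark-derived mask would plug into `reweight_features` in the same way.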

Paper Structure

This paper contains 19 sections, 2 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: WildDeepfake: a challenging real-world dataset for deepfake detection.
  • Figure 2: Illustration of the face swap process.
  • Figure 3: WildDeepfake versus 5 existing datasets. There are more diverse scenes in WildDeepfake and the fake faces look more realistic, reflecting the challenging real-world scenario. To protect privacy, we block the eye regions of the fake images.
  • Figure 4: A feature-perspective comparison of 6 deepfake datasets. We use an ImageNet-pretrained ResNetV2-101 network to extract features and t-SNE (van der Maaten and Hinton, 2008) for dimensionality reduction.
  • Figure 5: The structures of our ADDNet detection networks. The input size of the 2D ADDNet is $W \times H \times C$, and that of the 3D ADDNet is $L \times W \times H \times C$, where $W$ is the input width, $H$ the input height, $C$ the number of channels, and $L$ the sequence length.
  • ...and 1 more figure
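
The input sizes in the Figure 5 caption can be illustrated with a short NumPy sketch. The helper name `stack_frames` and the concrete sizes (224, 224, 3, and a sequence length of 8) are hypothetical values chosen for the example, not settings reported in the paper. The sketch shows only how $L$ face crops of shape $W \times H \times C$ combine into one $L \times W \times H \times C$ sequence input.

```python
import numpy as np


def stack_frames(frames):
    """Stack L face crops of shape (W, H, C) into one (L, W, H, C) array."""
    shapes = {f.shape for f in frames}
    if len(shapes) != 1:
        raise ValueError("all frames must share one (W, H, C) shape")
    return np.stack(frames, axis=0)


# Hypothetical sizes, for illustration only.
W, H, C, L = 224, 224, 3, 8

frame = np.zeros((W, H, C), dtype=np.float32)  # one image-level (2D) input
sequence = stack_frames([frame] * L)           # one sequence-level (3D) input

print(frame.shape)     # (224, 224, 3)
print(sequence.shape)  # (8, 224, 224, 3)
```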