The Deepfake Detection Challenge (DFDC) Preview Dataset

Brian Dolhansky, Russ Howes, Ben Pflaum, Nicole Baram, Cristian Canton Ferrer

TL;DR

This paper presents a preview of the Deepfake Detection Challenge (DFDC) dataset, a diverse collection of about 5K videos with facial manipulations, created with the consent of the recorded actors. It details the data collection and preprocessing used to ensure variability, defines evaluation metrics that reflect the real-world prevalence of deepfakes through a weighted precision, and establishes simple baseline detectors (TamperNet and XceptionNet variants) as reference performance. The preview demonstrates dataset properties, processing pipelines, and baseline performance to guide subsequent DFDC releases. The work aims to accelerate research in deepfake detection by providing a realistic benchmark resource and clear evaluation criteria.

Abstract

In this paper, we introduce a preview of the Deepfake Detection Challenge (DFDC) dataset, consisting of 5K videos featuring two facial modification algorithms. A data collection campaign was carried out in which participating actors agreed to the use and manipulation of their likenesses in the creation of the dataset. Diversity along several axes (gender, skin tone, age, etc.) was considered, and actors recorded videos with arbitrary backgrounds, which adds visual variability. Finally, a set of specific metrics for evaluating performance has been defined, and two existing models for detecting deepfakes have been tested to provide a reference performance baseline. The DFDC dataset preview can be downloaded at: deepfakedetectionchallenge.ai
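
The weighted precision mentioned in the TL;DR can be illustrated with a minimal sketch. It assumes a form in which false positives are multiplied by a weight alpha, so that a detector is penalized as if real videos vastly outnumbered fakes, as they do in practice. The function names, the default alpha value, and the example counts are illustrative assumptions, not values taken from this page; the paper's metrics section is the authority on the exact definition.

```python
def weighted_precision(tp: int, fp: int, alpha: float = 100.0) -> float:
    """Precision with false positives up-weighted by alpha.

    Assumed form: TP / (TP + alpha * FP). alpha = 1 gives ordinary
    precision; larger alpha simulates a setting where real videos are
    far more common than fakes. The default of 100 is an assumption.
    """
    denom = tp + alpha * fp
    return tp / denom if denom > 0 else 0.0


def recall(tp: int, fn: int) -> float:
    """Standard recall: fraction of fake videos that were detected."""
    denom = tp + fn
    return tp / denom if denom > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical confusion counts for a detector on a balanced test set.
    tp, fp, fn = 90, 10, 10
    print("precision (alpha=1):  ", weighted_precision(tp, fp, alpha=1.0))
    print("weighted precision:   ", weighted_precision(tp, fp, alpha=100.0))
    print("recall:               ", recall(tp, fn))
```

With these hypothetical counts, ordinary precision looks strong while the weighted value drops sharply, which is the point of weighting: a small false-positive rate on a balanced test set becomes a large number of false alarms once real videos dominate.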

Paper Structure

This paper contains 5 sections, 1 equation, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Some example face swaps from the dataset.