DepthFake: a depth-based strategy for detecting Deepfake videos

Luca Maiano; Lorenzo Papa; Ketbjano Vocaj; Irene Amerini

TL;DR

Depth maps make an effective contribution to deepfake detection on robust pre-trained architectures: the proposed RGBD approach achieves an average improvement of 3.20%, and up to 11.7% for some deepfake attacks, with respect to standard RGB architectures on the FaceForensics++ dataset.

Abstract

Fake content has grown at an incredible rate over the past few years. The spread of social media and online platforms makes its large-scale dissemination increasingly accessible to malicious actors. In parallel, due to the growing diffusion of fake image generation methods, many Deep Learning-based detection techniques have been proposed. Most of those methods rely on extracting salient features from RGB images and use a binary classifier to decide whether the image is fake or real. In this paper, we propose DepthFake, a study on how to improve classical RGB-based approaches with depth maps. The depth information is extracted from RGB images with recent monocular depth estimation techniques. Here, we demonstrate the effective contribution of depth maps to the deepfake detection task on robust pre-trained architectures. The proposed RGBD approach achieves an average improvement of 3.20%, and up to 11.7% for some deepfake attacks, with respect to standard RGB architectures on the FaceForensics++ dataset.
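
Feeding a four-channel RGBD input to an architecture pre-trained on three-channel RGB images requires widening its first convolutional layer. The abstract does not say how this is done, so the sketch below shows only one common option, stated as an assumption: the kernel for the new depth channel is initialized as the mean of the pre-trained R, G and B kernels. The function name `extend_first_conv` and the nested-list weight layout are hypothetical and chosen to keep the example free of external dependencies.

```python
def extend_first_conv(weights):
    """Extend first-layer conv weights from 3 to 4 input channels.

    weights[o][c][i][j]: output channel o, input channel c (R, G, B),
    kernel position (i, j). The added depth-channel kernel is the mean of
    the three RGB kernels (an assumed initialization, not taken from the paper).
    The input is left unmodified; a new nested list is returned.
    """
    extended = []
    for filt in weights:
        if len(filt) != 3:
            raise ValueError("expected 3 input channels per filter")
        r, g, b = filt
        depth_kernel = [
            [(r[i][j] + g[i][j] + b[i][j]) / 3.0 for j in range(len(r[i]))]
            for i in range(len(r))
        ]
        # Copy the pre-trained RGB kernels and append the new depth kernel.
        rgb_copy = [[row[:] for row in kernel] for kernel in filt]
        extended.append(rgb_copy + [depth_kernel])
    return extended
```

With this kind of initialization the pre-trained RGB filters are preserved, and the depth channel starts from filters on the same scale as the existing ones instead of from random values.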

Paper Structure (13 sections, 1 equation, 3 figures, 3 tables)

This paper contains 13 sections, 1 equation, 3 figures, 3 tables.

Introduction
Related Works
Deepfake Detection
Monocular Depth Estimation
Proposed Method
Depth Estimation
Deepfake Detection
Implementation Details
Results
Deepfake detection
Preliminary studies on inference time
Conclusions and future work
Acknowledgments

Figures (3)

Figure 1: Examples of inconsistencies introduced in the depth maps of manipulated faces. Deepfake faces tend to have fewer details than the original ones.
Figure 2: Pipeline of the proposed method. In the first step, we estimate the depth for each frame. Then, we extract the face and crop the frame and depth map around the face. In the last step, we train a classifier on RGBD input features.
Figure 3: Accuracy on Deepfake (DF), Face2Face (F2F), FaceSwap (FS), NeuralTextures (NT) and all classes in the dataset (FULL) with RGB, RGBD, Gray and GrayD inputs.
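
The crop-and-stack step described in Figure 2, together with the four input variants compared in Figure 3, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names (`crop`, `to_gray`, `build_input`), the bounding-box format, and the BT.601 luminance weights used for the grayscale conversion are all assumptions. The depth map is taken as already estimated by some monocular depth model, and the face box as already produced by some face detector.

```python
# Hypothetical conventions: frame[y][x] = (r, g, b) with values in [0, 1],
# depth[y][x] = float in [0, 1], both stored as row-major nested lists.

def crop(image, box):
    """Crop a row-major image to box = (top, left, height, width)."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def to_gray(pixel):
    """Luminance of an (r, g, b) pixel with ITU-R BT.601 weights (assumed choice)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def build_input(frame, depth, face_box, mode="RGBD"):
    """Crop frame and depth map around the face, then stack channels per pixel.

    mode is one of "RGB", "RGBD", "Gray", "GrayD".
    Returns an H x W x C nested list with C = 3, 4, 1 or 2 respectively.
    """
    if mode not in ("RGB", "RGBD", "Gray", "GrayD"):
        raise ValueError(f"unknown mode: {mode}")
    face = crop(frame, face_box)
    face_depth = crop(depth, face_box)
    use_gray = mode.startswith("Gray")
    use_depth = mode.endswith("D")
    out = []
    for color_row, depth_row in zip(face, face_depth):
        row = []
        for pixel, d in zip(color_row, depth_row):
            channels = [to_gray(pixel)] if use_gray else list(pixel)
            if use_depth:
                channels.append(d)  # depth becomes the last channel
            row.append(channels)
        out.append(row)
    return out


# Usage on a tiny synthetic 4x4 frame with a 2x2 "face" box.
frame = [[(x / 4, y / 4, 0.5) for x in range(4)] for y in range(4)]
depth = [[0.25 * y for _ in range(4)] for y in range(4)]
rgbd = build_input(frame, depth, (1, 1, 2, 2), mode="RGBD")
```

In practice an array library would replace the nested lists; the point here is only that the frame and the depth map are cropped with the same box, so the depth channel stays aligned with the colour channels it is stacked on.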
