Deepfake detection in videos with multiple faces using geometric-fakeness features

Kirill Vyshegorodtsev, Dmitry Kudiyarov, Alexander Balashov, Alexander Kuzmin

TL;DR

This research proposes geometric-fakeness features (GFF), which characterize the dynamic degree of a face's presence in a video together with its per-frame deepfake scores, and trains a complex deep learning model on them that outputs a final deepfake prediction.

Abstract

Due to the development of facial manipulation techniques in recent years, deepfake detection in video streams has become an important problem for face biometrics, brand monitoring, and online video conferencing solutions. In biometric authentication, replacing a real data stream with a deepfake can bypass a liveness detection system. With a deepfake in a video conference, an attacker can gain access to a private meeting. Deepfakes of victims or public figures can also be used by fraudsters for blackmail, extortion, and financial fraud. Therefore, detecting deepfakes is important for ensuring privacy and security. The performance of existing deepfake detection approaches deteriorates when multiple faces are present in a video simultaneously or when other objects are erroneously classified as faces. In our research we propose geometric-fakeness features (GFF) that characterize the dynamic degree of a face's presence in a video together with its per-frame deepfake scores. To analyze temporal inconsistencies in GFFs between frames, we train a complex deep learning model that outputs a final deepfake prediction. We apply our approach to videos in which multiple faces are present simultaneously. Such videos often occur in practice, e.g., in an online video conference. In this case, real faces appearing in a frame together with a deepfake face significantly affect deepfake detection, and our approach counters this problem. Through extensive experiments we demonstrate that our approach outperforms current state-of-the-art methods on popular benchmark datasets such as FaceForensics++, DFDC, Celeb-DF and WildDeepFake. The proposed approach remains accurate when trained to detect multiple different deepfake generation techniques.

Paper Structure

This paper contains 10 sections, 2 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Example of false positives of the model on background faces [45].
  • Figure 2: Examples of real faces and deepfakes present together. Averaging the scores over all the faces in the video yields a low score. (a) Deepfake in a Zoom meeting [3]; (b) meeting at a conference (video from the DFD dataset [16]).
  • Figure 3: An example of the distribution of faces in a video.
  • Figure 4: Geometric-fakeness features for all faces in a video. For each face the first column (marked green) represents a sequence of per-frame geometric features (i.e. a relative area of a frame occupied by this face) and the second column (marked blue) represents a sequence of fakeness features (a detached prediction of the model that this face is a deepfake).
  • Figure 5: Architecture for computing geometric-fakeness features of faces in a video. Faces in the video are detected, and face embeddings are computed to group the faces from different frames by person. To obtain fakeness features, the faces are fed into DNNBlocks. Geometric features are calculated using the provided formula and concatenated with the fakeness features to form a GFF matrix.
  • ...and 1 more figure
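
The GFF layout described in the captions of Figures 4 and 5 can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that faces have already been detected and grouped by person, that the geometric feature is the face box area divided by the frame area (the paper's exact formula is not reproduced on this page), and that frames where a face is absent are filled with zeros. The names `relative_area`, `build_gff_matrix` and the `tracks` structure are hypothetical.

```python
import numpy as np


def relative_area(box, frame_w, frame_h):
    """Fraction of the frame covered by a face box (x1, y1, x2, y2).

    Assumption: plain box area over frame area, with the box clipped
    to the frame. The paper's own formula may differ.
    """
    x1, y1, x2, y2 = box
    x1, x2 = max(0.0, x1), min(float(frame_w), x2)
    y1, y2 = max(0.0, y1), min(float(frame_h), y2)
    width = max(0.0, x2 - x1)
    height = max(0.0, y2 - y1)
    return (width * height) / (frame_w * frame_h)


def build_gff_matrix(tracks, num_frames, frame_w, frame_h):
    """Build a (num_frames, 2 * num_faces) matrix.

    tracks maps a face id to {frame_index: (box, fake_score)}.
    For each face, the first column holds the geometric feature and
    the second column holds the per-frame fakeness score. Frames in
    which the face is absent stay at zero (an assumption of this sketch).
    """
    face_ids = sorted(tracks)
    gff = np.zeros((num_frames, 2 * len(face_ids)), dtype=np.float32)
    for col, face_id in enumerate(face_ids):
        for t, (box, score) in tracks[face_id].items():
            gff[t, 2 * col] = relative_area(box, frame_w, frame_h)
            gff[t, 2 * col + 1] = score
    return gff


# Hypothetical two-frame video, 100x100 pixels, with two grouped faces.
tracks = {
    "background": {0: ((60, 60, 70, 70), 0.1)},
    "speaker": {0: ((0, 0, 50, 50), 0.9), 1: ((0, 0, 50, 50), 0.8)},
}
gff = build_gff_matrix(tracks, num_frames=2, frame_w=100, frame_h=100)
print(gff)
```

Keeping one column pair per face, rather than averaging scores over all faces, preserves the information that a single large face is consistently scored as fake even when several real faces share the frame, which is the failure case the Figure 2 caption points to.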