Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection

Davide Alessandro Coccomini, Roberto Caldelli, Claudio Gennaro, Giuseppe Fiameni, Giuseppe Amato, Fabrizio Falchi

TL;DR

A learning approach that aims to significantly improve the generalization of deepfake detectors by training only on pristine images, a portion of which are injected with crafted frequency patterns that simulate the effects of various deepfake generation techniques without being specific to any.

Abstract

Deepfake detectors are typically trained on large sets of pristine and generated images, which limits their generalization capacity: they excel at identifying deepfakes created through methods encountered during training but struggle with those generated by unknown techniques. This paper introduces a learning approach aimed at significantly enhancing the generalization capabilities of deepfake detectors. Our method takes inspiration from the unique "fingerprints" that image generation processes consistently introduce into the frequency domain. These fingerprints manifest as structured and distinctly recognizable frequency patterns. We propose to train detectors using only pristine images, injecting crafted frequency patterns into part of them to simulate the effects of various deepfake generation techniques without being specific to any. These synthetic patterns are based on generic shapes, grids, or auras. We evaluated our approach using diverse architectures across 25 different generation methods. The models trained with our approach achieved state-of-the-art deepfake detection and also demonstrated superior generalization capabilities in comparison with previous methods. Indeed, they are not tied to any specific generation technique and can effectively identify deepfakes regardless of how they were made.

Paper Structure (19 sections, 6 equations, 9 figures, 2 tables)


Figures (9)

  • Figure 1: The fingerprint extraction process presented in uninagans. A set of images obtained from the same generator undergoes a denoising process, and the results are transformed to the frequency domain by the Fourier Transform. The square of the Fourier Transform is averaged to obtain the generator's unique fingerprint.
  • Figure 2: Overview of synthetically generated patterns. For each category of pattern, the first row indicates the pattern generated in the spatial domain, the second is the magnitude of its Fourier Transform, the third row shows the pristine image on which the pattern has been applied, and the last row shows its Fourier Transform.
  • Figure 3: The process of applying a pattern to an image. The Fourier Transforms of both the image and the pattern are calculated first. From them, we derive the magnitude and phase of the image and of the pattern. The pattern's magnitude is added to the magnitude of each image channel. Combining these components, we obtain the image with the pattern applied.
  • Figure 4: The proposed training procedure. During the training data loading phase, each image in the current batch is randomly selected for pattern injection. If a pattern is injected, the image is labeled fake; otherwise it remains pristine.
  • Figure 5: Statistics of the constructed test set: number of images per generation method with their subjects (a), percentage of pristine and fake images (b), and number of generators per datasets' authors (c).
  • ...and 4 more figures
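
The captions of Figures 3 and 4 describe the injection and labeling steps in words. A minimal NumPy sketch of those two steps is given here as an illustration under stated assumptions, not as the authors' implementation. The function names (`make_grid_pattern`, `inject_pattern`, `load_batch`) and the parameters `period`, `strength`, and `p_inject` are hypothetical. The sketch also assumes that the modified magnitude is recombined with the image's original phase and mapped back with an inverse Fourier Transform. The grid pattern stands in for the "generic shapes, grids, or auras" mentioned in the abstract.

```python
import numpy as np


def make_grid_pattern(height, width, period=8, amplitude=1.0):
    """Hypothetical spatial-domain grid: lines every `period` pixels."""
    pattern = np.zeros((height, width), dtype=np.float64)
    pattern[::period, :] = amplitude
    pattern[:, ::period] = amplitude
    return pattern


def inject_pattern(image, pattern, strength=1.0):
    """Add the pattern's Fourier magnitude to each channel's magnitude.

    `image` has shape (H, W, C) and `pattern` has shape (H, W).
    `strength` is an assumed scaling factor, not taken from the paper.
    """
    image = image.astype(np.float64)
    pattern_mag = np.abs(np.fft.fft2(pattern))
    out = np.empty_like(image)
    for c in range(image.shape[2]):
        spectrum = np.fft.fft2(image[:, :, c])
        magnitude = np.abs(spectrum)
        phase = np.angle(spectrum)
        # Assumption: keep the image phase, change only the magnitude.
        new_spectrum = (magnitude + strength * pattern_mag) * np.exp(1j * phase)
        # The imaginary part is numerical noise for real-valued inputs.
        out[:, :, c] = np.real(np.fft.ifft2(new_spectrum))
    return out


def load_batch(images, pattern, p_inject=0.5, rng=None):
    """Randomly inject the pattern; injected images get label 1 (fake)."""
    rng = np.random.default_rng() if rng is None else rng
    batch, labels = [], []
    for img in images:
        if rng.random() < p_inject:
            batch.append(inject_pattern(img, pattern))
            labels.append(1)  # treated as fake
        else:
            batch.append(img.astype(np.float64))
            labels.append(0)  # remains pristine
    return np.stack(batch), np.array(labels)


# Usage with random stand-in data (no real images are involved).
rng = np.random.default_rng(0)
images = rng.random((4, 16, 16, 3))
pattern = make_grid_pattern(16, 16, period=4)
batch, labels = load_batch(images, pattern, p_inject=0.5, rng=rng)
```

In this sketch the detector never sees a generated image: the "fake" class consists entirely of pristine images whose spectra have been altered. The injection probability and pattern strength would need to be tuned, and the source summary does not state the values the authors used.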