Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning

Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei

TL;DR

FreqNet tackles the challenge of generalizable deepfake detection under limited training data by injecting frequency-domain learning into a lightweight CNN. It introduces two plugins—a High-Frequency Representation and a Frequency Convolutional Layer—to force learning in the frequency space and learn source-agnostic features, achieving state-of-the-art generalization across 17 GAN models with only 1.9M parameters. The approach yields substantial gains over prior methods, including a +9.8% improvement in mean accuracy on real-world scenes and strong face-image performance, while requiring far fewer parameters than large baselines. This work demonstrates that targeted frequency-space learning can significantly improve robustness to unseen synthesis models, with practical implications for scalable, real-world deepfake detection.

Abstract

This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images despite limited training data. Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries. However, the rapid advancements in synthesis technology have led to specific artifacts for each generation model. Consequently, these detectors have exhibited a lack of proficiency in learning the frequency domain and tend to overfit to the artifacts present in the training data, leading to suboptimal performance on unseen sources. To address this issue, we introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors. Our method forces the detector to continuously focus on high-frequency information, exploiting high-frequency representation of features across spatial and channel dimensions. Additionally, we incorporate a straightforward frequency domain learning module to learn source-agnostic features. It involves convolutional layers applied to both the phase spectrum and amplitude spectrum between the Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (iFFT). Extensive experimentation involving 17 GANs demonstrates the effectiveness of our proposed method, showcasing state-of-the-art performance (+9.8%) while requiring fewer parameters. The code is available at https://github.com/chuangchuangtan/FreqNet-DeepfakeDetection.
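The frequency domain learning module described in the abstract (convolutions on the amplitude and phase spectra, placed between an FFT and an iFFT) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `conv1x1` and `frequency_conv_layer`, the use of pointwise (1x1) convolutions, and the ReLU on the amplitude branch are assumptions made for illustration, and the weights are fixed arrays rather than trained parameters. The linked repository is the reference for the actual layer.

```python
import numpy as np

def conv1x1(x, weight):
    """Pointwise (1x1) convolution: mixes channels at every position.

    x: array of shape (C_in, H, W); weight: array of shape (C_out, C_in).
    """
    return np.einsum("oc,chw->ohw", weight, x)

def frequency_conv_layer(features, w_amp, w_phase):
    """Hypothetical frequency conv layer on a real (C, H, W) feature map."""
    # 1. Move to the frequency domain over the two spatial axes.
    spectrum = np.fft.fft2(features, axes=(-2, -1))

    # 2. Split the complex spectrum into amplitude and phase.
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # 3. Convolve each spectrum separately (assumed: 1x1 kernels).
    #    The ReLU keeps the amplitude non-negative (an assumption).
    amplitude = np.maximum(conv1x1(amplitude, w_amp), 0.0)
    phase = conv1x1(phase, w_phase)

    # 4. Recombine and return to the spatial domain. The modified spectrum
    #    is generally not Hermitian-symmetric, so only the real part is kept.
    spectrum = amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum, axes=(-2, -1)))
```

With identity matrices for both weights, the layer leaves the feature map unchanged, which is a convenient sanity check that the FFT, the amplitude/phase split, and the iFFT round-trip correctly.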

Paper Structure (18 sections, 8 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Frequency space learning network. (a) Prior studies are usually limited to exploiting frequency-level artifacts. (b) In contrast to prior frequency-based research, our approach focuses on the frequency-related attributes of the features within the detector. It keeps a continuous emphasis on high-frequency details within the classifier by using high-frequency representations of the feature maps across both spatial and channel dimensions. Additionally, it introduces a trainable layer embedded within the frequency domain to learn source-agnostic features.
  • Figure 2: Frequency analysis on various sources. The mean FFT spectrum is averaged over 2,000 images, following the methodology of Frank et al.
  • Figure 3: Architecture of FreqNet for generalizable deepfake detection. To improve generalization, FreqNet enhances frequency spectrum information and prioritizes frequency domain learning within the classifier. It consists of (a) High-Frequency Representation of Image (HFRI), (b) High-Frequency Representation of Feature (HFRF), and (c) Frequency Conv Layer (FCL).
  • Figure 4: Visualization of Class Activation Maps (CAM; Zhou et al., 2016) extracted from the detector on face images.
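
The high-frequency representations named in Figure 3 (HFRI and HFRF) rest on one idea: transform to the frequency domain, suppress the low-frequency band, and transform back. The sketch below shows one way to do that in NumPy along any chosen axes, so the same helper covers the spatial and the channel dimension. The function name `high_pass`, the box-shaped mask, and the `cutoff` parameter are hypothetical choices for illustration. The filter shape and cutoff used by FreqNet are not stated in this section.

```python
import numpy as np

def high_pass(x, axes, cutoff=0.25):
    """Remove the low-frequency band of a real array along `axes`.

    cutoff is the fraction of each axis length, on either side of the
    zero frequency, that is zeroed out (an illustrative parameter).
    """
    axes = tuple(a % x.ndim for a in axes)

    # FFT over the chosen axes, shifted so the zero frequency is centered.
    spectrum = np.fft.fftshift(np.fft.fftn(x, axes=axes), axes=axes)

    # Build a box mask that is True only inside the centered low band.
    low = np.ones(x.shape, dtype=bool)
    for axis in axes:
        n = x.shape[axis]
        center = n // 2                      # index of the zero frequency
        half = int(n * cutoff)               # half-width of the removed band
        band = np.abs(np.arange(n) - center) <= half
        shape = [1] * x.ndim
        shape[axis] = n
        low &= band.reshape(shape)

    spectrum[low] = 0.0                      # drop low frequencies

    # Undo the shift and return to the original domain.
    restored = np.fft.ifftn(np.fft.ifftshift(spectrum, axes=axes), axes=axes)
    return np.real(restored)

# Spatial high-pass on a (C, H, W) array: high_pass(x, axes=(-2, -1))
# Channel-wise high-pass on the same array: high_pass(x, axes=(0,))
```

A constant input contains only the zero frequency, so it maps to zeros, while a checkerboard pattern at the highest spatial frequency passes through unchanged.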