Real-Time Deepfake Detection in the Real-World

Bar Cavia; Eliahu Horwitz; Tal Reiss; Yedid Hoshen

TL;DR

This work presents LaDeDa, a patch-based deepfake detector that limits its receptive field to $q\times q$ pixels ($9\times9$ in the reported configuration), scores each patch independently, and pools the patch scores into an image-level prediction. With patch-level information alone, it improves over the state of the art, reaching around 99% mAP on standard benchmarks. To enable practical deployment, the authors distill LaDeDa into Tiny-LaDeDa, a model with only four convolutional layers that has 375x fewer FLOPs and is 10,000x more parameter-efficient, at a minor cost in accuracy. They also argue that the prevailing simulated evaluation protocol does not reflect real-world performance, and they introduce WildRF, a deepfake dataset sourced from social media. LaDeDa achieves the top reported performance on WildRF (93.7% mAP), yet the remaining gap to perfect accuracy, together with the JPEG robustness analysis, shows that generalization to real-world deepfakes is still an open problem. The work advocates using WildRF for future evaluation and shows that compact models can run efficiently on edge devices.

Abstract

Recent improvements in generative AI have made synthesizing fake images easy; as they can be used to cause harm, it is crucial to develop accurate techniques to identify them. This paper introduces the "Locally Aware Deepfake Detection Algorithm" (LaDeDa), which accepts a single 9x9 image patch and outputs its deepfake score. The image deepfake score is the pooled score of its patches. With merely patch-level information, LaDeDa significantly improves over the state-of-the-art, achieving around 99% mAP on current benchmarks. Owing to the patch-level structure of LaDeDa, we hypothesize that the generation artifacts can be detected by a simple model. We therefore distill LaDeDa into Tiny-LaDeDa, a highly efficient model consisting of only 4 convolutional layers. Remarkably, Tiny-LaDeDa has 375x fewer FLOPs and is 10,000x more parameter-efficient than LaDeDa, allowing it to run efficiently on edge devices with a minor decrease in accuracy. These almost-perfect scores raise the question: is the task of deepfake detection close to being solved? Perhaps surprisingly, our investigation reveals that current training protocols prevent methods from generalizing to real-world deepfakes extracted from social media. To address this issue, we introduce WildRF, a new deepfake detection dataset curated from several popular social networks. Our method achieves the top performance of 93.7% mAP on WildRF; however, the large gap from perfect accuracy shows that reliable real-world deepfake detection is still unsolved.
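The patch-then-pool structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 9x9 patch size comes from the abstract, but the stride, the use of mean pooling, and the hand-written `toy_patch_score` function (a stand-in for the learned patch scorer) are all assumptions made for illustration.

```python
PATCH = 9  # patch side length, as stated in the abstract


def extract_patches(image, size=PATCH, stride=1):
    """Yield every size x size window of a 2D grayscale image (list of rows).

    The stride of 1 is an assumption; the source does not specify it.
    """
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield [row[left:left + size] for row in image[top:top + size]]


def toy_patch_score(patch):
    """Hypothetical stand-in for the learned scorer.

    Returns the mean absolute horizontal pixel difference of the patch,
    a crude measure of local high-frequency content.
    """
    diffs = [abs(row[i + 1] - row[i]) for row in patch for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def image_score(image, patch_score=toy_patch_score):
    """Pool patch scores into one image-level score (mean pooling assumed)."""
    scores = [patch_score(p) for p in extract_patches(image)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    flat = [[0.0] * 12 for _ in range(12)]
    checker = [[float((r + c) % 2) for c in range(12)] for r in range(12)]
    print(image_score(flat), image_score(checker))
```

Because each patch is scored without reference to the rest of the image, the scorer can only rely on local evidence, which is the property the abstract uses to motivate a much smaller distilled model.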

Paper Structure (47 sections, 2 equations, 8 figures, 7 tables)


Introduction
Related work
Deepfake detection by supervised learning.
Artifact-based detection methods
Method
LaDeDa: Locally Aware Deepfake Detection Algorithm
Relation to PatchFor.
Tiny-LaDeDa
Is the task of deepfake detection close to being solved?
Current: simulated deepfake detection protocol.
The simulated protocol is suboptimal.
WildRF: Aligning deepfake evaluation with the real-world.
Experiments
LaDeDa performance under the current (simulated) protocol
Poor generalization to real-world data.
...and 32 more sections

Figures (8)

Figure 1: Performance vs. efficiency trade-off. Comparison of the baselines' average precision on real-world data as a function of floating-point operations (FLOPs) at inference time.
Figure 2: LaDeDa Training. By limiting its receptive field to $q \times q$ pixels, LaDeDa yields a deepfake score for each $q \times q$ patch. The image-level deepfake score is the global pooling of the patch scores. We use binary cross-entropy loss between the image label and its deepfake score.
Figure 3: Tiny-LaDeDa Distillation. Pre-trained LaDeDa (teacher) transfers patch-level deepfake score knowledge to train Tiny-LaDeDa (student).
Figure 4: WildRF Overview. A realistic benchmark consisting of images sourced from popular social platforms: Reddit, X (Twitter) and Facebook. WildRF contains high variability in a range of attributes including image resolutions, formats, semantic content, and transformations encountered in-the-wild.
Figure 5: (a) Local and global deepfake scores. We show average precision (AP) performance on WildRF when ensembling LaDeDa deepfake scores with CLIP deepfake scores. (b) JPEG robustness. We show LaDeDa average precision (AP) performance on the Facebook test set as a function of JPEG compression quality, from 100 (no compression) to 30.
...and 3 more figures
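The distillation setup in the Figure 3 caption, where a pre-trained teacher transfers patch-level deepfake scores to a student, can be sketched as a regression problem. The sketch below is a toy under stated assumptions, not the paper's training code: the mean-squared-error loss, the linear student, the gradient-descent settings, and the hypothetical `teacher` function are all illustrative choices. The actual student is described as a network of 4 convolutional layers.

```python
def student_predict(weights, bias, features):
    """Linear student: weighted sum of patch features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def distill(patches, teacher, lr=0.1, epochs=1000):
    """Fit the student to the teacher's patch scores.

    Uses full-batch gradient descent on mean squared error, which is an
    assumed loss; the source does not state the distillation objective.
    """
    dim = len(patches[0])
    weights, bias = [0.0] * dim, 0.0
    targets = [teacher(p) for p in patches]  # teacher scores act as soft labels
    n = len(patches)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for features, target in zip(patches, targets):
            err = student_predict(weights, bias, features) - target
            for i in range(dim):
                grad_w[i] += 2.0 * err * features[i] / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


if __name__ == "__main__":
    # Hypothetical teacher: any function mapping patch features to a score.
    def teacher(p):
        return 0.5 * p[0] - 0.25 * p[1] + 0.1

    # Four toy "patches", each flattened to two features.
    patches = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    w, b = distill(patches, teacher)
    for p in patches:
        print(round(student_predict(w, b, p), 3), round(teacher(p), 3))
```

The point of the sketch is the supervision signal: the student never sees image labels, only the teacher's per-patch scores, so one training image supplies many regression targets.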
