X2-DFD: A framework for eXplainable and eXtendable Deepfake Detection

Yize Chen; Zhiyuan Yan; Guangliang Cheng; Kangran Zhao; Siwei Lyu; Baoyuan Wu

TL;DR

The paper addresses the explainability gap in deepfake detection with X2-DFD, a three-stage framework. Model Feature Assessment (MFA) first ranks forgery-related features by how well the MLLM can detect them. Strong Feature Strengthening (SFS) then reinforces the features the model already handles well, while Weak Feature Supplementing (WFS) covers the weak ones using Specific Feature Detectors. Finally, the MLLM is fine-tuned with LoRA on the resulting VQA-style dataset. The authors report more reliable explanations and more robust detection across diverse datasets, supported by cross-dataset experiments, ablations, and human and GPT-4o explainability evaluations. The design is plug-and-play, so future MLLMs and detectors can be incorporated, and its interpretable justifications are intended to support user trust in real-world use.

Abstract

This paper proposes X2-DFD, an eXplainable and eXtendable framework based on multimodal large language models (MLLMs) for deepfake detection, consisting of three key stages. The first stage, Model Feature Assessment, systematically evaluates the detectability of forgery-related features for the MLLM, generating a prioritized ranking of features based on their intrinsic importance to the model. The second stage, Explainable Dataset Construction, consists of two key modules: Strong Feature Strengthening, which is designed to enhance the model's existing detection and explanation capabilities by reinforcing its well-learned features, and Weak Feature Supplementing, which addresses gaps by integrating specific feature detectors (e.g., low-level artifact analyzers) to compensate for the MLLM's limitations. The third stage, Fine-tuning and Inference, involves fine-tuning the MLLM on the constructed dataset and deploying it for final detection and explanation. By integrating these three stages, our approach enhances the MLLM's strengths while supplementing its weaknesses, ultimately improving both detectability and explainability. Extensive experiments and ablations, followed by a comprehensive human study, validate the improved performance of our approach compared to the original MLLMs. More encouragingly, our framework is designed to be plug-and-play, allowing it to integrate seamlessly with more advanced future MLLMs and specific feature detectors, leading to continual improvement and extension to face the challenges of rapidly evolving deepfakes.
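
The first stage can be pictured with a minimal sketch. It assumes that each candidate feature has been turned into a yes/no question, that the MLLM's answers on labeled real and fake images are already collected, and that features are ranked by balanced accuracy and split at a fixed cutoff. The scoring metric, the cutoff `top_k`, and all function names here are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Tuple


def balanced_accuracy(preds: List[bool], labels: List[bool]) -> float:
    """Mean of the true-positive rate (fake) and true-negative rate (real).

    True means "fake" in both preds and labels (an assumed convention).
    """
    n_fake = sum(labels)
    n_real = len(labels) - n_fake
    if n_fake == 0 or n_real == 0:
        raise ValueError("need both real and fake samples")
    tp = sum(p and l for p, l in zip(preds, labels))
    tn = sum((not p) and (not l) for p, l in zip(preds, labels))
    return 0.5 * (tp / n_fake + tn / n_real)


def rank_features(
    answers: Dict[str, List[bool]], labels: List[bool]
) -> List[Tuple[str, float]]:
    """Score every feature and return (name, score) pairs, best first."""
    scores = {name: balanced_accuracy(p, labels) for name, p in answers.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def split_strong_weak(
    ranking: List[Tuple[str, float]], top_k: int
) -> Tuple[List[str], List[str]]:
    """Treat the top_k ranked features as strong and the rest as weak."""
    strong = [name for name, _ in ranking[:top_k]]
    weak = [name for name, _ in ranking[top_k:]]
    return strong, weak
```

Under these assumptions, the strong list would feed Strong Feature Strengthening and the weak list would indicate where a specific feature detector is needed.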

Paper Structure (24 sections, 3 equations, 3 figures, 6 tables)

Introduction
Related Work
Conventional Deepfake Detection
Deepfake Detection via Multimodal Large Language Model
Method
Model Feature Assessment (MFA)
Strong Feature Strengthening (SFS)
Weak Feature Supplementing (WFS)
Model Finetune and Inference
Experiment
Experimental Setup
Datasets.
Evaluation Metrics.
Implementation Details.
Generalizability Evaluation
...and 9 more sections

Figures (3)

Figure 1: High-level overview of our framework, consisting of three key stages: (1) Model Feature Assessment (MFA) evaluates and ranks the forgery-related features (e.g., blending artifacts) to generate a feature set; (2) Strong Feature Strengthening (SFS) enhances the model's strong features for improved detection and explanation, while Weak Feature Supplementing (WFS) leverages a Specific Feature Detector (SFD) to compensate for the model's weak features, ultimately yielding an explainable dataset; and (3) the MLLM is fine-tuned using the dataset and then used for inference.
Figure 2: The diagram shows that pretrained models (e.g., LLaVA) effectively distinguish real from fake content using semantic features (e.g., skin tone, contour), but perform poorly with signal features (e.g., blending, lighting).
Figure 3: A comprehensive breakdown of the three-stage methodology for $\mathcal{X}^2$-DFD. In Stage 1, an automated procedure for forgery-related feature generation, evaluation, and ranking is implemented within the MFA (Model Feature Assessment) module. Stage 2 incorporates the SFS (Strong Feature Strengthening) module, which automates the generation of explanatory annotations for a fine-tuning dataset of real and fake images by leveraging strong features, alongside the WFS (Weak Feature Supplementing) module, which employs a specific feature detector to produce explanations for weak features. Stage 3 entails model fine-tuning and inference, enabling the model to draw on both its strong features and its supplemented weak features for improved detection and explanation.
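
As a rough illustration of how Stage 2 might combine the two modules into one VQA-style training record, the sketch below merges strong-feature notes (SFS) with a detector score (WFS) into a single answer string. The prompt text, record fields, answer wording, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical prompt; the paper's actual question wording is not given here.
QUESTION = "Is this image real or fake? Explain your answer."


def build_sample(
    image_path: str,
    is_fake: bool,
    strong_notes: List[str],
    detector_name: str,
    detector_score: float,
    threshold: float = 0.5,
) -> Dict[str, str]:
    """Assemble one question/answer record for fine-tuning.

    strong_notes stands in for SFS annotations; detector_score stands in
    for the output of a specific feature detector used by WFS.
    """
    verdict = "fake" if is_fake else "real"
    parts = [f"This image is {verdict}."]
    if strong_notes:
        parts.append("Observed cues: " + "; ".join(strong_notes) + ".")
    flagged = detector_score >= threshold
    parts.append(
        f"The {detector_name} detector score is {detector_score:.2f}, "
        f"which {'suggests' if flagged else 'does not suggest'} forgery artifacts."
    )
    return {"image": image_path, "question": QUESTION, "answer": " ".join(parts)}
```

A record built this way keeps the verdict, the model's own cues, and the external detector evidence in one answer, which is one plausible way to let fine-tuning expose the model to both sources.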
