X2-DFD: A framework for eXplainable and eXtendable Deepfake Detection
Yize Chen, Zhiyuan Yan, Guangliang Cheng, Kangran Zhao, Siwei Lyu, Baoyuan Wu
TL;DR
The paper addresses the explainability gap in deepfake detection with X2-DFD, a three-stage framework built on multimodal large language models (MLLMs). It first ranks forgery-related features by how well the MLLM can detect them (Model Feature Assessment, MFA). It then builds a VQA-style dataset that reinforces the well-detected features (Strong Feature Strengthening, SFS) and supplements the poorly detected ones with outputs from Specific Feature Detectors (Weak Feature Supplementing, WFS). Finally, it fine-tunes the MLLM with LoRA on that dataset so that it both detects deepfakes and explains its decisions. Combining MLLMs with Specific Feature Detectors yields more reliable explanations and more robust detection across diverse datasets, as shown by cross-dataset experiments, ablations, and explainability evaluations by both human raters and GPT-4o. The design is plug-and-play and extendable, so future MLLMs and detectors can be incorporated. Beyond higher accuracy, the framework provides interpretable justifications for its predictions, which matters for user trust and adoption in real-world settings.
Abstract
This paper proposes X2-DFD, an eXplainable and eXtendable framework based on multimodal large language models (MLLMs) for deepfake detection, consisting of three key stages. The first stage, Model Feature Assessment, systematically evaluates how well the MLLM can detect each forgery-related feature and produces a ranking of features by their importance to the model. The second stage, Explainable Dataset Construction, consists of two key modules: Strong Feature Strengthening, which enhances the model's existing detection and explanation capabilities by reinforcing its well-learned features, and Weak Feature Supplementing, which integrates specific feature detectors (e.g., low-level artifact analyzers) to compensate for the MLLM's limitations. The third stage, Fine-tuning and Inference, fine-tunes the MLLM on the constructed dataset and deploys it for final detection and explanation. By integrating these three stages, our approach enhances the MLLM's strengths while supplementing its weaknesses, ultimately improving both detection performance and explainability. Extensive experiments and ablations, followed by a comprehensive human study, validate the improved performance of our approach compared to the original MLLMs. More encouragingly, our framework is designed to be plug-and-play, allowing it to integrate seamlessly with more advanced future MLLMs and specific feature detectors, leading to continual improvement and extension in the face of rapidly evolving deepfakes.
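To make the first two stages more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that Model Feature Assessment scores each feature by the balanced accuracy of the MLLM's yes/no answers on labeled images, that a fixed threshold separates strong from weak features, and that a training sample is a question/answer pair. The feature names, the threshold of 0.6, the function names, and the answer template are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeatureScore:
    feature: str
    balanced_acc: float


def balanced_accuracy(answers, labels):
    """Mean of true-positive and true-negative rates; True means 'fake'."""
    pairs = list(zip(answers, labels))
    pos = sum(1 for _, y in pairs if y)
    neg = len(pairs) - pos
    tp = sum(1 for a, y in pairs if a and y)
    tn = sum(1 for a, y in pairs if not a and not y)
    tpr = tp / pos if pos else 0.0
    tnr = tn / neg if neg else 0.0
    return 0.5 * (tpr + tnr)


def rank_features(responses, labels):
    """Stage 1 (assumed form): rank features by how well the MLLM detects them."""
    scores = [FeatureScore(f, balanced_accuracy(a, labels))
              for f, a in responses.items()]
    return sorted(scores, key=lambda s: s.balanced_acc, reverse=True)


def split_strong_weak(ranked, threshold=0.6):
    """Hypothetical split: features at or above the threshold count as strong."""
    strong = [s.feature for s in ranked if s.balanced_acc >= threshold]
    weak = [s.feature for s in ranked if s.balanced_acc < threshold]
    return strong, weak


def build_vqa_sample(image_path, is_fake, strong, detector_prob):
    """Stage 2 (assumed form): one VQA-style training sample.

    Strong features go into the explanation text (strengthening); the score
    of an external specific feature detector is appended (supplementing).
    """
    verdict = "fake" if is_fake else "real"
    answer = f"This image is {verdict}."
    if is_fake and strong:
        answer += " Visible cues: " + ", ".join(strong) + "."
    answer += (" A specific feature detector reports a fake probability of "
               f"{detector_prob:.2f}.")
    return {
        "image": image_path,
        "question": "Is this image real or fake? Explain why.",
        "answer": answer,
    }


# Toy data: the MLLM's yes/no answers per feature on four labeled images.
labels = [True, True, False, False]
responses = {
    "blending artifacts": [True, False, True, False],
    "blurry facial boundary": [True, True, False, False],
    "unnatural skin texture": [True, False, False, False],
}

ranked = rank_features(responses, labels)
strong, weak = split_strong_weak(ranked)
sample = build_vqa_sample("example.png", True, strong, 0.91)
```

On this toy data, the two features the model answers reliably end up in the explanation text, while the weakly detected one is left to the external detector. The resulting samples would then feed the third stage, fine-tuning the MLLM, which this sketch does not cover.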
