Fine-Grained DINO Tuning with Dual Supervision for Face Forgery Detection

Tianxiang Zhang; Peipeng Yu; Zhihua Xia; Longchen Dai; Xiaoyu Zhou; Hui Gao

TL;DR

This work tackles the generalization limitations of current deepfake detectors by tuning DINOv2 with a DeepFake Fine-Grained Adapter (DFF-Adapter) that injects task-specific and shared low-rank adapters across all Transformer blocks. It introduces a Forgery-Aware Multi-Head Router to route subspace features to specialized LoRA experts and a Shared-Enhanced Task Fusion module to transfer fine-grained forgery cues to the authenticity task, all while keeping the backbone frozen. The approach achieves state-of-the-art or competitive results on intra- and cross-dataset benchmarks and across cross-manipulation scenarios, using only a small number of trainable parameters. This demonstrates strong generalization and practical potential for robust, efficient face forgery detection in real-world security settings.
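The idea of a frozen layer augmented by routed low-rank experts can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain softmax router over whole input vectors (the paper's Forgery-Aware Multi-Head Router operates on subspace features), and the names `LoRAExpert`, `RoutedLoRALinear`, `rank`, and `alpha` are hypothetical choices for illustration.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    top = max(z)
    e = [math.exp(x - top) for x in z]
    s = sum(e)
    return [x / s for x in e]


class LoRAExpert:
    """One low-rank update: delta(x) = (alpha / rank) * B (A x)."""

    def __init__(self, d_in, d_out, rank, alpha, rng):
        self.scale = alpha / rank
        # A gets a small random init; B starts at zero so the expert
        # contributes nothing until it is trained.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class RoutedLoRALinear:
    """Frozen linear map plus a gate-weighted sum of LoRA experts (assumed design)."""

    def __init__(self, weight, num_experts, rank, alpha=8.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen backbone weight, never updated
        self.experts = [LoRAExpert(d_in, d_out, rank, alpha, rng)
                        for _ in range(num_experts)]
        # Hypothetical router: one score row per expert.
        self.router = [[rng.gauss(0.0, 0.02) for _ in range(d_in)]
                       for _ in range(num_experts)]

    def gates(self, x):
        return softmax(matvec(self.router, x))

    def forward(self, x):
        y = matvec(self.weight, x)
        for g, expert in zip(self.gates(x), self.experts):
            y = [yi + g * di for yi, di in zip(y, expert.delta(x))]
        return y

    def num_trainable(self):
        """Count router and expert parameters; the frozen weight is excluded."""
        n = sum(len(r) for r in self.router)
        for e in self.experts:
            n += sum(len(r) for r in e.A) + sum(len(r) for r in e.B)
        return n
```

Because every `B` matrix starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the router and expert matrices count toward the trainable-parameter budget.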

Abstract

The proliferation of sophisticated deepfakes poses significant threats to information integrity. While DINOv2 shows promise for detection, existing fine-tuning approaches treat the task as generic binary classification, overlooking the distinct artifacts left by different deepfake methods. To address this, we propose a DeepFake Fine-Grained Adapter (DFF-Adapter) for DINOv2. Our method incorporates lightweight multi-head LoRA modules into every transformer block, enabling efficient backbone adaptation. DFF-Adapter jointly addresses authenticity detection and fine-grained manipulation type classification; classifying forgery methods enhances sensitivity to artifacts. We introduce a shared branch that propagates fine-grained manipulation cues to the authenticity head. This enables multi-task cooperative optimization, explicitly enhancing authenticity discrimination with manipulation-specific knowledge. Using only 3.5M trainable parameters, our parameter-efficient approach achieves detection accuracy comparable to, or even surpassing, that of current complex state-of-the-art methods.
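The dual supervision described above can be sketched as a weighted sum of two cross-entropy terms, one for the authenticity head and one for the manipulation-type head. This is a sketch under stated assumptions, not the paper's loss: the abstract does not give the exact objective, and the names `dual_supervision_loss` and `type_weight` are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(v - top) for v in logits))
    return [v - log_z for v in logits]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -log_softmax(logits)[target]


def dual_supervision_loss(auth_logits, type_logits, is_fake, manipulation_id,
                          type_weight=1.0):
    """Assumed objective: authenticity loss plus weighted manipulation-type loss.

    auth_logits:     two scores (real, fake) from the authenticity head
    type_logits:     one score per manipulation class from the fine-grained head
    is_fake:         0 for real, 1 for fake
    manipulation_id: index of the manipulation class
    type_weight:     hypothetical balance factor between the two tasks
    """
    auth_loss = cross_entropy(auth_logits, is_fake)
    type_loss = cross_entropy(type_logits, manipulation_id)
    return auth_loss + type_weight * type_loss
```

Setting `type_weight=0.0` recovers plain binary training, which makes it easy to compare against a detector trained without the fine-grained signal.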
