Forensics Adapter: Unleashing CLIP for Generalizable Face Forgery Detection

Xinjie Cui; Yuezun Li; Delong Zhu; Jiaran Zhou; Junyu Dong; Siwei Lyu

TL;DR

This work introduces Forensics Adapter, a lightweight adapter placed alongside CLIP that learns task-specific forgery traces, namely the blending boundaries of forged faces, and uses an interaction strategy to steer CLIP toward forgery-relevant knowledge. With only 5.7M trainable parameters, it improves cross-dataset performance by about 7% AUC on average across five standard datasets. An extended version, Forensics Adapter++, adds a forgery-aware prompt learning scheme that exploits the textual modality, yielding a further improvement of about 1.3%. Together, the two methods offer a strong baseline for CLIP-based face forgery detection that generalizes across diverse forgery distributions.

Abstract

We describe Forensics Adapter, an adapter network designed to transform CLIP into an effective and generalizable face forgery detector. Although CLIP is highly versatile, adapting it for face forgery detection is non-trivial because forgery-related knowledge is entangled with a wide range of unrelated knowledge. Existing methods treat CLIP merely as a feature extractor and lack task-specific adaptation, which limits their effectiveness. To address this, we introduce an adapter, guided by task-specific objectives, that learns face forgery traces: the blending boundaries unique to forged faces. We then enhance the CLIP visual tokens with a dedicated interaction strategy that communicates knowledge between CLIP and the adapter. Because the adapter sits alongside CLIP, CLIP's versatility is largely retained, which naturally supports strong generalizability in face forgery detection. With only 5.7M trainable parameters, our method achieves a significant performance boost, improving by approximately 7% on average across five standard datasets. Additionally, we describe Forensics Adapter++, an extended method that incorporates the textual modality via a newly proposed forgery-aware prompt learning strategy. This extension yields a further 1.3% performance boost over the original Forensics Adapter. We believe the proposed methods can serve as a baseline for future CLIP-based face forgery detection methods. The code has been released at https://github.com/OUC-VAS/ForensicsAdapter.
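
The abstract does not state how a blending boundary is computed. As a minimal sketch, assume the formulation used in the Face X-ray line of work: given a soft blending mask M with values in [0, 1] (1 where the pasted face dominates, 0 where the background dominates), the boundary map is B = 4 · M · (1 − M). This may differ from the target the authors actually supervise. The function name `blending_boundary` and the toy mask below are hypothetical and for illustration only.

```python
from typing import List

Mask = List[List[float]]


def blending_boundary(mask: Mask) -> Mask:
    """Map a soft blending mask M (values in [0, 1]) to B = 4 * M * (1 - M).

    B is 0 where a pixel is purely background (M = 0) or purely pasted
    face (M = 1), and peaks at 1 where the two sources mix equally (M = 0.5).
    This follows the Face X-ray formulation, which is an assumption here and
    not confirmed as the Forensics Adapter training target.
    """
    boundary: Mask = []
    for row in mask:
        out_row = []
        for m in row:
            if not 0.0 <= m <= 1.0:
                raise ValueError(f"mask value {m} is outside [0, 1]")
            out_row.append(4.0 * m * (1.0 - m))
        boundary.append(out_row)
    return boundary


# Toy 1x5 mask (illustrative values only): background on the left,
# pasted face on the right, with a soft transition in between.
toy_mask = [[0.0, 0.25, 0.5, 0.75, 1.0]]
print(blending_boundary(toy_mask))  # [[0.0, 0.75, 1.0, 0.75, 0.0]]
```

Under this assumption, a pristine face yields an all-zero map because no blending occurred, while a forged face shows a ring of high values around the pasted region. That gives a detector a target that does not depend on how the forged content itself was generated.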
