Learning to Discover Forgery Cues for Face Forgery Detection
Jiahe Tian, Peng Chen, Cai Yu, Xiaomeng Fu, Xi Wang, Jiao Dai, Jizhong Han
TL;DR
This work tackles the challenge of interpretable, pixel-level forgery cue localization in face forgery detection without relying on paired real-forged data. It introduces Forgery Cue Discovery (FoCus), which uses a Classification Attentive Regions Proposal (CARP) module and a Complementary Learning (CL) framework to generate robust, exploitable manipulation maps from unpaired faces by fusing RGB and Sobel edge cues. Empirically, FoCus improves multi-task detection models across five datasets and demonstrates strong in-dataset and cross-dataset generalization, along with improved interpretability and robustness of the cues. The approach broadens training data scalability for forgery detection and offers a practical path toward more explainable and generalizable detectors.
Abstract
Locating manipulation maps, i.e., pixel-level annotations of forgery cues, is crucial for providing interpretable detection results in face forgery detection. Related learning objectives have also been widely adopted as auxiliary tasks to improve the classification performance of detectors, but they require comparisons between paired real and forged faces to obtain manipulation maps as supervision. This requirement restricts their applicability to unpaired faces and is at odds with real-world scenarios. Moreover, these comparison methods annotate all changed pixels, including noise introduced by compression and upsampling. Using such maps as supervision hinders the learning of exploitable cues and makes models prone to overfitting. To address these issues, we introduce a weakly supervised model, named Forgery Cue Discovery (FoCus), to locate forgery cues in unpaired faces. Unlike some detectors that claim to locate forged regions in attention maps, FoCus is designed to avoid their shortcoming of capturing only partial and inaccurate forgery cues. Specifically, we propose a classification attentive regions proposal module to locate forgery cues during classification and a complementary learning module to facilitate the learning of richer cues. The produced manipulation maps can serve as better supervision to enhance face forgery detectors. Visualizations of the manipulation maps produced by FoCus show superior interpretability and robustness compared to existing methods. Experiments on five datasets and four multi-task models demonstrate the effectiveness of FoCus in both in-dataset and cross-dataset evaluations.
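To make the limitation of paired comparison concrete, the sketch below shows one simple form such a comparison could take: a thresholded per-pixel absolute difference between a real face and its forged counterpart. This is an illustrative assumption, not the procedure of any specific prior method; the function name `paired_manipulation_map`, the grayscale representation, and the threshold value are all hypothetical.

```python
def paired_manipulation_map(real, forged, threshold=0.1):
    """Mark every pixel whose absolute difference exceeds `threshold`.

    `real` and `forged` are equally sized 2D lists of grayscale
    intensities in [0, 1]. Both the representation and the threshold
    are illustrative assumptions.
    """
    same_shape = len(real) == len(forged) and all(
        len(r) == len(f) for r, f in zip(real, forged)
    )
    if not same_shape:
        raise ValueError("real and forged must have the same shape")
    return [
        [1 if abs(r - f) > threshold else 0 for r, f in zip(real_row, forged_row)]
        for real_row, forged_row in zip(real, forged)
    ]


# Toy 3x3 example (values are made up for illustration).
real = [
    [0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2],
]
forged = [
    [0.2, 0.2, 0.35],  # top-right: small change, e.g. compression noise
    [0.2, 0.9, 0.2],   # centre: large change, e.g. genuine manipulation
    [0.2, 0.2, 0.2],
]

mask = paired_manipulation_map(real, forged)
for row in mask:
    print(row)
```

In this toy case the noisy pixel and the manipulated pixel receive the same label, because the comparison only measures how much a pixel changed, not why. The sketch also needs the real counterpart of every forged face, which is the pairing requirement described above.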
