Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou
TL;DR
Delocate tackles the challenge of detecting and localizing Deepfake videos with tampering traces located at random facial regions across unseen domains. It proposes a two-stage framework: a Recovering for Consistency Learning stage that pretrains a masked autoencoder on real faces with ROI-based masking to learn facial-part consistency, and a Localization for Discrepancy Learning stage that uses meta-learning and an encoder–decoder with a mapping module to detect and localize tampered regions by exploiting reconstruction discrepancies. The method jointly optimizes classification and localization losses under a meta-learning regime to enhance cross-domain generalization, achieving superior cross-domain detection and localization on multiple benchmarks while maintaining strong intra-domain performance. Overall, Delocate provides interpretable localization cues and robust detection for unknown-domain Deepfakes, advancing practical Deepfake forensic capabilities.
Abstract
Deepfake videos are becoming increasingly realistic, showing few tampering traces on facial areas that vary between frames. Consequently, existing Deepfake detection methods struggle to detect unknown domain Deepfake videos while accurately locating the tampered region. To address this limitation, we propose Delocate, a novel Deepfake detection model that can both recognize and localize unknown domain Deepfake videos. Our method consists of two stages named recovering and localization. In the recovering stage, the model randomly masks regions of interest (ROIs) and reconstructs real faces without tampering traces, leading to a relatively good recovery effect for real faces and a poor recovery effect for fake faces. In the localization stage, the output of the recovery phase and the forgery ground truth mask serve as supervision to guide the forgery localization process. This process strategically emphasizes the recovery phase of fake faces with poor recovery, facilitating the localization of tampered regions. Our extensive experiments on four widely used benchmark datasets demonstrate that Delocate not only excels in localizing tampered areas but also enhances cross-domain detection performance.
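The intuition behind the two stages can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`mask_rois`, `discrepancy_map`, `localize`), the box format for ROIs, and the fixed threshold are all assumptions made for illustration. In particular, the paper reconstructs faces with a masked autoencoder and learns localization under supervision from the forgery ground truth mask, whereas this sketch takes the recovered image as given and simply thresholds the per-pixel reconstruction error.

```python
import numpy as np


def mask_rois(face, rois, rng, n_masked=1, fill=0.0):
    """Blank out `n_masked` randomly chosen ROI boxes.

    `face` is an (H, W, C) float array and `rois` is a list of hypothetical
    (top, left, bottom, right) boxes, e.g. eyes, nose, mouth.
    Returns the masked copy and a boolean (H, W) mask of the hidden pixels.
    """
    masked = face.copy()
    mask = np.zeros(face.shape[:2], dtype=bool)
    chosen = rng.choice(len(rois), size=n_masked, replace=False)
    for i in chosen:
        top, left, bottom, right = rois[i]
        masked[top:bottom, left:right] = fill
        mask[top:bottom, left:right] = True
    return masked, mask


def discrepancy_map(face, recovered):
    """Per-pixel mean absolute error between a face and its recovery."""
    diff = np.abs(face.astype(np.float64) - recovered.astype(np.float64))
    return diff.mean(axis=-1)


def localize(face, recovered, threshold):
    """Flag pixels whose recovery error exceeds `threshold`.

    A fixed threshold is a stand-in (assumption) for the learned
    localization stage described in the abstract.
    """
    return discrepancy_map(face, recovered) > threshold
```

Under these assumptions, a face that is recovered well yields an almost empty localization map, while a face whose recovery diverges in one region yields a map concentrated on that region, which mirrors the "good recovery for real faces, poor recovery for fake faces" behaviour the abstract relies on.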
