Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization
Zijie Lou, Gang Cao, Kun Guo, Haochen Zhu, Lifang Yu
TL;DR
This work tackles image forgery localization by addressing the poorly structured feature space of pixel embeddings. It introduces Multi-view Pixel-wise Contrastive (MPC) learning, which pre-trains a high-resolution backbone with a supervised contrastive loss from within-image, cross-scale, and cross-modality perspectives, and then fine-tunes a localization head with cross-entropy. The approach yields a well-organized pixel feature space with improved intra-class compactness and inter-class separability, and it demonstrates superior generalization across diverse datasets and robustness to complex post-processing, including online social network transformations. MPC achieves state-of-the-art or competitive performance with a lightweight model and shows strong qualitative results on both traditional tampering and AI-generated manipulations. The method promises practical forensic utility in real-world scenarios where forgeries vary in scale and images may have undergone post-processing.
Abstract
Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationships between pixels in the feature space. To address this deficiency, we propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization. Specifically, we first pre-train the backbone network with a supervised contrastive loss to model pixel relationships from three perspectives: within-image, cross-scale and cross-modality. This pre-training aims to increase intra-class compactness and inter-class separability. The localization head is then fine-tuned using the cross-entropy loss, resulting in a better pixel localizer. MPC is trained on training datasets of three different scales to enable a comprehensive and fair comparison with existing image forgery localization algorithms. Extensive experiments on the small-, medium- and large-scale training datasets show that the proposed MPC achieves better generalization and greater robustness against post-processing than state-of-the-art methods. Code will be available at https://github.com/multimediaFor/MPC.
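
To make the idea of a supervised pixel-wise contrastive loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes an InfoNCE-style formulation, which the abstract does not spell out, and it covers only the within-image view: pixels sampled from one image pull toward same-label pixels and push away from the others. The function name, the temperature value and the toy embeddings are all hypothetical, and plain Python lists are used for readability where a real implementation would use a tensor library.

```python
import math

def supervised_pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style supervised contrastive loss over sampled pixel embeddings.

    embeddings: list of equal-length float lists, one per sampled pixel.
    labels: list of ints (0 = pristine, 1 = tampered), one per pixel.
    The temperature default is an illustrative assumption.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    total, anchors = 0.0, 0
    n = len(normed)
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class pixel is skipped
        # Temperature-scaled similarity of anchor i to every other pixel.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp over all other pixels.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Mean negative log-probability assigned to the positives.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy 2-D pixel embeddings: the same four vectors, ordered two ways.
labels = [0, 0, 1, 1]
separated = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # classes form clusters
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]      # classes interleaved

loss_separated = supervised_pixel_contrastive_loss(separated, labels)
loss_mixed = supervised_pixel_contrastive_loss(mixed, labels)
print(loss_separated < loss_mixed)
```

The loss is small when same-class pixels already cluster together and large when classes are interleaved, which is the sense in which minimizing it encourages intra-class compactness and inter-class separability. Under the same assumption, the cross-scale and cross-modality views could be sketched by drawing the candidate embeddings from other feature scales or modalities instead of from the same feature map.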
