Multi-Scale Cross-Fusion and Edge-Supervision Network for Image Splicing Localization

Yakun Niu, Pei Chen, Lei Zhang, Hongjian Yin, Qi Chang

TL;DR

This work tackles image splicing localization by integrating multi-scale cross-fusion of RGB and noise-domain features with edge-aware supervision. It introduces a dual-branch backbone (RGB and NoisePrint++) whose features are fused via Cross-Scale Fusion and Cross-Domain Fusion using CondConv, augmented by an Edge Mask Prediction module and an edge-guided, attention-based localization head. The approach jointly optimizes pixel-level forgery masks and edge masks with dedicated loss terms, and reports state-of-the-art performance on Columbia, CASIA, and NIST16, together with robustness to image resizing and Gaussian noise. The combination of scale-aware feature fusion, boundary-focused edge information, and progressive supervision improves detection accuracy and boundary integrity for ISL, supporting practical forensics workflows.
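
The joint optimization of forgery masks and edge masks mentioned above can be illustrated with a small sketch. This is not the authors' loss: the use of binary cross-entropy for both terms, the function names `bce` and `joint_loss`, and the weighting parameter `edge_weight` are all assumptions made for illustration, since the summary does not state the exact loss terms.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over two equally shaped 2-D lists.

    pred holds probabilities in [0, 1]; target holds 0/1 labels.
    """
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def joint_loss(mask_pred, mask_gt, edge_pred, edge_gt, edge_weight=0.5):
    """Hypothetical joint objective: mask term plus a weighted edge term.

    edge_weight is an assumed hyperparameter, not a value from the paper.
    """
    return bce(mask_pred, mask_gt) + edge_weight * bce(edge_pred, edge_gt)


if __name__ == "__main__":
    mask_gt = [[0, 1], [1, 0]]
    edge_gt = [[0, 1], [1, 0]]
    confident = [[0.05, 0.95], [0.95, 0.05]]
    uncertain = [[0.5, 0.5], [0.5, 0.5]]
    # A confident, correct prediction should give a lower loss than an
    # uninformative one.
    print(joint_loss(confident, mask_gt, confident, edge_gt))
    print(joint_loss(uncertain, mask_gt, uncertain, edge_gt))
```

Keeping the two terms separate makes it possible to tune how strongly boundary errors are penalized relative to region errors, which is one plausible reading of "dedicated loss terms".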

Abstract

Image Splicing Localization (ISL) is a fundamental yet challenging task in digital forensics. Although current approaches have achieved promising performance, they exploit edge information insufficiently, which leads to incomplete localized regions and a high false-alarm rate. To tackle this problem, we propose a multi-scale cross-fusion and edge-supervision network for ISL. Specifically, our framework consists of three key steps: multi-scale feature cross-fusion, edge mask prediction, and edge-supervised localization. First, we feed the RGB image and its noise image into a segmentation network to learn multi-scale features, which are then aggregated via cross-scale fusion followed by cross-domain fusion to enhance the feature representation. Second, we design an edge mask prediction module to mine reliable boundary artifacts. Finally, the cross-fused features and the edge mask information are integrated via an attention mechanism to progressively supervise and facilitate model training. Extensive experiments on publicly available datasets demonstrate that our proposed method outperforms state-of-the-art schemes.
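
Supervising an edge mask requires an edge target. The abstract does not say how that target is obtained, so the following is only a sketch of one common convention: marking every spliced pixel that touches the background through a 4-neighbour. The function name `edge_mask_from_forgery_mask` and the 4-neighbour rule are assumptions, not details from the paper.

```python
def edge_mask_from_forgery_mask(mask):
    """Derive a binary edge mask from a binary forgery mask.

    A pixel is marked as an edge if it is foreground (1) and at least one
    in-bounds 4-neighbour is background (0). This equals the mask minus
    its erosion, with out-of-bounds neighbours ignored.
    """
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] == 0:
                    edge[y][x] = 1
                    break
    return edge


if __name__ == "__main__":
    # Toy 5x5 mask with a 3x3 spliced region in the centre.
    forgery = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in edge_mask_from_forgery_mask(forgery):
        print(row)
```

In practice the boundary is often thickened by a few pixels so that the edge class is less sparse; whether the authors do so is not stated here.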

Paper Structure

This paper contains 17 sections, 5 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The proposed network architecture for splicing forgery localization.
  • Figure 2: Analysis of robustness against image resize and Gaussian noise.
  • Figure 3: Examples of localization for different methods.