Exposing Image Splicing Traces in Scientific Publications via Uncertainty-guided Refinement

Xun Lin, Wenzhong Tang, Haoran Wang, Yizhong Liu, Yakun Ju, Shuai Wang, Zitong Yu

TL;DR

This work tackles the detection of splicing traces in scientific images, where disruptive factors and limited annotated data hinder robust detection. It introduces an Uncertainty-guided Refinement Network (URN) with two stages: stage 1 makes coarse predictions and uses Monte Carlo Dropout to identify uncertain regions, and stage 2 refines those predictions under the guidance of the estimated uncertainty. Two modules, Uncertainty-Guided Graph Convolution (UGGC) and Uncertainty-Enhanced Manipulation Attention (UEMA), help propagate and refine information robustly. The authors also present SciSp, a new dataset comprising SciSp-C (spliced images collected from PubPeer) and SciSp-H (handcrafted spliced images), which expands diversity and realism beyond prior datasets. Comprehensive experiments show higher accuracy and robustness than competing methods under degradation and inpainting attacks, cross-dataset generalization, and domain shifts, highlighting URN’s practical value for preserving scientific integrity. The work provides a framework for uncertainty-aware forensic detection and a benchmark resource for future research in scientific image forensics.
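To make the stage-1 idea concrete, the following is a minimal NumPy sketch of Monte Carlo Dropout for pixel-level uncertainty. It is not the authors' implementation. The function `mc_dropout_uncertainty`, the toy head `toy_logits`, the number of passes, the dropout rate, and the use of per-pixel variance as the uncertainty measure are all assumptions made for illustration.

```python
import numpy as np

def mc_dropout_uncertainty(logit_fn, features, n_passes=20, drop_prob=0.5, seed=0):
    """Run several stochastic forward passes with dropout kept active and
    return the per-pixel mean probability and per-pixel variance."""
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_passes):
        # Sample a fresh dropout mask for every pass (inverted dropout scaling).
        keep = rng.random(features.shape) >= drop_prob
        dropped = features * keep / (1.0 - drop_prob)
        logits = logit_fn(dropped)
        probs.append(1.0 / (1.0 + np.exp(-logits)))  # sigmoid -> splice probability
    probs = np.stack(probs)          # shape: (n_passes, H, W)
    mean_prob = probs.mean(axis=0)   # coarse prediction
    variance = probs.var(axis=0)     # assumed uncertainty measure
    return mean_prob, variance

def toy_logits(feats):
    # Hypothetical stand-in for a segmentation head: sum over the channel axis.
    return feats.sum(axis=0)

# Toy feature map with 8 channels on a 4x4 grid (illustrative data only).
features = np.random.default_rng(1).normal(size=(8, 4, 4))
coarse, uncertainty = mc_dropout_uncertainty(toy_logits, features)
```

Pixels where the stochastic passes disagree receive a high variance. A second stage could then focus its refinement on those pixels, which is the role the summary attributes to uncertainty-guided refinement.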

Abstract

Recently, a surge in scientific publications suspected of image manipulation has led to numerous retractions, bringing the issue of image integrity into sharp focus. Although research on forensic detectors for image plagiarism and image synthesis exists, the detection of image splicing traces in scientific publications remains unexplored. Compared to image duplication and synthesis, image splicing detection is more challenging due to the lack of reference images and the typically small tampered areas. Furthermore, disruptive factors in scientific images, such as artifacts from digital compression, abnormal patterns, and noise from physical operations, present misleading features like splicing traces, significantly increasing the difficulty of this task. Moreover, the scarcity of high-quality datasets of spliced scientific images limits potential advancements. In this work, we propose an Uncertainty-guided Refinement Network (URN) to mitigate the impact of these disruptive factors. Our URN can explicitly suppress the propagation of unreliable information flow caused by disruptive factors between regions, thus obtaining robust splicing features. Additionally, the URN is designed to concentrate improvements in uncertain prediction areas during the decoding phase. We also construct a dataset for image splicing detection (SciSp) containing 1,290 spliced images. Compared to existing datasets, SciSp includes the largest number of spliced images and the most diverse sources. Comprehensive experiments conducted on three benchmark datasets demonstrate the superiority of our approach. We also validate the URN's generalisability in resisting cross-dataset domain shifts and its robustness against various post-processing techniques, including advanced deep-learning-based inpainting.

Paper Structure

This paper contains 9 sections, 9 equations, 11 figures, and 5 tables.

Figures (11)

  • Figure 1: (a) Differences between spliced scientific images (top two rows) and a natural image (bottom row), with example splicing detection results predicted by our method and by a SoTA natural-image manipulation detection method, MVSS. (b) Disruptive factors in scientific images, such as artifacts, abnormal patterns, and noise. The green arrows indicate regions affected by disruptive factors, where SoTA manipulation detectors designed for natural images tend to give false alarms.
  • Figure 2: Overall structure of the proposed URN. It consists of two stages: (a) the stage-1 network makes coarse predictions of spliced regions and estimates the pixel-level uncertainty about coarse predictions, and (b) the stage-2 uncertainty-guided refinement network can refine coarse predictions under the guidance of uncertainty information.
  • Figure 3: Qualitative results across Biofors, RSIID, SciSp-C, and SciSp-H. From left to right, each column represents images, ground truths (GT), and predictions of competing methods, respectively.
  • Figure 4: Robustness analysis results on SciSp-C against four different degradations, i.e., (a) Gaussian blur, (b) Gaussian noise, (c) JPEG compression, and (d) recapture with the Windows Snipping Tool.
  • Figure 5: Example of using an advanced AI-based inpainting scheme, LaMa, to reduce splicing traces in the first two panels. The arrows indicate the splicing traces. The source image is from Biofors.
  • ...and 6 more figures