Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning

Yuanman Li, Yingjie He, Changsheng Chen, Li Dong, Bin Li, Jiantao Zhou, Xia Li

TL;DR

To address the generalization gap in CMFD, where copied regions may be unseen during training or may belong to the background, the paper proposes D2PRL, an end-to-end framework that couples dense-field matching (DFM), implemented as a differentiable cross-scale PatchMatch (PM), with pairwise ranking learning (PRL) for source/target discrimination (STD). The DFM branch operates on high-resolution features and cross-scale scores $S^{k}$ to produce offset maps $\boldsymbol{\delta}$, while the STD branch warps features by these offsets, concatenates them with the original features, and applies a pairwise ranking loss to distinguish source from target. The approach includes multi-scale dense linear fitting with three diameters $\rho \in \{7,9,11\}$, a Dice-based localization loss, and a margin-based discrimination loss with a weight matrix $W$. The resulting system is trainable end to end and is reported to achieve state-of-the-art results on synthetic, CASIA CMFD, CMH, and CoMoFoD datasets. The method reduces reliance on object-centric features, improves robustness to rotation and scaling, and generalizes well to background copy-move cases, which matters for multimedia security and forensic analysis.
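The two loss terms named above can be illustrated with a minimal NumPy sketch. This summary does not give the paper's exact formulas, so the code uses the generic soft Dice loss and a generic weighted hinge (margin) ranking loss. The function names, the `margin` default, and the convention that a higher score means "more target-like" are assumptions made for illustration only.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Generic soft Dice loss between a predicted mask in [0, 1] and a binary mask."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)

def pairwise_ranking_loss(score, score_matched, weight, margin=1.0):
    """Weighted hinge ranking loss over matched pixel pairs (illustrative form).

    score         : (H, W) score at each pixel (assumed: higher = more target-like)
    score_matched : (H, W) score at the pixel each position is matched to
    weight        : (H, W) non-negative weights selecting which pairs count
    """
    # Penalize pairs whose score gap falls short of the margin.
    hinge = np.maximum(0.0, margin - (score - score_matched))
    return np.sum(weight * hinge) / (np.sum(weight) + 1e-6)
```

The ranking term only compares a pixel with its matched counterpart, which is why it depends on the offset maps produced by the matching branch rather than on absolute per-pixel scores.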

Abstract

Recent advances in deep learning algorithms have shown impressive progress in image copy-move forgery detection (CMFD). However, these algorithms lack generalizability in practical scenarios where the copied regions are not present in the training images, or the cloned regions are part of the background. Additionally, these algorithms utilize convolution operations to distinguish source and target regions, leading to unsatisfactory results when the target regions blend well with the background. To address these limitations, this study proposes a novel end-to-end CMFD framework that integrates the strengths of conventional and deep learning methods. Specifically, the study develops a deep cross-scale PatchMatch (PM) method that is customized for CMFD to locate copy-move regions. Unlike existing deep models, our approach utilizes features extracted from high-resolution scales to seek explicit and reliable point-to-point matching between source and target regions. Furthermore, we propose a novel pairwise rank learning framework to separate source and target regions. By leveraging the strong prior of point-to-point matches, the framework can identify subtle differences and effectively discriminate between source and target regions, even when the target regions blend well with the background. Our framework is fully differentiable and can be trained end-to-end. Comprehensive experimental results highlight the remarkable generalizability of our scheme across various copy-move scenarios, significantly outperforming existing methods.
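To make the role of point-to-point matches concrete, the sketch below pairs every pixel's feature vector with the feature vector at its matched location, so that a downstream classifier can compare the two directly. It is a simplification under stated assumptions: offsets are rounded to integers and gathered by nearest-neighbour indexing, whereas a trainable model would need a differentiable sampler. The function names `warp_by_offsets` and `pair_features` are hypothetical.

```python
import numpy as np

def warp_by_offsets(features, offsets):
    """Gather, for each pixel, the feature vector at its matched location.

    features : (H, W, C) feature map
    offsets  : (H, W, 2) offsets (dy, dx); pixel (i, j) is matched to (i + dy, j + dx)
    """
    h, w, _ = features.shape
    offsets = np.rint(offsets).astype(np.int64)  # simplification: integer offsets only
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp matched coordinates so out-of-range offsets still index a valid pixel.
    mi = np.clip(ii + offsets[..., 0], 0, h - 1)
    mj = np.clip(jj + offsets[..., 1], 0, w - 1)
    return features[mi, mj]

def pair_features(features, offsets):
    """Concatenate each pixel's features with those of its matched pixel: (H, W, 2C)."""
    return np.concatenate([features, warp_by_offsets(features, offsets)], axis=-1)
```

With a zero offset, a pixel is paired with itself, so both halves of the concatenated vector are identical; any difference between the halves therefore comes from the matched location.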

Paper Structure

This paper contains 32 sections, 40 equations, 14 figures, and 9 tables.

Figures (14)

  • Figure 1: The illustration of our framework. The top branch is used to localize copy-move regions via Deep Cross-Scale PM, and the bottom branch is used to differentiate the source and target regions via Pairwise Ranking Learning.
  • Figure 2: The framework of our deep cross-scale PM. The Initialization layer first generates a valid offset for each pixel. Then, the Propagation layer uses propagation and random search to generate $K$ candidate offsets for each pixel. In the Evaluation layer, the optimal matching score of each offset is calculated across feature maps of different scales; the best offset is chosen from the $K$ candidate offsets for each pixel (indicated by solid arrows), and the corresponding matching score is saved. Finally, the offset with the highest matching score is passed back to the Propagation layer.
  • Figure 3: Pixels used for propagation. Different letters mark the relative coordinates centered around $(i, j)$. Green and blue pixels propagate their offsets to pixel $(i, j)$. Blue pixels propagate directly, while green pixels use first-order predictors for propagation.
  • Figure 4: Visualization examples demonstrating the propagation of circular shift displacement to generate candidate offset maps $\boldsymbol{\delta}^a$ and $\boldsymbol{\delta}^e$ based on $\boldsymbol{\delta}$. Each color marks offsets that belong to the same column; it does not indicate that the offsets have the same value.
  • Figure 5: Visualization of offset maps and DLF maps. (a) copy-move images; (b) offset maps $\boldsymbol{\delta}_1$ ($x$ coordinate); (c) offset maps $\boldsymbol{\delta}_1$ ($y$ coordinate); (d) DLF maps with diameter $\rho = 7$.
  • ...and 9 more figures
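
The initialization, propagation, and evaluation loop described in the Figure 2 to Figure 4 captions can be sketched with a classical, single-scale PatchMatch in NumPy. This is not the paper's layer: it is non-differentiable, scores a single feature scale with cosine similarity, and propagates neighbours' offsets directly (no first-order predictors). Propagation is written as a circular shift of the offset map via `np.roll`, in the spirit of Figure 4. The names `match_score` and `patchmatch`, the `min_disp` threshold, and the candidate set are all assumptions.

```python
import numpy as np

def match_score(feat, offsets, min_disp=2):
    """Dot product between each pixel's feature and the feature its offset points to.

    feat is assumed L2-normalized, so the score is a cosine similarity.
    Out-of-bounds or near-zero offsets score -inf so they are never selected.
    """
    h, w, _ = feat.shape
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    ti, tj = ii + offsets[..., 0], jj + offsets[..., 1]
    valid = (ti >= 0) & (ti < h) & (tj >= 0) & (tj < w)
    valid &= np.abs(offsets).max(axis=-1) >= min_disp  # forbid trivial self-matches
    tgt = feat[np.clip(ti, 0, h - 1), np.clip(tj, 0, w - 1)]
    return np.where(valid, np.sum(feat * tgt, axis=-1), -np.inf)

def patchmatch(feat, iters=5, seed=0):
    """Return (offsets, best_score) after `iters` rounds of propagate / search / evaluate."""
    rng = np.random.default_rng(seed)
    h, w, _ = feat.shape
    feat = feat / (np.linalg.norm(feat, axis=-1, keepdims=True) + 1e-8)
    # Initialization: one random offset per pixel.
    offsets = np.stack([rng.integers(-(h - 1), h, size=(h, w)),
                        rng.integers(-(w - 1), w, size=(h, w))], axis=-1)
    best = match_score(feat, offsets)
    for _ in range(iters):
        # Propagation: circularly shift the offset map so each pixel sees
        # the current offset of one of its four neighbours.
        candidates = [np.roll(offsets, shift, axis=(0, 1))
                      for shift in [(1, 0), (-1, 0), (0, 1), (0, -1)]]
        # Random search: perturb the current offsets at shrinking radii.
        r = max(h, w)
        while r >= 1:
            candidates.append(offsets + rng.integers(-r, r + 1, size=offsets.shape))
            r //= 2
        # Evaluation: keep, per pixel, the candidate with the highest score.
        for cand in candidates:
            s = match_score(feat, cand)
            better = s > best
            offsets = np.where(better[..., None], cand, offsets)
            best = np.where(better, s, best)
    return offsets, best
```

Because a candidate replaces the current offset only when it scores strictly higher, the per-pixel score never decreases across iterations. Convergence to the true copy-move displacement is probabilistic and depends on the number of iterations and the random seed.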