Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection

Yichen Lu, Siwei Nie, Minlong Lu, Xudong Yang, Xiaobo Zhang, Peng Zhang

TL;DR

This work proposes PixTrace, a pixel coordinate tracking module that maintains explicit spatial mappings across editing transformations, and introduces CopyNCE, a geometrically-guided contrastive loss that regularizes patch affinity using overlap ratios derived from PixTrace's verified mappings.

Abstract

Image Copy Detection (ICD) aims to identify manipulated content between image pairs through robust feature representation learning. While self-supervised learning (SSL) has advanced ICD systems, existing view-level contrastive methods struggle with sophisticated edits due to insufficient fine-grained correspondence learning. We address this limitation by exploiting the inherent geometric traceability in edited content through two key innovations. First, we propose PixTrace, a pixel coordinate tracking module that maintains explicit spatial mappings across editing transformations. Second, we introduce CopyNCE, a geometrically-guided contrastive loss that regularizes patch affinity using overlap ratios derived from PixTrace's verified mappings. Our method bridges pixel-level traceability with patch-level similarity learning, suppressing supervision noise in SSL training. Extensive experiments demonstrate not only state-of-the-art performance (88.7% uAP / 83.9% RP90 for the matcher, 72.6% uAP / 68.4% RP90 for the descriptor on the DISC21 dataset) but also better interpretability than existing methods.
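
To make the idea of an explicit spatial mapping concrete, the following is a minimal sketch of a coordinate table, not the authors' implementation. It assumes a table is an array whose entry at position (m, n) stores the source coordinate (i, j) in the original image, that edits are limited to a crop and a horizontal flip, and that `-1` marks untracked pixels. All function names (`identity_table`, `crop`, `hflip`, `invert_table`) are hypothetical.

```python
import numpy as np

def identity_table(h, w):
    """Table for an unedited image: entry (m, n) stores (m, n)."""
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([rows, cols], axis=-1)  # shape (h, w, 2)

def crop(table, top, left, h, w):
    """Cropping the image crops its coordinate table the same way."""
    return table[top:top + h, left:left + w].copy()

def hflip(table):
    """Flipping the image horizontally flips the table's columns."""
    return table[:, ::-1].copy()

def invert_table(table, src_shape):
    """Reverse a table: if table[m, n] == (i, j), then inverse[i, j] == (m, n).

    Source pixels that no edited pixel points to stay at (-1, -1).
    """
    inv = np.full((*src_shape, 2), -1, dtype=int)
    h, w = table.shape[:2]
    for m in range(h):
        for n in range(w):
            i, j = table[m, n]
            if i >= 0:  # skip untracked pixels
                inv[i, j] = (m, n)
    return inv

# Original image I_o is 4x6; I_a and I_b are two different edits of it.
t_o = identity_table(4, 6)
t_ao = hflip(crop(t_o, top=1, left=2, h=2, w=3))  # I_a: crop, then flip
t_bo = crop(t_o, top=0, left=3, h=3, w=3)         # I_b: crop only

# Bridge through I_o: for each pixel of I_b, look up where its source
# pixel landed in I_a. (t_bo has no untracked entries in this toy case,
# so plain integer indexing is safe.)
inv_ao = invert_table(t_ao, src_shape=(4, 6))
t_ba = inv_ao[t_bo[..., 0], t_bo[..., 1]]         # shape (3, 3, 2)

print(tuple(t_ba[1, 0]))  # position in I_a of the pixel shown at I_b[1, 0]
print(tuple(t_ba[0, 0]))  # (-1, -1): this pixel of I_b does not appear in I_a
```

Pixels of `I_b` whose lookup returns `(-1, -1)` have no counterpart in `I_a`; under these assumptions, the remaining entries are the shared pixels from which patch-level overlap ratios could be counted.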

Paper Structure

This paper contains 39 sections, 14 equations, 12 figures, 16 tables, 1 algorithm.

Figures (12)

  • Figure 1: Traceability of pixels. Image B is a copy edit of image A. The edits include image matting, an affine transform, and color jitter. Pixels in the copied region can be traced back to the original image when the specific edit functions are available.
  • Figure 2: (a) Overview of PixTrace. Images $\text{I}_a$ and $\text{I}_b$ are both derived from $\text{I}_o$ through copy edits. Coordinate tables $\mathbb{T}_{ao}$ and $\mathbb{T}_{bo}$ describe the coordinate correspondence of each pixel in $\text{I}_a$ and $\text{I}_b$ relative to $\text{I}_o$. With $\text{I}_o$ as a bridge, pixels shared by $\text{I}_a$ and $\text{I}_b$ can also be tracked against one another. Note that the edits in this figure are for demonstration purposes; actual edits are significantly more complex. (b) Illustration of the reverse operation on table $\mathbb{T}$. After reversal, the coordinate $(i, j)$ stored at position $(m, n)$ determines where coordinate $(m, n)$ is placed: position $(i, j)$ of the reversed table holds $(m, n)$.
  • Figure 3: (a) Typical noise in heuristic matching methods. Match A: although it picks the nearest patch on the query, LocNN still produces a false match with no overlap, because the original patch has no counterpart on Query 1. Match B: neither LocNN nor FeatNN retrieves all positive patches. Match C: FeatNN can be misled by semantically similar objects. (b) Overview of CopyNCE. Patch $\mathcal{R}_i^q$ in the query is edited from patches in $\mathcal{R}^r$, whose area proportions in $\mathcal{R}_i^q$ are 12%, 20%, 20% and 48%. (c) Matcher architecture. The matcher consists of encoder and fusion modules. The fusion module takes the concatenated tokens from the encoders as input. The encoders in the matcher share weights.
  • Figure 4: Visualization of token affinity and affinity entropy. The left part displays the reference and query images. The green box marks the probe patch used to draw the token affinity heatmaps. The remaining part shows the token affinity heatmaps and affinity entropy of different models.
  • Figure 5: mAP vs. $\mu$AP. In this case, mAP is 100% because every positive sample is ranked first for its own query, yet no single threshold can be set to achieve that perfect performance. In $\mu$AP, the results of all queries are concatenated to calculate the precision at each threshold, so $\mu$AP better reflects the real performance of the model. Mathematically, mAP also serves as an upper bound of $\mu$AP.
  • ...and 7 more figures
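
The gap between mAP and $\mu$AP described in the Figure 5 caption can be reproduced with a small worked example. The sketch below is illustrative only: the scores are made up, every query is assumed to have exactly one positive, and `average_precision` is a hypothetical helper that averages precision over the ranks of the positives. It is not the official DISC21 evaluation code.

```python
def average_precision(scored):
    """AP over a list of (score, is_positive) pairs.

    Results are ranked by descending score; AP is the mean of the
    precision values measured at the rank of each positive.
    """
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    n_pos = sum(1 for _, is_pos in ranked if is_pos)
    hits, total = 0, 0.0
    for rank, (_, is_pos) in enumerate(ranked, start=1):
        if is_pos:
            hits += 1
            total += hits / rank
    return total / n_pos if n_pos else 0.0

# Two queries (made-up scores). Each ranks its own positive first,
# but the two queries operate on different score scales.
query_1 = [(0.9, True), (0.5, False)]
query_2 = [(0.4, True), (0.3, False)]

# mAP: compute AP per query, then average across queries.
map_score = (average_precision(query_1) + average_precision(query_2)) / 2

# muAP: pool the results of all queries, then compute one AP.
micro_ap = average_precision(query_1 + query_2)

print(map_score)  # 1.0
print(micro_ap)   # (1/1 + 2/3) / 2 = 0.8333...
```

Any single threshold low enough to accept the second query's positive (0.4) also accepts the first query's negative (0.5), so the pooled ranking is penalized even though each per-query ranking is perfect.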

Theorems & Definitions (1)

  • Definition 1