Rel-Zero: Harnessing Patch-Pair Invariance for Robust Zero-Watermarking Against AI Editing

Pengzhen Chen; Yanwei Liu; Xiaoyan Gu; Xiaojun Chen; Wu Liu; Weiping Wang

Abstract

Recent advancements in diffusion-based image editing pose a significant threat to the authenticity of digital visual content. Traditional embedding-based watermarking methods often introduce perceptible perturbations to maintain robustness, inevitably compromising visual fidelity. Meanwhile, existing zero-watermarking approaches, typically relying on global image features, struggle to withstand sophisticated manipulations. In this work, we uncover a key observation: while individual image patches undergo substantial alterations during AI-based editing, the relational distance between patch pairs remains relatively invariant. Leveraging this property, we propose Relational Zero-Watermarking (Rel-Zero), a novel framework that requires no modification to the original image but derives a unique zero-watermark from these editing-invariant patch relations. By grounding the watermark in intrinsic structural consistency rather than absolute appearance, Rel-Zero provides a non-invasive yet resilient mechanism for content authentication. Extensive experiments demonstrate that Rel-Zero achieves substantially improved robustness across diverse editing models and manipulations compared to prior zero-watermarking approaches.
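The key observation can be illustrated with a deliberately idealized toy example. The sketch below is not the paper's pipeline: it assumes each patch is summarized by a mean-RGB vector and models the "edit" as a uniform color shift (both are hypothetical stand-ins). Under that assumption every patch changes, yet every patch-pair distance is preserved exactly. Real AI edits would preserve these distances only approximately.

```python
import math
from itertools import combinations


def pairwise_distances(patches):
    """Euclidean distance for every unordered patch pair (i, j) with i < j."""
    return {
        (i, j): math.dist(patches[i], patches[j])
        for i, j in combinations(range(len(patches)), 2)
    }


# Toy patch features: one mean-RGB vector per patch (illustrative values).
original = [(0.10, 0.20, 0.30), (0.50, 0.40, 0.35), (0.90, 0.80, 0.70)]

# Hypothetical edit: the same color shift applied to every patch.
shift = (0.05, -0.03, 0.02)
edited = [tuple(c + s for c, s in zip(p, shift)) for p in original]

# How far each individual patch moved (absolute appearance changes).
patch_change = [math.dist(p, q) for p, q in zip(original, edited)]

# How much each patch-pair distance moved (relational structure).
before = pairwise_distances(original)
after = pairwise_distances(edited)
max_pair_change = max(abs(after[k] - before[k]) for k in before)

print("per-patch change:", patch_change)
print("largest change in any pair distance:", max_pair_change)
```

The contrast between `patch_change` and `max_pair_change` is the point: a signature built from pair distances can survive a transformation that alters every patch.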

Paper Structure (31 sections, 18 equations, 10 figures, 4 tables)

Introduction
Related Work
Conventional Embedding Watermarking
Zero-Watermarking
Pairwise Patch Distance Preservation Under Editing
Experimental Discovery
Theoretical Justification
Methodology
Stable Patch Pair Identification
Patch Relational Learning
Training Objective
Watermark Generation and Verification
Generation.
Verification.
Experiments
...and 16 more sections

Figures (10)

Figure 1: Analysis of Relational Stability. We reveal a key insight: Patch-pair distance tends to be preserved after AI editing. This invariant property can be extracted as a zero-watermark. Here we present the patch-pair distance of RGB vectors in original images (first row) and edited images (second row) across five main editing models. The last row illustrates the distance difference between pre-edit and post-edit with a color scale spanning [-0.2,0.2].
Figure 1: PCA visualization of ViT patch embeddings before and after generative editing or VAE reconstruction. Both transformations cause similar displacement patterns: only patches in drastically semantically edited regions move noticeably, while most patches remain stable. The similarity of these trajectories demonstrates that VAE reconstruction effectively mimics edit-induced feature changes.
Figure 2: Analysis of Relational Stability.
Figure 2: Correlation between patch-pair ViT feature distances before and after editing. The distances closely follow a fitted linear model $M_{\text{after}} \approx \alpha\, M_{\text{before}}$, confirming that ViT patch-pair relationships remain stable and support our relational watermark extraction.
Figure 3: Framework of the proposed Rel-Zero. (a) Stable Patch Pair Identification. To train a predictor capable of identifying patch pairs with invariant distance relationships, we first construct training targets. A pretrained VAE is employed to simulate generative edits. Features of the original and VAE-modified images are extracted using a ViT to obtain patch-wise features $\mathcal{F}$ and $\mathcal{\hat{F}}$. Pairwise distances are computed on both feature maps, and their differences are measured to identify the top-$K$ most stable pairs surviving edits, which serve as the ground-truth pairs $\mathcal{E}_g$. (b) Patch Relational Learning. Given an input image, ViT features are extracted and all patch pairs are densely formed to construct fully-connected pairs $\mathcal{E}$. A learnable pair predictor $\Phi$ then estimates a stability score for each pair and is trained to align with the ground-truth pairs $\mathcal{E}_g$ from (a). During inference (i.e., watermark generation and verification), only the module in (b) is required to generate relational zero-watermarks $\mathcal{E}_p$.
...and 5 more figures
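The captions for Figures 2 and 3(a) describe two computations that can be sketched in a few lines: selecting the top-$K$ patch pairs whose distance changes least under an edit, and fitting the slope $\alpha$ in $M_{\text{after}} \approx \alpha\, M_{\text{before}}$. The following is a minimal sketch under stated assumptions, not the authors' implementation. Plain 2-D tuples stand in for ViT patch features, a hand-made perturbation stands in for the VAE-simulated edit, the fit is an ordinary least-squares line through the origin (the captions do not say how $\alpha$ is fitted), and the names `stable_pairs` and `fit_alpha` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(feats):
    """Euclidean distance for every unordered pair (i, j) with i < j."""
    return {
        (i, j): math.dist(feats[i], feats[j])
        for i, j in combinations(range(len(feats)), 2)
    }


def stable_pairs(feats, feats_edited, k):
    """Return the k pairs whose distance changes least between the two sets."""
    d0 = pairwise_distances(feats)
    d1 = pairwise_distances(feats_edited)
    return sorted(d0, key=lambda pair: abs(d1[pair] - d0[pair]))[:k]


def fit_alpha(feats, feats_edited):
    """Least-squares slope of d_after = alpha * d_before (no intercept)."""
    d0 = pairwise_distances(feats)
    d1 = pairwise_distances(feats_edited)
    numerator = sum(d0[p] * d1[p] for p in d0)
    denominator = sum(d0[p] ** 2 for p in d0)
    return numerator / denominator


# Stand-in features for four patches (illustrative values only).
feats = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Stand-in "edit": only patch 3 is heavily altered, the rest are untouched.
edited = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]

selected = stable_pairs(feats, edited, k=3)
alpha_identity = fit_alpha(feats, feats)

print("stable pairs:", selected)
print("alpha for an unedited image:", alpha_identity)
```

In this toy setup the selected pairs are exactly those that avoid the altered patch, which mirrors the role of the ground-truth set $\mathcal{E}_g$ described in the Figure 3 caption. The learned predictor $\Phi$ is not modeled here.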
