When Denoising Becomes Unsigning: Theoretical and Empirical Analysis of Watermark Fragility Under Diffusion-Based Image Editing

Fai Gu; Qiyu Tang; Te Wen; Emily Davis; Finn Carter

TL;DR

The paper develops a unified view of diffusion editors as procedures that inject substantial Gaussian noise in a latent space and then project back to the natural image manifold via learned denoising dynamics. Under this view, it proves that for broad classes of pixel-level watermark encoders/decoders, the mutual information between the watermark payload and the edited output decays toward zero as the editing strength increases, yielding decoding error close to random guessing.

Abstract

Robust invisible watermarking systems aim to embed imperceptible payloads that remain decodable after common post-processing such as JPEG compression, cropping, and additive noise. In parallel, diffusion-based image editing has rapidly matured into a default transformation layer for modern content pipelines, enabling instruction-based editing, object insertion and composition, and interactive geometric manipulation. This paper studies a subtle but increasingly consequential interaction between these trends: diffusion-based editing procedures may unintentionally compromise, and in extreme cases practically bypass, robust watermarking mechanisms that were explicitly engineered to survive conventional distortions. We develop a unified view of diffusion editors that (i) inject substantial Gaussian noise in a latent space and (ii) project back to the natural image manifold via learned denoising dynamics. Under this view, watermark payloads behave as low-energy, high-frequency signals that are systematically attenuated by the forward diffusion step and then treated as nuisance variation by the reverse generative process. We formalize this degradation using information-theoretic tools, proving that for broad classes of pixel-level watermark encoders/decoders the mutual information between the watermark payload and the edited output decays toward zero as the editing strength increases, yielding decoding error close to random guessing. We complement the theory with a realistic hypothetical experimental protocol and tables spanning representative watermarking methods and representative diffusion editors. Finally, we discuss ethical implications, responsible disclosure norms, and concrete design guidelines for watermarking schemes that remain meaningful in the era of generative transformations.

Paper Structure (68 sections, 4 theorems, 19 equations, 10 tables, 1 algorithm)


Introduction
Contributions.
Related Work
Diffusion-based image editing
Robust invisible watermarking and deep learning-based schemes
Watermark robustness against generative transformations
Benchmarks for diffusion editing and watermark robustness
Concept erasure in diffusion models.
Methodology
Problem setting
Diffusion editing as a two-stage stochastic channel
A signal model for watermark attenuation
Pipeline-specific considerations: TF-ICON, SHINE, DragFlow
TF-ICON.
SHINE.
...and 53 more sections

Key Result

Lemma 1

Let $Z$ be generated from $X$ by the Gaussian step (eq:gaussian_step). For any two distributions $P$ and $Q$ over $X$, the KL divergence between the induced distributions over $Z$ satisfies

$$D_{\mathrm{KL}}(P_Z \,\|\, Q_Z) \le D_{\mathrm{KL}}(P \,\|\, Q),$$

where $P_Z$ and $Q_Z$ denote the pushforward distributions through the Gaussian channel.
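A small numerical check can make the contraction in Lemma 1 concrete. The sketch below is only an illustration under stated assumptions: it assumes the Gaussian step has the standard variance-preserving form $Z = \sqrt{\alpha}\,X + \sqrt{1-\alpha}\,\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, 1)$ (the actual eq:gaussian_step is not shown here), and it takes $P$ and $Q$ to be scalar Gaussians whose means differ by a small hypothetical "watermark" offset. The function names and all numeric values are made up for the example.

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """Closed-form KL( N(m1, v1) || N(m2, v2) ) for scalar Gaussians (v = variance)."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def push_through_channel(m, v, alpha):
    """Mean and variance of Z = sqrt(alpha) * X + sqrt(1 - alpha) * eps for X ~ N(m, v).

    Assumed channel form (variance-preserving step); eps ~ N(0, 1) independent of X.
    """
    return math.sqrt(alpha) * m, alpha * v + (1.0 - alpha)

# Hypothetical inputs: P carries a small mean shift (the "watermark"), Q does not.
p = (0.05, 0.25)  # (mean, variance)
q = (0.00, 0.25)

kl_before = kl_gauss(*p, *q)

# Smaller alpha corresponds to more injected noise, i.e. stronger editing.
results = []
for alpha in (0.9, 0.5, 0.1):
    pz = push_through_channel(*p, alpha)
    qz = push_through_channel(*q, alpha)
    results.append((alpha, kl_gauss(*pz, *qz)))

print(f"KL before channel: {kl_before:.6f}")
for alpha, kl_after in results:
    print(f"alpha={alpha:.1f}  KL after channel: {kl_after:.6f}")
```

In this toy setting the divergence after the channel never exceeds the divergence before it, and it shrinks further as `alpha` decreases, which matches the qualitative claim that stronger noising leaves less information to distinguish watermarked from unwatermarked inputs.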

Theorems & Definitions (8)

Lemma 1: KL contraction in an additive Gaussian channel
proof
Theorem 1: Information loss under diffusion editing
proof
Proposition 1: Exponential contraction across Gaussian steps
proof
Theorem 2: Fano-style lower bound on bit error
proof
