
Understanding Semantic Perturbations on In-Processing Generative Image Watermarks

Anirudh Nakra, Min Wu

Abstract

The widespread deployment of high-fidelity generative models has intensified the need for reliable mechanisms for provenance and content authentication. In-processing watermarking, which embeds a signature into the generative model's synthesis procedure, has been advocated as a solution and is often reported to be robust to standard post-processing (such as geometric transforms and filtering). Yet robustness to semantic manipulations that alter high-level scene content while maintaining reasonable visual quality is not well studied or understood. We introduce a simple, multi-stage framework for systematically stress-testing in-processing generative watermarks under semantic drift. The framework uses off-the-shelf models for object detection, mask generation, and semantically guided inpainting or regeneration to produce controlled, meaning-altering edits with minimal perceptual degradation. Based on extensive experiments on representative schemes, we find that robustness varies significantly with the degree of semantic entanglement: methods whose watermarks remain detectable under a broad suite of conventional perturbations can fail under semantic edits, with watermark detectability in many cases dropping to near zero while image quality remains high. Overall, our results reveal a critical gap in current watermarking evaluations and suggest that watermark designs and benchmarking must explicitly account for robustness against semantic manipulation.
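The abstract describes a multi-stage pipeline chaining off-the-shelf models for object detection, mask generation, and semantically guided inpainting. The paper does not give its implementation here, so the following is a minimal structural sketch under assumed interfaces: the stage names, signatures, and the `SemanticPerturbationPipeline` class are hypothetical illustrations, with toy stubs standing in for real detector, segmentation, and diffusion-inpainting models.

```python
# Hypothetical sketch of a detect -> mask -> inpaint semantic-perturbation
# pipeline, as described in the abstract. All names and interfaces here are
# assumptions for illustration, not the authors' actual code; in practice each
# stage would wrap an off-the-shelf model (an open-vocabulary detector, a
# segmentation model, and a diffusion inpainting model).

from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # object bounding box: (x0, y0, x1, y1)


@dataclass
class SemanticPerturbationPipeline:
    detect: Callable[[object], List[Box]]             # image -> object boxes
    segment: Callable[[object, Box], object]          # image, box -> binary mask
    inpaint: Callable[[object, object, str], object]  # image, mask, prompt -> edited image

    def perturb(self, image, prompt: str):
        """Apply a meaning-altering local edit at each detected object region."""
        for box in self.detect(image):
            mask = self.segment(image, box)
            image = self.inpaint(image, mask, prompt)
        return image


# Toy usage with stub stages (real models would replace these callables):
pipe = SemanticPerturbationPipeline(
    detect=lambda img: [(0, 0, 8, 8)],            # pretend one object was found
    segment=lambda img, box: "mask",              # pretend a mask was produced
    inpaint=lambda img, mask, prompt: img + "+edited",
)
print(pipe.perturb("img", "replace the dog with a cat"))  # prints: img+edited
```

The point of the sketch is the modularity the paper emphasizes: because each stage is an interchangeable callable, different detectors, maskers, or regeneration models can be swapped in to control the strength and locality of the semantic edit.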

Paper Structure

This paper contains 26 sections, 3 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Representative in-processing generative image watermarks such as Stable Signature (fernandez2023stable), Tree-Ring (wen2023tree), and Gaussian Shading (yang2024gaussian) aim to embed the watermarking signal during the LDM generation process. We study the resilience of these watermarks against a variety of semantic perturbations, aiming to quantify the level of semantic entanglement achieved by these methods.
  • Figure 2: Overview of evaluation: (a) Examples of different improved image-processing-based perturbations. (b) The modular pipeline used to generate global & local semantic perturbations.
  • Figure 3: An overview of semantic perturbations across representative in-processing watermarking methods.
  • Figure 4: Experimental results of watermark detectability vs. visual fidelity in SSIM under enhanced image-processing-based perturbations, showing general robustness under these manipulations. Detection metrics are bit accuracy for Stable Signature and Gaussian Shading, and the normalized negative logarithm of the $p$-value for Tree-Ring. Color indicates the perturbation strength, and markers indicate the watermarking schemes. Note that we normalize the $p$-values to show the trend of the Tree-Ring detection metric with perturbation strength. Extended results in tabular form are provided in the supplemental material.
  • Figure 5: Experimental results of watermark detectability against visual fidelity in SSIM ($y$-axis) and semantic drift ($x$-axis) under semantic perturbations. Color indicates detectability (TPR@0.1%). Results show watermark detectability can collapse across methods under varying levels of semantic drift, revealing a gap not captured by conventional robustness tests.
  • ...and 8 more figures