WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck

Haoyuan He, Yu Zheng, Jie Zhou, Jiwen Lu

TL;DR

This paper proposes WaterVIB, a theoretically grounded framework that reformulates the watermark encoder as an information sieve via the Variational Information Bottleneck, and proves that optimizing this bottleneck is a necessary condition for robustness against distribution-shifting attacks.

Abstract

Robust watermarking is critical for intellectual property protection, yet existing methods are severely vulnerable to regeneration-based AIGC attacks. We find that these methods fail because they entangle the watermark with high-frequency cover texture, which is easily rewritten during generative purification. To address this, we propose WaterVIB, a theoretically grounded framework that reformulates the encoder as an information sieve via the Variational Information Bottleneck. Instead of overfitting to fragile cover details, our approach forces the model to learn a Minimal Sufficient Statistic of the message. This filters out redundant cover nuances prone to generative shifts and retains only the essential signal invariant to regeneration. We prove that optimizing this bottleneck is a necessary condition for robustness against distribution-shifting attacks. Extensive experiments demonstrate that WaterVIB significantly outperforms state-of-the-art methods, achieving superior zero-shot resilience against unknown diffusion-based editing.

Paper Structure (50 sections, 44 equations, 7 figures, 13 tables, 1 algorithm)

Figures (7)

  • Figure 1: Vulnerability of standard watermarking versus the robustness of our VIB method. The residual visualizations (Top) reveal a strong correlation between the watermark signal and AIGC purification, highlighting the fragility of texture-entangled methods. We visualize the Bit Accuracy (higher is better) across six AIGC editing benchmarks. To clearly demonstrate the relative improvements on tasks with varying difficulty levels, each axis is independently normalized to its effective range.
  • Figure 2: The WaterVIB Architecture. We propose a Stochastic Information Sieve mechanism (Part 2) to defend against generative purification (Part 1). By injecting noise via a learnable bottleneck layer, WaterVIB penalizes the retention of cover-specific details ($I(Z;X)$) via the Information Bottleneck principle. This explicitly disentangles the watermark signal from the cover texture, yielding a stochastic representation $\mathbf{U}$ that is invariant to the semantic projections performed by diffusion models.
  • Figure 3: Feature Space Visualization (t-SNE). We visualize the latent embeddings of 10 random messages (different colors), each embedded into 20 cover images. (a) In the Baseline (EditGuard), attacked samples (triangles) undergo significant feature drift, collapsing toward a shared manifold region regardless of their original message identity, which leads to high bit error rates. (b) With our WaterVIB, the drift paths (lines) are significantly reduced, and attacked samples remain anchored within the high-density clusters of their respective clean counterparts.
  • Figure 4: Generalization Gap Analysis. We evaluate the training dynamics by plotting the ratio of validation loss to training loss ($\mathcal{L}_{\text{val}} / \mathcal{L}_{\text{train}}$) across epochs. A ratio significantly greater than $1$ indicates overfitting. The plots correspond to the Encoder (top), Decoder (middle), and Total Loss (bottom).
  • Figure 5: Impact of $\beta$ on Robustness. We plot the BER under AIGC purification as a function of $\beta$. The Baseline corresponds to $\beta=0$. The curve reveals a distinct "sweet spot" at $\beta=0.00015$.
  • ...and 2 more figures
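
The captions for Figures 2 and 5 describe a noisy bottleneck whose compression term is weighted by $\beta$. The sketch below shows how a generic Variational Information Bottleneck loss of that kind is commonly assembled; it is not the authors' implementation. It assumes a diagonal Gaussian posterior and a standard normal prior, for which the KL divergence is a standard variational upper bound on $I(Z;X)$, and it assumes a binary cross-entropy message loss. All function names are hypothetical. The default `beta` is the "sweet spot" value quoted in the Figure 5 caption.

```python
import math
import random


def sample_bottleneck(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1).

    The injected noise is what makes the representation stochastic.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions.

    Assumption: the prior is a standard normal. This KL term is the usual
    variational upper bound on the compression term I(Z; X).
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def bit_cross_entropy(logits, bits):
    """Mean binary cross-entropy between decoder logits and message bits."""
    total = 0.0
    for logit, bit in zip(logits, bits):
        # Numerically stable softplus(-logit) = log(1 + exp(-logit)).
        softplus_neg = max(-logit, 0.0) + math.log1p(math.exp(-abs(logit)))
        # bit == 1: loss is softplus(-logit); bit == 0: loss is softplus(logit).
        total += softplus_neg + (1 - bit) * logit
    return total / len(bits)


def vib_loss(logits, bits, mu, log_var, beta=1.5e-4):
    """Message loss plus beta-weighted compression penalty (hypothetical form)."""
    return bit_cross_entropy(logits, bits) + beta * kl_to_standard_normal(mu, log_var)


# Toy usage with made-up numbers: a 4-dimensional bottleneck and a 4-bit message.
rng = random.Random(0)
mu = [0.3, -0.1, 0.8, 0.0]
log_var = [-1.0, -1.0, -0.5, 0.0]
z = sample_bottleneck(mu, log_var, rng)      # stochastic code passed downstream
decoder_logits = [2.0, -1.5, 0.7, -0.2]      # stand-in for a decoder's output
message_bits = [1, 0, 1, 0]
loss = vib_loss(decoder_logits, message_bits, mu, log_var)
```

Setting `beta=0` removes the compression penalty and leaves only the message loss, which matches the caption's statement that the Baseline corresponds to $\beta=0$.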