WMCopier: Forging Invisible Image Watermarks on Arbitrary Images

Ziping Dong, Chao Shuai, Zhongjie Ba, Peng Cheng, Zhan Qin, Qinglong Wang, Kui Ren

TL;DR

WMCopier exposes a vulnerability in invisible watermarking for AI-generated content by introducing a diffusion-model-based no-box forgery attack. It learns the watermark distribution from watermarked data, uses shallow inversion to fuse watermark signals into clean images, and applies score-based refinement to enhance fidelity to the watermark manifold while preserving content. Empirical results show strong forgery performance across open-source schemes and a deployed Amazon system, with PSNR around 30 dB and high bit-accuracy/FPR. The work also proposes a multi-message defense to raise the bar for future watermark designs, highlighting the need for robust defenses in real-world Gen-AI deployments.

Abstract

Invisible image watermarking is crucial for ensuring content provenance and accountability in generative AI. While Gen-AI providers are increasingly integrating invisible watermarking systems, the robustness of these schemes against forgery attacks remains poorly characterized. This matters because forging a traceable watermark onto illicit content leads to false attribution, potentially harming the reputation and legal standing of Gen-AI service providers who are not responsible for the content. In this work, we propose WMCopier, an effective watermark forgery attack that requires neither prior knowledge of nor access to the target watermarking algorithm. Our approach first models the target watermark distribution using an unconditional diffusion model, and then embeds the target watermark into a non-watermarked image via a shallow inversion process. We also incorporate an iterative optimization procedure that refines the reconstructed image to balance image fidelity against forgery efficiency. Experimental results demonstrate that WMCopier effectively deceives both open-source and closed-source watermark systems (e.g., Amazon's system), achieving a significantly higher success rate than existing methods. Additionally, we evaluate the robustness of forged samples and discuss potential defenses against our attack.
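The shallow inversion step described in the abstract can be illustrated with a minimal sketch. It assumes a DDIM-style deterministic update and a linear beta schedule, neither of which is confirmed by the text here, and `eps_model` is a hypothetical stand-in for the noise predictor of the diffusion model trained on watermarked images. The point is only to show the control flow: invert the clean image for `T_S < T` steps, then denoise back with the same model.

```python
import numpy as np

def make_alpha_bar(T=100, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal level per timestep; index 0 is the clean image.

    The linear beta schedule is an assumption made for illustration.
    """
    betas = np.linspace(beta_start, beta_end, T)
    return np.concatenate([[1.0], np.cumprod(1.0 - betas)])

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM-style move between two noise levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def shallow_forge(x_clean, eps_model, alpha_bar, T_S):
    """Invert for only T_S steps, then denoise back to step 0.

    eps_model(x, t) is a hypothetical noise predictor; in the attack it
    would be the model fitted to watermarked images.
    """
    x = x_clean.astype(np.float64)
    # Shallow inversion: step 0 -> T_S (stops well before full noise).
    for t in range(0, T_S):
        x = ddim_step(x, eps_model(x, t), alpha_bar[t], alpha_bar[t + 1])
    # Denoising: step T_S -> 0 under the same predictor.
    for t in range(T_S, 0, -1):
        x = ddim_step(x, eps_model(x, t), alpha_bar[t], alpha_bar[t - 1])
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((3, 8, 8))
    dummy_eps = lambda x, t: np.zeros_like(x)  # placeholder, not a trained model
    forged = shallow_forge(image, dummy_eps, make_alpha_bar(100), T_S=40)
    print(forged.shape)
```

With the placeholder predictor the round trip simply returns the input, which makes the sketch easy to sanity-check. Any watermark signal in a real run would come from the trained model's predictions during these steps.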

Paper Structure

This paper contains 36 sections, 13 equations, 13 figures, 12 tables, 1 algorithm.

Figures (13)

  • Figure 1: The pipeline of WMCopier, which consists of three stages. In the first stage, an unconditional diffusion model is trained to estimate the watermark distribution. In the second stage, the estimated watermark is injected into a non-watermarked image using shallow inversion and denoising. Finally, a refinement procedure is applied to mitigate artifacts and ensure conformity to the target watermark distribution $p_w(x)$.
  • Figure 2: Watermark detectability of four open-source watermarking schemes throughout the diffusion and denoising processes ($T = 1000$). As a reference, the bit accuracy of non-watermarked images remains around 0.5.
  • Figure 3: Forged samples generated using full-step inversion, shallow inversion, and shallow inversion with refinement. The first row shows results from full-step inversion ($T_S = T = 100$), where the semantic content of the original clean image is heavily disrupted. The second row corresponds to shallow inversion ($T_S = 40$, $T = 100$), which introduces only slight artifacts. The third row demonstrates shallow inversion with refinement, where these artifacts are further reduced.
  • Figure 4: Performance comparison of baseline and WMCopier on Amazon Watermark.
  • Figure 5: Effect of refinement iterations $L$ (left) and trade-off coefficient $\lambda$ (right) on PSNR and bit accuracy under our forgery attacks, with fixed $\eta = 10^{-4}$.
  • ...and 8 more figures
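
The two metrics named in the captions above, PSNR and bit accuracy, can be computed as in the following sketch. It uses the standard definitions and assumes images scaled to [0, 1] and watermark messages given as 0/1 arrays; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

def bit_accuracy(decoded_bits, target_bits):
    """Fraction of decoded watermark bits that match the target message."""
    decoded_bits = np.asarray(decoded_bits)
    target_bits = np.asarray(target_bits)
    return float(np.mean(decoded_bits == target_bits))

if __name__ == "__main__":
    clean = np.zeros((3, 4, 4))
    forged = np.full((3, 4, 4), 0.1)  # uniform error of 0.1 per pixel
    print(psnr(clean, forged))
    print(bit_accuracy([1, 0, 1, 1], [1, 1, 1, 0]))
```

A decoder run on a non-watermarked image is expected to match about half of the target bits by chance, which is why the Figure 2 caption gives 0.5 as the reference level.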