Attack-Resilient Image Watermarking Using Stable Diffusion

Lijun Zhang, Xiao Liu, Antoni Viros Martin, Cindy Xiong Bearfield, Yuriy Brun, Hui Guan

TL;DR

ZoDiac uses a pre-trained stable diffusion model to inject a watermark into the trainable latent space, producing watermarks that can be reliably detected in the latent vector even after the image has been attacked.

Abstract

Watermarking images is critical for tracking image provenance and proving ownership. With the advent of generative models, such as stable diffusion, that can create fake but realistic images, watermarking has become particularly important to make human-created images reliably identifiable. Unfortunately, the very same stable diffusion technology can remove watermarks injected using existing methods. To address this problem, we present ZoDiac, which uses a pre-trained stable diffusion model to inject a watermark into the trainable latent space, resulting in watermarks that can be reliably detected in the latent vector even when attacked. We evaluate ZoDiac on three benchmarks, MS-COCO, DiffusionDB, and WikiArt, and find that ZoDiac is robust against state-of-the-art watermark attacks, with a watermark detection rate above 98% and a false positive rate below 6.4%, outperforming state-of-the-art watermarking methods. We hypothesize that the reciprocating denoising process in diffusion models may inherently enhance the robustness of the watermark when faced with strong attacks and validate the hypothesis. Our research demonstrates that stable diffusion is a promising approach to robust watermarking, able to withstand even stable-diffusion-based attack methods. ZoDiac is open-sourced and available at https://github.com/zhanglijun95/ZoDiac.

Paper Structure

This paper contains 31 sections, 9 equations, 13 figures, 20 tables, 1 algorithm.

Figures (13)

  • Figure 1: The watermark detection rate of existing methods and ZoDiac before and after the diffusion-based attack [Zhao23]. Two example images show that ZoDiac's watermarks are perceptually invisible.
  • Figure 2: Overview of ZoDiac with watermark embedding and detection phases. There are three major steps in the embedding phase: 1) latent vector initialization, 2) watermark encoding, and 3) adaptive image enhancement. In the detection phase, the watermark is decoded by performing DDIM inversion, Fourier transformation, and statistical testing.
  • Figure 2: The effects of varying the number of denoising steps on image quality (PSNR) and watermark detection rate (WDR). A denoising step count of 0 means using only the image autoencoder of the diffusion model, without the diffusion process.
  • Figure 3: The trade-off between watermarked image quality (SSIM) and watermark detection rate (WDR) on the MS-COCO dataset. Image quality is controlled by the SSIM threshold $s^*\in[0.8,0.98]$, varied in increments of $0.03$, and robustness is evaluated post-attack with four advanced attack methods.
  • Figure 4: The PSNR of attacked watermarked images relative to their unattacked versions, and the WDR of ZoDiac and StegaStamp, under the Brightness attack (left) and the Contrast attack (right) at strengths from 0.2 to 1.0 (where 1.0 means no attack).
  • ...and 8 more figures
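
The figure captions above report three quantities: PSNR, the watermark detection rate (WDR), and a sweep over the SSIM threshold $s^*$. As a rough illustration of how such numbers are commonly computed, here is a minimal Python sketch. It is not taken from the ZoDiac repository: the function names `psnr`, `detection_rate`, and `ssim_thresholds` are hypothetical, images are assumed to be flat sequences of 8-bit pixel values, and detection outcomes are assumed to be available as one boolean per image.

```python
import math


def psnr(original, modified, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are given as flat sequences with values in [0, max_value].
    """
    if len(original) != len(modified):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def detection_rate(detected_flags):
    """Fraction of images for which the detector reported a watermark.

    Applied to watermarked images this gives a detection rate (WDR);
    applied to non-watermarked images it gives a false positive rate.
    """
    flags = list(detected_flags)
    if not flags:
        raise ValueError("need at least one detection outcome")
    return sum(1 for f in flags if f) / len(flags)


def ssim_thresholds(start=0.80, stop=0.98, step=0.03):
    """SSIM thresholds from start to stop inclusive, as in the Figure 3 sweep."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + step * i, 2) for i in range(count)]
```

For example, `ssim_thresholds()` yields the seven threshold values between 0.8 and 0.98 implied by the Figure 3 caption, and `detection_rate` can be applied once to the detector's outcomes on watermarked images and once to its outcomes on clean images to obtain the two rates quoted in the abstract.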