RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees

Xun Xian; Ganghua Wang; Xuan Bi; Jayanth Srinivasa; Ashish Kundu; Mingyi Hong; Jie Ding

TL;DR

RAW is a robust and agile plug-and-play watermark detection framework. It introduces learnable watermarks directly into the original image data and provides provable guarantees on the false positive rate for misclassifying a watermarked image, even in the presence of certain adversarial attacks targeting watermark removal.

Abstract

Safeguarding intellectual property and preventing potential misuse of AI-generated images are of paramount importance. This paper introduces a robust and agile plug-and-play watermark detection framework, dubbed RAW. In contrast to traditional encoder-decoder methods, which embed fixed binary codes as watermarks within latent representations, our approach introduces learnable watermarks directly into the original image data. A classifier, trained jointly with the watermark, then detects the presence of the watermark. The proposed framework is compatible with various generative architectures and supports on-the-fly watermark injection after training. By incorporating state-of-the-art smoothing techniques, we show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image, even in the presence of certain adversarial attacks targeting watermark removal. Experiments on a diverse range of images generated by state-of-the-art diffusion models show substantial performance improvements over existing approaches. For instance, when detecting watermarked images under adversarial attacks, our method raises AUROC from 0.48 (state-of-the-art approaches) to 0.82, while maintaining image quality, as indicated by closely aligned FID and CLIP scores.

Paper Structure (17 sections, 2 theorems, 11 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Contributions
Related Work
Preliminary
RAW
Training stage
Overall Training Algorithm
Further Discussions
Inference Stage
Overall Inference Algorithm
Experiments
Experimental setups
Clean detection performance and image generation quality
Robust detection performance
Watermark embedding speed
...and 2 more sections

Key Result

Lemma 1

Let $h: \mathbb{R}^{d} \rightarrow [0,1]$ be a continuous function. Let $\sigma>0$, and $H(x) = \underset{Z \sim \mathcal{N}\left(0, \sigma^{2} I\right)}{\mathbb{E}}[h(x+Z)]$. Then the function $x \mapsto \Phi^{-1}(H(x))$ is $\sigma^{-1}$-Lipschitz, where $\Phi$ denotes the standard Gaussian CDF.
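Lemma 1 can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: the base function `h` (a steep sigmoid), the midpoint-rule integration in `smoothed`, and the helper `max_slope` are all hypothetical choices made for this example and are not taken from the paper. It estimates the largest finite-difference slope of $\Phi^{-1}(H(x))$ on a grid, which the lemma bounds by $1/\sigma$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian; its CDF plays the role of Phi


def h(x: float) -> float:
    """Hypothetical continuous base function with values in [0, 1] (a steep sigmoid)."""
    return 0.5 * (1.0 + math.tanh(5.0 * x))


def smoothed(x: float, sigma: float, n: int = 4001) -> float:
    """Approximate H(x) = E[h(x + Z)], Z ~ N(0, sigma^2), with a midpoint rule
    on the interval [-8 sigma, 8 sigma]."""
    lo, hi = -8.0 * sigma, 8.0 * sigma
    step = (hi - lo) / n
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * step
        density = math.exp(-0.5 * (z / sigma) ** 2) / norm
        total += h(x + z) * density * step
    return total


def max_slope(sigma: float, xs: list) -> float:
    """Largest finite-difference slope of Phi^{-1}(H(x)) over consecutive grid points."""
    vals = [STD_NORMAL.inv_cdf(smoothed(x, sigma)) for x in xs]
    return max(
        abs(vals[i + 1] - vals[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


if __name__ == "__main__":
    sigma = 0.5
    grid = [-1.0 + 0.05 * i for i in range(41)]
    print("max slope:", max_slope(sigma, grid), "bound 1/sigma:", 1.0 / sigma)
```

A hard step function in place of `h` would make the bound tight, since then $\Phi^{-1}(H(x)) = x/\sigma$ exactly; the sigmoid is used here only because the lemma as stated assumes continuity.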

Figures (3)

Figure 1: Illustration of our proposed RAW (top row) and popular encoder-decoder based watermarking schemes (bottom row).
Figure 2: Effects of (a) jointly training watermarks and models and (b) using spatial watermarks on training loss and test accuracy.
Figure 3: Examples of RAW-watermarked images (bottom row).

Theorems & Definitions (7)

Remark 1: Watermarks can be generated by Alice and/or Bob.
Definition 1: Watermarking Module
Definition 2: Verification Module
Definition 3: Modification Module
Lemma 1: Salman et al. (2019)
Remark 2: $\mathcal{A}$ cannot be excessively adversarial
Theorem 1: Certified FPR of $g$ based on the threshold in the quantile-selection equation
