JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits

Minzhou Pan, Yi Zeng, Xue Lin, Ning Yu, Cho-Jui Hsieh, Peter Henderson, Ruoxi Jia

TL;DR

This work tackles the vulnerability of image watermarks to diffusion-model edits by introducing JigMark, a black-box watermarking framework that learns robust watermarks through contrastive training on original and diffusion-perturbed image pairs without gradient propagation through the perturbations. A novel Jigsaw-based embedding enables per-holder keys and rapid deployment, while the HAV metric provides a human-aligned measure of diffusion strength and perturbation impact. JigMark consistently outperforms traditional and diffusion-integrated baselines in watermark detectability under diffusion edits and conventional perturbations, and maintains perceptual image quality. The approach offers practical resilience for IP protection in the era of accessible diffusion-based image editing, supported by extensive evaluations and an open-source codebase, albeit with significant initial training overhead and some limitations against extreme transformations.

Abstract

In this study, we investigate the vulnerability of image watermarks to diffusion-model-based image editing, a challenge exacerbated by the computational cost of accessing gradient information and the closed-source nature of many diffusion models. To address this issue, we introduce JIGMARK. This first-of-its-kind watermarking technique enhances robustness through contrastive learning with pairs of images, processed and unprocessed by diffusion models, without needing a direct backpropagation of the diffusion process. Our evaluation reveals that JIGMARK significantly surpasses existing watermarking solutions in resilience to diffusion-model edits, demonstrating a True Positive Rate more than triple that of leading baselines at a 1% False Positive Rate while preserving image quality. At the same time, it consistently improves the robustness against other conventional perturbations (like JPEG, blurring, etc.) and malicious watermark attacks over the state-of-the-art, often by a large margin. Furthermore, we propose the Human Aligned Variation (HAV) score, a new metric that surpasses traditional similarity measures in quantifying the number of image derivatives from image editing.
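The headline metric above, True Positive Rate at a 1% False Positive Rate, can be computed from raw detector scores. The following is a minimal sketch, not the authors' evaluation code: the function name `tpr_at_fpr`, the score arrays, and the convention that an image is flagged when its score is strictly above the threshold are all assumptions made for illustration.

```python
import numpy as np

def tpr_at_fpr(pos_scores, neg_scores, target_fpr=0.01):
    """Return (tpr, fpr, threshold) with the FPR on neg_scores kept <= target_fpr.

    pos_scores: detector scores for watermarked images (hypothetical input).
    neg_scores: detector scores for non-watermarked images (hypothetical input).
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # descending order
    # Number of negatives allowed to score above the threshold.
    k = int(np.floor(target_fpr * neg.size))
    # Use the (k+1)-th largest negative score as the threshold, so at most
    # k negatives are strictly above it.
    threshold = neg[k] if k < neg.size else -np.inf
    tpr = float(np.mean(pos > threshold))
    fpr = float(np.mean(neg > threshold))
    return tpr, fpr, threshold
```

Fixing the false positive rate first and then reading off the true positive rate makes detectors with different score scales comparable, which is presumably why the abstract reports the metric in this form.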

Paper Structure

This paper contains 37 sections, 5 equations, 17 figures, 21 tables, 1 algorithm.

Figures (17)

  • Figure 1: Left: image samples at different $\textsc{HAV}$ values. As the $\textsc{HAV}$ gets higher, the image becomes more dissimilar. Right: the trade-off between editing and detectability. As the Human Aligned Variation ($\textsc{HAV}$) increases, $\textsc{JigMark}$ maintains a higher detection AUC than baseline methods [4554423, zhu2018hidden, fernandez2023stable].
  • Figure 2: Decoder outputs "1" only when the watermarked image is in the correct order via the shuffling key. If the order is wrong, despite the presence of the watermark, the decoder outputs "0".
  • Figure 3: The training process of $\textsc{JigMark}$ consists of two phases. (A) Phase 1: sampling positive and negative examples. (B) Phase 2: leveraging the difference between the positive and negative samples to train the encoder and decoder via contrastive learning.
  • Figure 4: Analysis of artwork cases deemed plagiarized in court. (a) Andy Warhol Foundation for Visual Arts, Inc. v. Goldsmith [warhol_2023], (b) Sedlik v. Von Drachenberg [sedlik_von_drachenberg_2022], (c) Dr. Seuss Enterprises, L.P. v. ComicMix LLC [seuss_vs_comicmix].
  • Figure 5: Visual comparison of watermarking techniques: watermarked images on the left and magnified deficits (×10) on the right. $\textsc{JigMark}$ distinctively embeds low-frequency noise in less noticeable areas such as boundaries and textures.
  • ...and 12 more figures
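
The Figure 2 caption describes a decoder that responds only when the image tiles are in the order given by a shuffling key. The keyed shuffling step alone can be sketched as below. This is an illustrative assumption, not the paper's implementation: the 4×4 grid, the integer key used as a random seed, and the helper names `shuffle` and `unshuffle` are all hypothetical, and how the encoder and decoder consume the shuffled image is not specified in this summary.

```python
import numpy as np

def _key_permutation(key, n_tiles):
    # Hypothetical choice: derive the tile order from an integer key used as a seed.
    return np.random.default_rng(key).permutation(n_tiles)

def _to_tiles(image, grid):
    # Split an (H, W) or (H, W, C) array into grid*grid tiles in row-major order.
    # Assumes H and W are divisible by grid.
    h, w = image.shape[:2]
    th, tw = h // grid, w // grid
    return np.stack([image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
                     for r in range(grid) for c in range(grid)])

def _from_tiles(tiles, grid):
    # Reassemble row-major tiles into a full image.
    rows = [np.concatenate(list(tiles[r * grid:(r + 1) * grid]), axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)

def shuffle(image, key, grid=4):
    """Rearrange the tiles of `image` according to the key's permutation."""
    perm = _key_permutation(key, grid * grid)
    return _from_tiles(_to_tiles(image, grid)[perm], grid)

def unshuffle(image, key, grid=4):
    """Undo `shuffle`; restores the original only if the same key is used."""
    inv = np.argsort(_key_permutation(key, grid * grid))
    return _from_tiles(_to_tiles(image, grid)[inv], grid)
```

With the correct key, `unshuffle(shuffle(img, key), key)` returns the original tile order, while a different key leaves the tiles scrambled. That is the property the caption relies on when it says a wrongly ordered image yields "0" despite carrying the watermark.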