SERUM: Simple, Efficient, Robust, and Unifying Marking for Diffusion-based Image Generation

Jan Kociszewski, Hubert Jastrzębski, Tymoteusz Stępkowski, Filip Manijak, Krzysztof Rojek, Franziska Boenisch, Adam Dziedzic

Abstract

We propose SERUM: an intriguingly simple yet highly effective method for marking images generated by diffusion models (DMs). We only add a unique watermark noise to the initial diffusion generation noise and train a lightweight detector to identify watermarked images, simplifying and unifying the strengths of prior approaches. SERUM provides robustness against any image augmentations or watermark removal attacks and is extremely efficient, all while maintaining negligible impact on image quality. In contrast to prior approaches, which are often only resilient to limited perturbations and incur significant training, injection, and detection costs, our SERUM achieves remarkable performance, with the highest true positive rate (TPR) at a 1% false positive rate (FPR) in most scenarios, along with fast injection and detection and low detector training overhead. Its decoupled architecture also seamlessly supports multiple users by embedding individualized watermarks with little interference between the marks. Overall, our method provides a practical solution to mark outputs from DMs and to reliably distinguish generated from natural images.
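The injection step described in the abstract, adding a fixed watermark noise to the initial diffusion noise, can be illustrated with a small sketch. The additive rule `z = eps + alpha * w`, the function names, the flattened latent size, and the seeds are all assumptions made for illustration; the abstract does not state the exact mixing formula. The correlation check at the end is only a sanity check that the mark is present in the initial noise. It is not the trained detector the paper describes.

```python
import math
import random


def gaussian_noise(n, seed):
    """Return n i.i.d. standard normal samples from a seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def inject_watermark(initial_noise, watermark_noise, alpha=0.5):
    """Hypothetical injection rule (an assumption): z = eps + alpha * w."""
    if len(initial_noise) != len(watermark_noise):
        raise ValueError("noise and watermark must have the same length")
    return [e + alpha * w for e, w in zip(initial_noise, watermark_noise)]


def correlation(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


LATENT_SIZE = 4 * 64 * 64  # flattened latent size, assumed for illustration

watermark = gaussian_noise(LATENT_SIZE, seed=1234)  # fixed key, e.g. one per user
eps = gaussian_noise(LATENT_SIZE, seed=0)           # ordinary initial noise
marked = inject_watermark(eps, watermark, alpha=0.5)

clean_corr = correlation(eps, watermark)      # expected to be near zero
marked_corr = correlation(marked, watermark)  # expected to be clearly positive
print(f"clean: {clean_corr:.3f}, marked: {marked_corr:.3f}")
```

In an actual pipeline the marked noise would be handed to the diffusion sampler in place of the usual initial noise, and detection would run on the encoded latent of the generated image rather than on the initial noise itself.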

Paper Structure (42 sections, 21 equations, 8 figures, 13 tables, 2 algorithms)

Figures (8)

  • Figure 1: Overview of SERUM. First, we train a watermark detector to distinguish watermarked latents (with and without augmentations) from clean latents. Second, to inject the SERUM watermark, the watermark noise is added to the initial random Gaussian noise for diffusion generation, and the result is passed through the LDM (Latent Diffusion Model) and its decoder to produce a watermarked image. Finally, to detect the watermark, the image is encoded using the LDM’s encoder and passed through the detector, which outputs a high score for watermarked images and a low score for clean images.
  • Figure 2: Qualitative analysis of generation quality. We present outputs from the SD 2.1 model without (Clean) and with our SERUM watermark (denoted as Ours, with the parameter $\alpha=0.5$). The most important image qualities, such as style and content, are preserved, while shape or perspective is slightly modified. More examples can be found in the Appendix.
  • Figure 3: Performance of SERUM in a multi-user setting.
  • Figure 4: Evaluation of radioactivity.
  • Figure 5: Watermark detector architecture.
  • ...and 3 more figures
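
The detector in Figure 1 outputs a score per image, and the abstract reports the true positive rate (TPR) at a 1% false positive rate (FPR). A minimal sketch of how such a metric could be computed from raw scores is given below. The function name `tpr_at_fpr`, the toy integer scores, and the strict "score above threshold" decision rule are illustrative assumptions, not the authors' evaluation code.

```python
def tpr_at_fpr(clean_scores, watermarked_scores, target_fpr=0.01):
    """Return (threshold, TPR) where the threshold keeps the FPR on clean
    scores at or below target_fpr. An image is flagged when score > threshold."""
    if not clean_scores or not watermarked_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(clean_scores, reverse=True)
    # Number of clean samples allowed to score above the threshold.
    allowed = int(target_fpr * len(ranked))
    # Using the (allowed + 1)-th largest clean score as the threshold means
    # at most `allowed` clean samples lie strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    hits = sum(1 for s in watermarked_scores if s > threshold)
    return threshold, hits / len(watermarked_scores)


# Toy scores (made up for illustration): 100 clean and 20 watermarked images.
clean = list(range(100))            # scores 0..99
watermarked = list(range(90, 110))  # scores 90..109

threshold, tpr = tpr_at_fpr(clean, watermarked, target_fpr=0.01)
fpr = sum(1 for s in clean if s > threshold) / len(clean)
print(f"threshold={threshold}, FPR={fpr:.2f}, TPR={tpr:.2f}")
```

Fixing the threshold on clean images only, and then measuring how many watermarked images exceed it, keeps the false alarm budget explicit, which is why results at a fixed low FPR are easier to compare across methods than raw accuracy.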