MarkPlugger: Generalizable Watermark Framework for Latent Diffusion Models without Retraining

Guokai Zhang, Lanjun Wang, Yuting Su, An-An Liu

TL;DR

This work proposes MarkPlugger, a generalizable plug-and-play watermark framework that requires no LDM retraining. It balances image quality against watermark recovery rate and generalizes to multiple official versions and modified variants of LDMs, even without retraining the watermark model.

Abstract

Today, the family of latent diffusion models (LDMs) has gained prominence for its high-quality outputs and scalability. This has also raised security concerns on social media, as malicious users can create and disseminate harmful content. Existing approaches typically train specific components or entire generative models to embed a watermark in generated images for traceability and accountability. However, in the fast-evolving era of AI-generated content (AIGC), the rapid iteration and modification of LDMs make retraining with watermark models costly. To address this problem, we propose MarkPlugger, a generalizable plug-and-play watermark framework that requires no LDM retraining. In particular, to reduce the watermark's disturbance of the semantics of the generated image, we seek a watermark representation that is approximately orthogonal to the semantics in latent space, and we apply an additive fusion strategy to combine the watermark with the semantics. Without modifying any component of the LDMs, we embed diverse watermarks in latent space, adapting to the denoising process. Our experiments show that our method effectively balances image quality and watermark recovery rate. We have also validated that our method generalizes to multiple official versions and modified variants of LDMs, even without retraining the watermark model. Furthermore, it remains robust under various attacks of different intensities.
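
The idea of fusing a near-orthogonal watermark additively into the latent can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract describes identifying a suitable watermark representation, whereas the sketch below simply swaps in a single Gram-Schmidt projection step to force orthogonality. The latent shape, the `strength` parameter, and the function names `orthogonalize` and `fuse` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def orthogonalize(watermark, semantic):
    """Remove from `watermark` its component along `semantic`.

    Assumption: one Gram-Schmidt step stands in for the watermark
    representation that the paper obtains by other means.
    """
    s = semantic.ravel()
    w = watermark.ravel()
    coeff = np.dot(w, s) / np.dot(s, s)
    return (w - coeff * s).reshape(watermark.shape)

def fuse(semantic, watermark, strength=0.1):
    """Additive fusion: semantic latent plus a scaled, orthogonalized watermark."""
    return semantic + strength * orthogonalize(watermark, semantic)

rng = np.random.default_rng(0)
z = rng.standard_normal((4, 64, 64))  # hypothetical semantic latent
w = rng.standard_normal((4, 64, 64))  # hypothetical watermark representation

w_perp = orthogonalize(w, z)
z_marked = fuse(z, w, strength=0.1)

# Cosine similarity between the orthogonalized watermark and the semantic latent.
cosine = np.dot(w_perp.ravel(), z.ravel()) / (
    np.linalg.norm(w_perp) * np.linalg.norm(z)
)
print(f"cosine(watermark, semantic) = {cosine:.2e}")
```

Because the added component is orthogonal to the semantic latent, it leaves the latent's projection onto its own direction unchanged, which is one intuition for why such a watermark might disturb the generated semantics less than an arbitrary perturbation of the same magnitude.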

Paper Structure (40 sections, 10 equations, 11 figures, 3 tables)

Figures (11)

  • Figure 1: Comparison of our proposed framework with existing methods. One type of method, as in (a), co-trains the watermark model with the LDM and requires retraining whenever the model is updated. Another type, as in (b), pre-embeds the watermark into the training data. The one-to-one correspondence between the watermark and the LDM makes watermark embedding inflexible. We therefore propose a generalizable plug-and-play watermark that does not require retraining LDMs, as depicted in (c).
  • Figure 2: Overview of our proposed MarkPlugger framework for LDMs.
  • Figure 3: Robustness under various attacks with different perturbation strengths on COCO. A darker hue indicates a more potent attack.
  • Figure 4: Robustness under various attacks with different perturbation strengths on Flickr-8K, with the same settings as Fig. 3.
  • Figure 5: Parameter analysis on COCO.
  • ...and 6 more figures