WMAdapter: Adding WaterMark Control to Latent Diffusion Models

Hai Ci; Yiren Song; Pei Yang; Jinheng Xie; Mike Zheng Shou

TL;DR

WMAdapter presents a plug-and-play watermarking solution for latent diffusion models: a lightweight contextual adapter that imprints arbitrary watermark bits during generation without per-watermark finetuning. It is trained in two stages, the second being a novel hybrid finetuning strategy that suppresses tiny artifacts while preserving sharpness. Empirical results show competitive bit accuracy and near-perfect tracing across large user pools, with superior image quality (PSNR/SSIM) and competitive robustness compared to post-hoc and diffusion-native baselines. The approach enables scalable, high-fidelity watermarking with potential zero-shot transfer across different VAEs and diffusion variants, although some finetuning settings still leave artifacts that warrant further refinement.
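To make the two evaluation notions above concrete, here is a minimal sketch of how bit accuracy and user tracing are commonly computed for multi-bit watermarks. It is not taken from the paper: the helper names `bit_accuracy` and `trace_user`, the 48-bit key length, the pool of 1000 users, and the nearest-key matching rule are all illustrative assumptions.

```python
import random


def bit_accuracy(embedded, extracted):
    """Fraction of positions where the extracted bits match the embedded bits."""
    if len(embedded) != len(extracted):
        raise ValueError("bit strings must have equal length")
    matches = sum(1 for a, b in zip(embedded, extracted) if a == b)
    return matches / len(embedded)


def trace_user(extracted, user_keys):
    """Return the index of the user key that best matches the extracted bits.

    Assumption: tracing is done by nearest-key matching (highest bit accuracy).
    """
    return max(range(len(user_keys)),
               key=lambda i: bit_accuracy(user_keys[i], extracted))


# Hypothetical setup: 1000 users, each with a random 48-bit key.
rng = random.Random(0)
NUM_BITS = 48
user_keys = [[rng.randint(0, 1) for _ in range(NUM_BITS)] for _ in range(1000)]

# Simulate extraction from an image generated for user 123,
# with three bits flipped by some image distortion.
true_user = 123
extracted = list(user_keys[true_user])
for pos in rng.sample(range(NUM_BITS), 3):
    extracted[pos] ^= 1

print(bit_accuracy(user_keys[true_user], extracted))
print(trace_user(extracted, user_keys))
```

Under this matching rule, tracing succeeds whenever the true key remains the closest one in the pool, which is why a larger user pool makes the tracing task harder at a fixed bit accuracy.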

Abstract

Watermarking is crucial for protecting the copyright of AI-generated images. We propose WMAdapter, a diffusion model watermark plugin that takes user-specified watermark information and allows for seamless watermark imprinting during the diffusion generation process. WMAdapter is efficient and robust, with a strong emphasis on high generation quality. To achieve this, we make two key designs: (1) We develop a contextual adapter structure that is lightweight and enables effective knowledge transfer from heavily pretrained post-hoc watermarking models. (2) We introduce an extra finetuning step and design a hybrid finetuning strategy to further improve image quality and eliminate tiny artifacts. Empirical results demonstrate that WMAdapter offers strong flexibility, exceptional image generation quality and competitive watermark robustness.

Paper Structure (35 sections, 2 equations, 14 figures, 6 tables)

This paper contains 35 sections, 2 equations, 14 figures, 6 tables.

Introduction
Related work
Post-hoc watermarking
Diffusion-native watermarking
Method
Framework overview
Contextual adapters
Training
Hybrid finetuning
Discussion
Experiment
Experimental setup
Model and dataset
Training strategies
Evaluation metric
...and 20 more sections

Figures (14)

Figure 1: Framework overview. WMAdapter is plugged onto the VAE decoder. It takes user input watermark bits and image features from the VAE decoder, imprinting the watermark on-the-fly during VAE decoding. In contrast, traditional non-contextual adapters take only watermark conditions as input. WMAdapter can be trained with a post-hoc watermark decoder for efficient knowledge transfer. Image and icon credits: [sdonline, flaticon].
Figure 2: The architecture of WMAdapter. Left: The structure of WMAdapter. It comprises several independent Fusers with identical structures. Right: The structure of Fuser. It consists of a watermark Embedding module and a Fusing module.
Figure 3: Illustration of 3 different finetuning strategies. They differ in how they treat the VAE decoder.
Figure 3: Accuracy of tracing different numbers of keys. All methods are evaluated on the COCO dataset [lin2014microsoft]. For WADIFF$^{\ast}$ [min2024watermark], we use the number reported in its original paper.
Figure 4: WMAdapter-F against auto-encoder watermark removal [ballemshj18, cheng2020image].
...and 9 more figures
