Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending
Yongyang Pan, Xiaohong Liu, Siqi Luo, Yi Xin, Xiao Guo, Xiaoming Liu, Xiongkuo Min, Guangtao Zhai
TL;DR
TEAWIB tackles unauthorized use of latent diffusion models by embedding user-specific watermarks directly into the decoder through a watermark-informed blending approach that requires no retraining and preserves high image fidelity. It introduces Dynamic Watermark Blending (DWB) and Image Quality Preservation (IQP) to achieve robust, invisible watermarks, supported by a watermark extraction loss and a perceptual loss that maintains perceptual similarity. Experiments on MS-COCO show state-of-the-art image quality (PSNR ≈ 39.2 dB, SSIM ≈ 0.985, low LPIPS) and near-perfect watermark detectability (≈99% at an extremely low false positive rate, FPR), even under post-processing and in large-scale identification scenarios. The framework is also resilient to deliberate watermark removal attempts and supports scalable attribution for large user populations via a Ready-to-Use configuration. It is currently limited to text-to-image generation, with extensions to other modalities planned.
Abstract
Rapid advancements in multimodal large language models have enabled the creation of hyper-realistic images from textual descriptions. However, these advancements also raise significant concerns about unauthorized use, which hinders the broader distribution of such models. Traditional watermarking methods often require complex integration or degrade image quality. To address these challenges, we introduce a novel framework, Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB). TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models. With this approach, each user can directly apply a pre-configured set of parameters to the model without altering the original model parameters or compromising image quality. Additionally, noise and augmentation operations are embedded at the pixel level to further secure and stabilize watermarked images. Extensive experiments validate the effectiveness of TEAWIB, demonstrating state-of-the-art performance in perceptual quality and attribution accuracy.
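To make "attribution accuracy" concrete, the following is a minimal sketch of how user attribution from an extracted watermark is commonly evaluated. It is not taken from TEAWIB: it assumes the watermark is a k-bit string, that attribution picks the registered key with the most matching bits, and that a match is accepted only above a threshold `tau`. The function names, the key table, and the threshold rule are all hypothetical.

```python
from math import comb


def bit_matches(a, b):
    """Count positions where two equal-length bit sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def single_key_fpr(k, tau):
    """False positive rate against one key, assuming (hypothetically) that bits
    extracted from a non-watermarked image behave like k fair coin flips:
    P(Binomial(k, 0.5) >= tau)."""
    return sum(comb(k, i) for i in range(tau, k + 1)) / 2 ** k


def population_fpr(k, tau, num_users):
    """Union bound on the chance that at least one of num_users keys is
    matched by chance; capped at 1."""
    return min(1.0, num_users * single_key_fpr(k, tau))


def attribute(extracted, user_keys, tau):
    """Return the user whose key best matches the extracted bits,
    or None if even the best match falls below the threshold."""
    best_user, best_score = max(
        ((user, bit_matches(extracted, key)) for user, key in user_keys.items()),
        key=lambda pair: pair[1],
    )
    return best_user if best_score >= tau else None


if __name__ == "__main__":
    # Toy 8-bit keys for two hypothetical users.
    keys = {"alice": [1, 0, 1, 1, 0, 0, 1, 0], "bob": [0, 1, 0, 0, 1, 1, 0, 1]}
    noisy = [1, 0, 1, 1, 0, 0, 1, 1]  # alice's key with the last bit flipped
    print(attribute(noisy, keys, tau=7))
    print(single_key_fpr(8, 7), population_fpr(8, 7, len(keys)))
```

Under these assumptions, the union bound shows why large user populations need longer keys or stricter thresholds: the chance of a spurious match grows roughly linearly with the number of registered users.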
