OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
Yuwei Chen, Zhenliang He, Jia Tang, Meina Kan, Shiguang Shan
TL;DR
This work tackles the inefficiency of extracting Gaussian Shading watermarks from diffusion-generated images by reframing watermark extraction as a single-step sign classification problem. It introduces One-step Inversion (OSI), a learnable extractor initialized from the diffusion backbone and fine-tuned on synthesized noise-image pairs with a sign-classification objective. Compared with multi-step diffusion inversion, OSI is roughly 20x faster, extracts watermarks more accurately, and doubles the payload capacity. The work also organizes diffusion watermarking methods into a taxonomy from a communication-system perspective and shows that OSI generalizes across diffusion backbones, schedulers, and cryptographic schemes. Empirically, OSI outperforms multi-step inversion on SD2.1, SDXL, and SD3.5 models, stays robust under distortions and advanced attacks, adapts to broader diffusion watermarking settings, and has favorable amortized compute at scale.
Abstract
Watermarking is an important mechanism for provenance and copyright protection of diffusion-generated images. Training-free methods, exemplified by Gaussian Shading, embed watermarks into the initial noise of diffusion models with negligible impact on the quality of generated images. However, extracting this type of watermark typically requires multi-step diffusion inversion to recover the initial noise precisely, which is computationally expensive and time-consuming. To address this issue, we propose One-step Inversion (OSI), a significantly faster and more accurate method for extracting Gaussian Shading-style watermarks. OSI reformulates watermark extraction as a learnable sign classification problem, which eliminates the need for precise regression of the initial noise. We initialize the OSI model from the diffusion backbone and fine-tune it on synthesized noise-image pairs with a sign classification objective. The resulting model extracts the watermark in a single step. OSI substantially outperforms the multi-step diffusion inversion method: it is 20x faster, achieves higher extraction accuracy, and doubles the watermark payload capacity. Extensive experiments across diverse schedulers, diffusion backbones, and cryptographic schemes consistently show improvements, demonstrating the generality of the OSI framework.
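To illustrate why sign classification can replace precise regression of the initial noise, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a simplified Gaussian Shading-style scheme in which each watermark bit sets only the sign of one noise element, and it omits the encryption and repetition coding that a real scheme would use. The names `embed_bits`, `extract_bits`, and `sign_bce_loss` are hypothetical, and the additive perturbation merely stands in for the error of a coarse noise estimate.

```python
import numpy as np

def embed_bits(bits, rng):
    """Assumed simplified embedding: each bit picks the sign of a half-normal sample.

    For uniformly random bits the result is still distributed as N(0, 1).
    """
    magnitude = np.abs(rng.standard_normal(bits.shape))
    return np.where(bits == 1, magnitude, -magnitude)

def extract_bits(noise_estimate):
    """Read each bit back from the sign of the (estimated) initial noise."""
    return (noise_estimate > 0).astype(np.int64)

def sign_bce_loss(logits, target_noise):
    """Binary cross-entropy between logits and the sign of the true noise.

    Uses the numerically stable form max(x, 0) - x * t + log(1 + exp(-|x|)).
    """
    targets = (target_noise > 0).astype(np.float64)
    per_element = (np.maximum(logits, 0.0) - logits * targets
                   + np.log1p(np.exp(-np.abs(logits))))
    return float(np.mean(per_element))

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=4096)
noise = embed_bits(bits, rng)

# Stand-in for an imprecise noise estimate: magnitudes are badly off,
# yet most signs (and therefore most bits) survive.
coarse_estimate = noise + 0.5 * rng.standard_normal(noise.shape)
accuracy = float(np.mean(extract_bits(coarse_estimate) == bits))
print(f"bit accuracy from coarse estimate: {accuracy:.3f}")
```

In this sketch the extractor only has to get the sign of each element right, so a training objective such as `sign_bce_loss` penalizes sign errors rather than magnitude errors, which is the intuition behind treating extraction as classification.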
