Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection
Minzhou Pan, Zhenting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin
TL;DR
This paper addresses the challenge of detecting invisible image watermarks in a black-box setting, without access to decoding algorithms or labeled examples. It introduces WaterMark Detection (WMD), a self-supervised detector that uses distribution offsets between a detection dataset and a clean reference dataset, together with an asymmetric loss and iterative pruning, to separate watermarked from non-watermarked images. Across multiple datasets and watermarking methods, including post-processing and generative watermarks, WMD achieves high detection performance, with AUC exceeding 0.9 on most single-watermark datasets and remaining above 0.7 in more challenging multi-watermark scenarios. The work highlights the method's potential to improve accountability and trust in AI-generated content, acknowledges limitations such as sensitivity to distribution mismatch and the need for hyperparameter tuning, and outlines future directions such as domain adaptation.
Abstract
In this paper, we propose WaterMark Detection (WMD), the first invisible watermark detection method under a black-box and annotation-free setting. WMD is capable of detecting arbitrary watermarks within a given detection dataset using a clean non-watermarked dataset as a reference, without relying on specific decoding methods or prior knowledge of the watermarking techniques. We develop WMD using foundations of offset learning, where a clean non-watermarked dataset enables us to isolate the influence of only the watermarked samples in the detection dataset. Our comprehensive evaluations demonstrate the effectiveness of WMD, which significantly outperforms naive detection methods that yield AUC scores of only around 0.5. In contrast, WMD consistently achieves high detection AUC scores, surpassing 0.9 on most single-watermark datasets and exceeding 0.7 in more challenging multi-watermark scenarios across diverse datasets and watermarking methods. As invisible watermarks become increasingly prevalent while specific decoding techniques remain undisclosed, our approach provides a versatile solution and establishes a path toward increasing accountability, transparency, and trust in our digital visual content.
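
To make the reported numbers concrete, the following is a minimal sketch of how a detection AUC can be computed from per-image detector scores. It is not the authors' evaluation code: the function name `roc_auc` and the toy scores are hypothetical, and the only assumption is that a higher score means "more likely watermarked". The AUC is the probability that a randomly chosen watermarked image receives a higher score than a randomly chosen clean image, so a detector that cannot separate the two groups lands near 0.5 and a perfect one reaches 1.0.

```python
def roc_auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the ROC AUC.

    scores: detector outputs, higher means "more likely watermarked" (assumption).
    labels: 1 for watermarked images, 0 for clean images.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # watermarked image ranked above clean image
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores, for illustration only.
labels = [1, 1, 0, 0]
separating_scores = [0.9, 0.4, 0.6, 0.1]    # mostly ranks watermarked higher
uninformative_scores = [0.5, 0.5, 0.5, 0.5]  # cannot tell the groups apart

print(roc_auc(separating_scores, labels))
print(roc_auc(uninformative_scores, labels))
```

The pairwise loop is quadratic in the number of images, which is fine for illustration; a rank-based implementation would be preferable for large detection datasets.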
