On the Difficulty of Constructing a Robust and Publicly-Detectable Watermark
Jaiden Fairoze, Guillermo Ortiz-Jimenez, Mel Vecerik, Somesh Jha, Sven Gowal
TL;DR
The paper investigates whether a robust, unforgeable, and publicly-detectable watermark for image provenance is feasible. It formalizes an RPWS framework built from a robust embedding function ($\mathsf{REF}$), a post-hoc watermarking scheme ($\mathsf{PGWS}$), and cryptographic signatures ($\mathsf{SIG}$). It proves that, given a $(\mathcal{T}_{REF}, m, n, \epsilon_{REF})$-robust embedding, a $(\mathcal{T}_{PGWS}, c, \epsilon_{PGWS})$-post-hoc watermark with $c \ge \delta + n$, and a $(\delta, \lambda)$-signature, one can construct a $(\mathcal{T}_{REF} \cap \mathcal{T}_{PGWS}, \epsilon_{REF} + \epsilon_{PGWS} + \mathsf{negl}(\lambda))$-RPWS. In practice, however, deployment is limited by white-box vulnerabilities of current image embedding models: adversaries can break the collision resistance of the embedding or undermine similarity checks. Higher-performing models show more resistance but do not yet suffice. The paper discusses the challenges of real-world instantiation, proposes directions for improving compact cryptographic signatures, robust embeddings, and high-capacity post-hoc watermarks, and explores alternative pathways to public detection. Overall, the work lays a formal foundation for combining cryptographic security with deep-learning robustness for provenance, while highlighting substantial practical hurdles and avenues for future research in adversarial robustness and capacity management.
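To make the parameter bookkeeping in this statement concrete, the following is an illustrative reading rather than the paper's proof. It assumes that $c$ is the watermark payload capacity in bits, $n$ the bit length of the embedding, and $\delta$ the bit length of a signature, so that $c \ge \delta + n$ says the payload must hold both. It also assumes that the three failure events combine by a union bound. The numbers are hypothetical: a 512-bit signature (the size of an Ed25519 signature) and a 256-bit embedding.

$$
\begin{aligned}
c &\ge \delta + n = 512 + 256 = 768 \text{ bits}, \\
\Pr[\text{failure}] &\le \Pr[\mathsf{REF}\text{ fails}] + \Pr[\mathsf{PGWS}\text{ fails}] + \Pr[\mathsf{SIG}\text{ is forged}] \\
&\le \epsilon_{REF} + \epsilon_{PGWS} + \mathsf{negl}(\lambda).
\end{aligned}
$$

Under these assumptions, each component's guarantee applies only to transformations in its own set, so the composed scheme is robust only to transformations lying in both, that is, in $\mathcal{T}_{REF} \cap \mathcal{T}_{PGWS}$.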
Abstract
This work investigates the theoretical boundaries of creating publicly-detectable schemes to enable the provenance of watermarked imagery. Metadata-based approaches like C2PA provide unforgeability and public-detectability. ML techniques offer robust retrieval and watermarking. However, no existing scheme combines robustness, unforgeability, and public-detectability. In this work, we formally define such a scheme and establish its existence. Although theoretically possible, we find that at present, it is intractable to build certain components of our scheme without a leap in deep learning capabilities. We analyze these limitations and propose research directions that need to be addressed before we can practically realize robust and publicly-verifiable provenance.
