On the Difficulty of Constructing a Robust and Publicly-Detectable Watermark

Jaiden Fairoze; Guillermo Ortiz-Jimenez; Mel Vecerik; Somesh Jha; Sven Gowal

TL;DR

The paper investigates the feasibility of a robust, unforgeable, and publicly-detectable watermark for image provenance by formalizing an RPWS framework built from a robust embedding function ($\mathsf{REF}$), a post-hoc watermarking scheme ($\mathsf{PGWS}$), and cryptographic signatures ($\mathsf{SIG}$). It proves that, given a $(\mathcal{T}_{\mathsf{REF}}, m, n, \epsilon_{\mathsf{REF}})$-robust embedding, a $(\mathcal{T}_{\mathsf{PGWS}}, c, \epsilon_{\mathsf{PGWS}})$-post-hoc watermark with $c \ge \delta + n$, and a $(\delta, \lambda)$-signature, one can construct a $(\mathcal{T}_{\mathsf{REF}} \cap \mathcal{T}_{\mathsf{PGWS}}, \epsilon_{\mathsf{REF}} + \epsilon_{\mathsf{PGWS}} + \mathsf{negl}(\lambda))$-RPWS. In practice, however, deployment is limited by white-box vulnerabilities of current image embedding models: adversaries can break embedding collision resistance or undermine similarity checks, and although higher-performing models show more resistance, they do not yet suffice. The paper discusses real-world instantiation challenges and proposes directions for improving compact cryptographic signatures, robust embeddings, and high-capacity post-hoc watermarks, alongside exploring alternative public-detection pathways. Overall, this work lays a formal foundation for combining cryptographic security with deep-learning robustness for provenance, while highlighting substantial practical hurdles and avenues for future research in adversarial robustness and capacity management.
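The parameter bookkeeping in the stated result can be illustrated with a minimal sketch. It only mirrors the composition rule quoted above (intersect the transformation sets, add the error terms, require $c \ge \delta + n$) and does not implement any watermarking, embedding, or signing. The names `RPWSParams` and `compose_rpws` are hypothetical, and reading $n$ as the embedding length and $\delta$ as the signature length, both in bits, is an assumption made here for illustration; the parameter $m$ is not used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RPWSParams:
    """Parameters of the composed scheme (hypothetical container)."""
    transforms: frozenset  # T_REF ∩ T_PGWS
    error: float           # eps_REF + eps_PGWS, excluding the negl(lambda) term


def compose_rpws(t_ref, n, eps_ref, t_pgws, c, eps_pgws, delta):
    """Combine component parameters following the rule in the TL;DR.

    Assumption: n is the embedding length and delta the signature
    length, both in bits, so the watermark capacity c must cover both.
    """
    if c < delta + n:
        raise ValueError(
            f"capacity c={c} is below delta + n = {delta + n}"
        )
    # Robustness holds only for transformations both components tolerate.
    transforms = frozenset(t_ref) & frozenset(t_pgws)
    # Error terms add up; the negligible term is tracked symbolically only.
    return RPWSParams(transforms=transforms, error=eps_ref + eps_pgws)


# Illustrative values only; none of these numbers come from the paper.
params = compose_rpws(
    t_ref={"jpeg", "crop", "blur"}, n=256, eps_ref=0.01,
    t_pgws={"jpeg", "blur", "noise"}, c=768, eps_pgws=0.02,
    delta=512,
)
```

With these illustrative inputs the composed scheme is robust only to the transformations shared by both components, and any capacity below `delta + n` is rejected outright, which is the constraint that makes compact signatures and high-capacity watermarks matter.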

Abstract

This work investigates the theoretical boundaries of creating publicly-detectable schemes to enable the provenance of watermarked imagery. Metadata-based approaches like C2PA provide unforgeability and public-detectability. ML techniques offer robust retrieval and watermarking. However, no existing scheme combines robustness, unforgeability, and public-detectability. In this work, we formally define such a scheme and establish its existence. Although theoretically possible, we find that at present, it is intractable to build certain components of our scheme without a leap in deep learning capabilities. We analyze these limitations and propose research directions that need to be addressed before we can practically realize robust and publicly-verifiable provenance.
