On the Difficulty of Constructing a Robust and Publicly-Detectable Watermark

Jaiden Fairoze, Guillermo Ortiz-Jimenez, Mel Vecerik, Somesh Jha, Sven Gowal

TL;DR

The paper investigates the feasibility of a robust, unforgeable, and publicly-detectable watermark for image provenance by formalizing a robust publicly-detectable watermarking scheme (RPWS) built from a robust embedding function ($\mathsf{REF}$), a post-hoc watermarking scheme ($\mathsf{PGWS}$), and cryptographic signatures ($\mathsf{SIG}$). It proves that, given a $(\mathcal{T}_{REF}, m, n, \epsilon_{REF})$-robust embedding, a $(\mathcal{T}_{PGWS}, c, \epsilon_{PGWS})$-post-hoc watermark with $c \ge \delta + n$, and a $(\delta, \lambda)$-signature, one can construct a $(\mathcal{T}_{REF} \cap \mathcal{T}_{PGWS}, \epsilon_{REF} + \epsilon_{PGWS} + \mathsf{negl}(\lambda))$-RPWS. In practice, however, deployment is limited by white-box vulnerabilities of current image embedding models: adversaries can break embedding collision resistance or undermine similarity checks. Higher-performing models show more resistance, but not yet enough. The paper discusses real-world instantiation challenges and proposes directions to improve compact cryptographic signatures, robust embeddings, and high-capacity post-hoc watermarks, alongside exploring alternative public-detection pathways. Overall, this work lays a formal foundation for combining cryptographic security with deep-learning robustness for provenance, while highlighting substantial practical hurdles and avenues for future research in adversarial robustness and capacity management.
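
As a rough illustration of the capacity condition $c \ge \delta + n$ (the numbers here are assumptions for the sake of arithmetic, not values from the paper): the watermark payload has to carry the $n$-bit embedding together with a $\delta$-bit signature of it. If the signature were an Ed25519 signature, then $\delta = 512$ bits; with a hypothetical $n = 256$-bit embedding, the post-hoc watermark would need a capacity of at least

$$c \;\ge\; \delta + n \;=\; 512 + 256 \;=\; 768 \text{ bits.}$$

Under the stated result, the combined scheme would then be robust only to transformations in $\mathcal{T}_{REF} \cap \mathcal{T}_{PGWS}$, that is, those tolerated by both the embedding and the watermark.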

Abstract

This work investigates the theoretical boundaries of creating publicly-detectable schemes to enable the provenance of watermarked imagery. Metadata-based approaches like C2PA provide unforgeability and public-detectability. ML techniques offer robust retrieval and watermarking. However, no existing scheme combines robustness, unforgeability, and public-detectability. In this work, we formally define such a scheme and establish its existence. Although theoretically possible, we find that at present, it is intractable to build certain components of our scheme without a leap in deep learning capabilities. We analyze these limitations and propose research directions that need to be addressed before we can practically realize robust and publicly-verifiable provenance.

Paper Structure

This paper contains 23 sections, 4 theorems, 6 equations, 8 figures, 1 table.

Key Result

Theorem 4.1

The scheme presented in Figure 2 is correct if the underlying cryptographic signature scheme is correct.

Figures (8)

  • Figure 1: The three main approaches to content provenance. Metadata-based provenance (top) uses an auxiliary manifest to attach a cryptographic signature and other metadata to the image; authenticating the signature yields provenance. Watermarking (middle) encodes a payload with provenance information directly into the image itself, and the payload can be decoded thereafter. Retrieval-based detection (bottom) maintains a global store of image embeddings, which is queried to check whether a candidate image is known.
  • Figure 2: Specification of our unforgeable and publicly-detectable watermark. The keys are generated with the generation function of the signature scheme, $sk, pk \gets \mathsf{Generate}(1^\lambda)$. Without loss of generality, the input image $x$ is RGB-encoded. $\mathsf{Watermark}$ encodes a signature of the image within the image itself such that the output of $\mathsf{Hash}$ does not change. This is achieved by encoding signature bits in the least significant bit of each color channel value: when the hash is applied (i.e., each value is divided by two and floored), the result is the same as for the plain image. Thus, $\mathsf{Detect}$ is able to recover both the hash value and the signature bits in order to verify the signature.
  • Figure 3: A robust and publicly-detectable watermark built from a cryptographic signature scheme, a post-hoc watermarking scheme, and a robust embedding model for images. Using a post-hoc watermark, the scheme encodes an embedding of the image along with a signature of the embedding within the image itself. This information can be decoded and verified thereafter.
  • Figure 4: Resistance to $\ell_\infty$ attacks slightly increases with model performance. The y-axis of the right graph is calculated as the area under the corresponding curve in the left graph.
  • Figure 5: Resistance to $\ell_1$ attacks slightly increases with model performance. The y-axis of the right figure is calculated as the area under the corresponding curve in the left figure.
  • ...and 3 more figures
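
The least-significant-bit construction described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the image is modeled as a flat list of 8-bit channel values, the function names are hypothetical, and the signature scheme is a toy Lamport one-time signature chosen only so the sketch needs nothing beyond the Python standard library (the source says only "a cryptographic signature scheme"). A Lamport key must sign a single image, and its 65,536-bit signature needs at least that many channel values.

```python
import hashlib
import secrets

SIG_BYTES = 256 * 32      # toy Lamport signature: 256 revealed 32-byte secrets
SIG_BITS = SIG_BYTES * 8  # one signature bit is hidden per channel value


def sha256(data):
    return hashlib.sha256(data).digest()


def to_bits(data):
    """Bytes -> list of bits, most significant bit first."""
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]


def generate():
    """Assumed stand-in for Generate: a Lamport one-time key pair."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(sha256(a), sha256(b)) for a, b in sk]
    return sk, pk


def sign(sk, msg):
    # Reveal one secret per bit of the message digest.
    return b"".join(sk[i][bit] for i, bit in enumerate(to_bits(sha256(msg))))


def verify(pk, msg, sig):
    if len(sig) != SIG_BYTES:
        return False
    bits = to_bits(sha256(msg))
    return all(sha256(sig[32 * i:32 * i + 32]) == pk[i][bits[i]]
               for i in range(256))


def image_hash(pixels):
    """Hash from the caption: floor-divide each channel value by two."""
    return bytes(v // 2 for v in pixels)


def watermark(sk, pixels):
    """Write the signature of the hash into the least significant bits."""
    if len(pixels) < SIG_BITS:
        raise ValueError("image too small to hold the signature")
    out = list(pixels)
    for i, bit in enumerate(to_bits(sign(sk, image_hash(pixels)))):
        out[i] = (out[i] & ~1) | bit  # changes the value by at most 1
    return out


def detect(pk, pixels):
    """Recover signature bits from the LSBs and verify them against the hash."""
    if len(pixels) < SIG_BITS:
        return False
    sig = bytearray(SIG_BYTES)
    for i in range(SIG_BITS):
        sig[i // 8] |= (pixels[i] & 1) << (7 - i % 8)
    return verify(pk, image_hash(pixels), bytes(sig))


# Usage: a random 256x256 RGB "image" as a flat list of channel values.
sk, pk = generate()
image = [secrets.randbelow(256) for _ in range(256 * 256 * 3)]
marked = watermark(sk, image)
```

Because the hash discards exactly the bit that carries the signature, `image_hash(marked)` equals `image_hash(image)`, so `detect(pk, marked)` can recompute the signed value from the watermarked image alone. Any change to a higher-order bit alters the hash and makes verification fail, which is why this construction is unforgeable but not robust to image transformations.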

Theorems & Definitions (15)

  • Theorem 4.1: Informal
  • proof
  • Theorem 4.2: Informal
  • proof
  • Theorem 5.1
  • Definition B.1: Robust embedding function
  • Definition B.2: Cryptographic signature scheme
  • Definition B.3: Post-hoc watermarking scheme
  • Definition B.4: Robust publicly-detectable watermarking scheme
  • Theorem B.1
  • ...and 5 more