RandMark: On Random Watermarking of Visual Foundation Models

Anna Chistyakova, Mikhail Pautov

TL;DR

This paper proposes an approach to ownership verification of visual foundation models that leverages a small encoder-decoder network to embed digital watermarks into an internal representation of a hold-out set of input images.

Abstract

Being trained on large and diverse datasets, visual foundation models (VFMs) can be fine-tuned to achieve remarkable performance and efficiency in various downstream computer vision tasks. The high computational cost of data collection and training makes these models valuable assets, which motivates some VFM owners to distribute them alongside a license to protect their intellectual property rights. In this paper, we propose an approach to ownership verification of visual foundation models that leverages a small encoder-decoder network to embed digital watermarks into an internal representation of a hold-out set of input images. The method is based on random watermark embedding, which makes the watermark statistics detectable in functional copies of the watermarked model. Both theoretically and experimentally, we demonstrate that the proposed method yields a low probability of false detection for non-watermarked models and a low probability of missed detection for watermarked models.

Paper Structure (20 sections, 1 theorem, 32 equations, 3 figures, 2 tables)

Key Result

lemma 1

Let $\delta > 0$ and set $\varepsilon = \sqrt{\frac{1}{2n}\ln \left(\frac{1}{\delta}\right)}$. Let $\hat{p} = \frac{1}{n} R_1$ and $\hat{q} = \frac{1}{n} R_2$ be unbiased estimates of $\overline{p}$ and $\overline{q}$, respectively. Then, with probability at least $1-\delta$, the following upper bound holds: …
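
To give a feel for the scale of $\varepsilon$ in the lemma, the sketch below evaluates $\varepsilon = \sqrt{\frac{1}{2n}\ln(1/\delta)}$ for a given sample size $n$ and confidence parameter $\delta$. This expression has the same form as the one-sided Hoeffding deviation radius for averages of $[0,1]$-valued variables, since $\exp(-2n\varepsilon^2) = \delta$; whether the paper's proof actually relies on Hoeffding's inequality is an assumption here. The function name `deviation_radius` is hypothetical and does not come from the paper.

```python
import math


def deviation_radius(n: int, delta: float) -> float:
    """Evaluate eps = sqrt(ln(1/delta) / (2n)) as stated in Lemma 1.

    `deviation_radius` is an illustrative name, not the authors' code.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # n = 1000 matches the number of watermarking images quoted for Figure 2;
    # delta = 0.05 is an arbitrary illustrative choice.
    for n in (100, 1000, 10000):
        print(n, round(deviation_radius(n, 0.05), 4))
```

Because $\varepsilon$ shrinks as $1/\sqrt{n}$, quadrupling the number of samples halves the radius.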

Figures (3)

  • Figure 1: Overview of the proposed RandMark watermarking pipeline. A binary message is embedded into a visual foundation model using a set of trigger images and an encoder. During verification, randomized input transformations are applied to the trigger set, and a decoder extracts the watermark message from the model outputs. The extracted messages are then compared with the original watermark to verify model ownership.
  • Figure 2: Watermark detection rate $R$ from \eqref{eq:R}, averaged over $N=1000$ images used for watermarking. Classification experiments were conducted on the E-commerce Product Images dataset; the segmentation experiment was conducted on the FoodSeg103 dataset.
  • Figure 3: Distribution of covariance between decoded watermark messages from two models $f$ and $g$. Independent models produce covariance values near zero, while watermark-dependent models exhibit positive covariance due to correlated decoding of watermark bits.
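
The behaviour described in the Figure 3 caption can be illustrated with a small simulation. This is only a sketch under stated assumptions: the paper's exact covariance statistic is not given in this summary, so the code uses the plain sample covariance between the decoded bit sequences of two models. The 10% bit-flip noise, the message length, and all function names are hypothetical choices made for illustration.

```python
import random


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def noisy_decode(watermark, flip_prob, rng):
    """Simulate a decoder that recovers each watermark bit, flipping it with flip_prob."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in watermark]


def simulate(seed=0, n_bits=2000, flip_prob=0.1):
    """Return (covariance for a dependent pair, covariance for an independent pair)."""
    rng = random.Random(seed)
    watermark = [rng.randint(0, 1) for _ in range(n_bits)]

    # Model f and a functional copy g both decode the same watermark with noise.
    decoded_f = noisy_decode(watermark, flip_prob, rng)
    decoded_g_dependent = noisy_decode(watermark, flip_prob, rng)

    # An independent model carries no watermark: its decoded bits are random.
    decoded_g_independent = [rng.randint(0, 1) for _ in range(n_bits)]

    cov_dependent = sample_covariance(decoded_f, decoded_g_dependent)
    cov_independent = sample_covariance(decoded_f, decoded_g_independent)
    return cov_dependent, cov_independent


if __name__ == "__main__":
    cov_dep, cov_ind = simulate()
    print("dependent pair:  ", cov_dep)
    print("independent pair:", cov_ind)
```

In this toy setting the dependent pair yields a clearly positive covariance, while the independent pair stays close to zero, which mirrors the qualitative separation the caption describes.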

Theorems & Definitions (4)

  • remark 1
  • lemma 1
  • remark 2
  • proof