RegionMarker: A Region-Triggered Semantic Watermarking Framework for Embedding-as-a-Service Copyright Protection

Shufan Yang, Zifeng Cheng, Zhiwei Jiang, Yafeng Yin, Cong Wang, Shiping Ge, Yuchen Fu, Qing Gu

TL;DR

RegionMarker tackles copyright protection for Embedding-as-a-Service (EaaS) with a region-triggered semantic watermarking approach. It defines trigger regions in a reduced semantic space obtained via dimensionality reduction, uses Locality-Sensitive Hashing to partition that space into regions, and assigns a watermark to each trigger region that is then embedded into the text embeddings falling inside it. The framework includes a copyright verification procedure based on backdoor and benign corpora, using cosine similarity, L2 distances, and KS-test p-values to detect infringement. Empirical results across SST-2, AG News, Enron, and MIND demonstrate resistance to CSE, paraphrasing, and dimension-perturbation attacks with minimal utility loss and low runtime overhead, yielding practical protection for EaaS providers.
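The region-assignment idea in this summary can be illustrated with a minimal sketch. It is not the authors' implementation: a random Gaussian projection stands in for the secret dimensionality reduction matrix, random-hyperplane hashing is assumed as the LSH family, and every name and value (DIM, LOW_DIM, N_PLANES, ALPHA, region_id, is_triggered) is hypothetical.

```python
import random


def make_matrix(rows, cols, rng):
    """Random Gaussian matrix as a list of rows (stand-in for a secret matrix)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def region_id(embedding, projection, hyperplanes):
    """Project into the low-dimensional space, then hash by hyperplane signs."""
    low = matvec(projection, embedding)
    bits = [1 if s >= 0 else 0 for s in matvec(hyperplanes, low)]
    # Pack the sign bits into one integer that names the region.
    return sum(bit << i for i, bit in enumerate(bits))


# Assumption: the seed plays the role of the provider's secret key.
rng = random.Random(0)
DIM, LOW_DIM, N_PLANES = 16, 4, 4  # 4 hyperplanes give 2**4 = 16 regions
projection = make_matrix(LOW_DIM, DIM, rng)
hyperplanes = make_matrix(N_PLANES, LOW_DIM, rng)

ALPHA = 0.25  # hypothetical proportion of regions chosen as trigger regions
n_regions = 2 ** N_PLANES
trigger_regions = set(rng.sample(range(n_regions), int(ALPHA * n_regions)))


def is_triggered(embedding):
    """True if the embedding falls into one of the secret trigger regions."""
    return region_id(embedding, projection, hyperplanes) in trigger_regions
```

Because the region depends only on the signs of projections, nearby embeddings tend to share a region, which is one plausible reading of why a region-level trigger would tolerate small semantic changes such as paraphrasing.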

Abstract

Embedding-as-a-Service (EaaS) is an effective and convenient deployment solution for addressing various NLP tasks. Nevertheless, recent research has shown that EaaS is vulnerable to model extraction attacks, which could lead to significant economic losses for model providers. For copyright protection, existing methods inject watermark embeddings into text embeddings and use them to detect copyright infringement. However, current watermarking methods often resist only a subset of attacks and fail to provide comprehensive protection. To this end, we present RegionMarker, a region-triggered semantic watermarking framework that defines trigger regions within a low-dimensional space and injects watermarks into the text embeddings associated with these regions. By using a secret dimensionality reduction matrix to project onto this subspace and randomly selecting trigger regions, RegionMarker makes it difficult for watermark removal attacks to evade detection. Furthermore, by embedding watermarks across the entire trigger region and using the text embedding as the watermark, RegionMarker is resilient to both paraphrasing and dimension-perturbation attacks. Extensive experiments on various datasets show that RegionMarker is effective in resisting different attack methods, thereby protecting the copyright of EaaS.
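Infringement detection, per the TL;DR, compares backdoor and benign corpora using cosine similarity, L2 distance, and KS-test p-values. The sketch below shows those three statistics under stated assumptions: the function names are hypothetical, the p-value uses the standard asymptotic Kolmogorov series rather than an exact computation, and the toy scores are invented for illustration, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def l2_distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sided p-value (an approximation, not an exact test)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:  # the series converges slowly here; the p-value is ~1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return max(0.0, min(1.0, 2.0 * s))


# Toy similarity scores (invented): backdoor texts vs. benign texts.
backdoor_scores = [0.80 + 0.005 * k for k in range(20)]
benign_scores = [0.10 + 0.005 * k for k in range(20)]
d = ks_statistic(backdoor_scores, benign_scores)
p = ks_pvalue(d, len(backdoor_scores), len(benign_scores))
```

In this toy setup a small p-value means the two score distributions differ, which is the kind of signal a verifier would read as evidence that a suspect model carries the watermark.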

Paper Structure

This paper contains 41 sections, 6 equations, 16 figures, 13 tables, 2 algorithms.

Figures (16)

  • Figure 1: Motivations of the region-triggered semantic watermarking framework against various attacks.
  • Figure 2: Illustration of the region-triggered semantic watermarking framework.
  • Figure 3: Visualization of the embedding distribution on the SST-2 dataset across 16 randomly and uniformly divided regions: top with dimensionality reduction, bottom without.
  • Figure 4: Impact of the proportion of watermarked regions $\alpha$ under different attacks on the Enron dataset.
  • Figure 5: Impact of dimensionality after PCA under different attacks on the Enron dataset.
  • ...and 11 more figures