Robustness Assessment and Enhancement of Text Watermarking for Google's SynthID

Xia Han, Qi Li, Jianbing Ni, Mohammad Zulkernine

TL;DR

Experimental results across multiple attack scenarios show that SynGuard improves watermark recovery by an average of 11.1% in F1 score compared to SynthID-Text, demonstrating the effectiveness of semantic-aware watermarking in resisting real-world tampering.

Abstract

Recent advances in LLM watermarking methods such as SynthID-Text by Google DeepMind offer promising solutions for tracing the provenance of AI-generated text. However, our robustness assessment reveals that SynthID-Text is vulnerable to meaning-preserving attacks, such as paraphrasing, copy-paste modifications, and back-translation, which can significantly degrade watermark detectability. To address these limitations, we propose SynGuard, a hybrid framework that combines the semantic alignment strength of Semantic Information Retrieval (SIR) with the probabilistic watermarking mechanism of SynthID-Text. Our approach jointly embeds watermarks at both lexical and semantic levels, enabling robust provenance tracking while preserving the original meaning. Experimental results across multiple attack scenarios show that SynGuard improves watermark recovery by an average of 11.1% in F1 score compared to SynthID-Text. These findings demonstrate the effectiveness of semantic-aware watermarking in resisting real-world tampering. All code, datasets, and evaluation scripts are publicly available at: https://github.com/githshine/SynGuard.

Paper Structure

This paper contains 28 sections, 2 theorems, 6 equations, 13 figures, 12 tables, and 1 algorithm.

Key Result

Theorem 1

Let $t = [t_0, t_1, \ldots, t_T]$ be a watermarked text and let $t'$ be a meaning-preserving transformation of $t$. Then, with high probability, the watermark detection score $s(t')$ remains above the detection threshold $\tau$, i.e., the watermark is still detectable.
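
The decision rule behind Theorem 1 and the F1 metric quoted in the abstract can be illustrated with a minimal sketch. It assumes the detection scores $s(t)$ have already been computed by some detector, which is not shown here. The function names `is_detected` and `f1_at_threshold` are hypothetical, and the scores and labels are made-up toy values, not results from the paper.

```python
from typing import Sequence


def is_detected(score: float, tau: float) -> bool:
    """Flag a text as watermarked when its score s(t') is above the threshold tau."""
    return score > tau


def f1_at_threshold(scores: Sequence[float], labels: Sequence[bool], tau: float) -> float:
    """F1 score of the threshold detector; labels[i] is True for watermarked texts."""
    tp = fp = fn = 0
    for score, is_watermarked in zip(scores, labels):
        predicted = is_detected(score, tau)
        if predicted and is_watermarked:
            tp += 1  # watermarked text correctly flagged
        elif predicted and not is_watermarked:
            fp += 1  # natural text wrongly flagged
        elif not predicted and is_watermarked:
            fn += 1  # watermarked text missed, e.g. after an attack
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): three attacked watermarked texts, three natural texts.
toy_scores = [0.62, 0.58, 0.47, 0.51, 0.44, 0.49]
toy_labels = [True, True, True, False, False, False]
toy_f1 = f1_at_threshold(toy_scores, toy_labels, tau=0.5)
```

In this reading, an attack succeeds when it pushes $s(t')$ of a watermarked text to or below $\tau$, which shows up as a false negative and lowers recall and F1.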

Figures (13)

  • Figure 1: ROC curves of SynthID-Text under synonym substitution attacks with varying replacement ratios.
  • Figure 2: ROC curves under different copy-and-paste attack ratios. The blue curve represents the original SynthID-Text ROC curve without attack; the gray curve indicates random guessing. Other curves depict results under varying ratios, where the ratio denotes how many times longer the inserted natural text is compared to the original watermarked text.
  • Figure 3: ROC curves under paraphrasing attacks with different settings.
  • Figure 4: Illustration of watermark dilution through translation.
  • Figure 5: ROC curves of re-translation attacks on SynthID.
  • ...and 8 more figures
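
The copy-and-paste ratio described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' attack code: the function name `copy_paste_attack` is hypothetical, texts are treated as token lists, and splitting the natural text at a random position around the watermarked span is an assumption (the caption only fixes the length ratio).

```python
import random
from typing import List, Optional


def copy_paste_attack(
    watermarked: List[str],
    natural: List[str],
    ratio: float,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Surround a watermarked span with natural text.

    The amount of natural text added is ratio * len(watermarked) tokens,
    matching the caption's definition of the ratio. Placement of the
    watermarked span is an assumption of this sketch.
    """
    rng = rng or random.Random(0)
    n_insert = int(round(ratio * len(watermarked)))
    if n_insert > len(natural):
        raise ValueError("not enough natural text for the requested ratio")
    filler = natural[:n_insert]
    cut = rng.randint(0, len(filler))  # where the watermarked span is placed
    return filler[:cut] + watermarked + filler[cut:]


# Toy usage: 10 watermarked tokens diluted with 3x as much natural text.
wm_tokens = ["w"] * 10
natural_tokens = [f"n{i}" for i in range(100)]
attacked = copy_paste_attack(wm_tokens, natural_tokens, ratio=3.0)
```

With a ratio of 3, only a quarter of the attacked text carries the watermark signal, which is consistent with the dilution effect the ROC curves are meant to show.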

Theorems & Definitions (4)

  • Theorem 1
  • proof
  • Theorem 2
  • proof