ArcMark: Multi-bit LLM Watermark via Optimal Transport

Atefeh Gilani, Carol Xuan Long, Sajani Vithana, Oliver Kosut, Lalitha Sankar, Flavio P. Calmon

TL;DR

This work reframes multi-bit LLM watermarking as a channel coding problem with side information, deriving the first capacity characterization for distortion-free watermarking. It introduces ArcMark, a distortion-free watermarking scheme based on random linear coding and optimal transport that maps message symbols to angular coordinates on a circle and jointly encodes across token sequences. The approach achieves capacity in a simple i.i.d. token-distribution setting and empirically outperforms state-of-the-art methods in message accuracy while preserving perplexity across multiple models and embedding lengths. The results demonstrate that principled coding-theoretic design can push watermarking rates higher without compromising text quality, paving the way for broader capacity-driven watermarking frameworks.

Abstract

Watermarking is an important tool for promoting the responsible use of language models (LMs). Existing watermarks insert a signal into generated tokens that either flags LM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent multi-bit watermarks insert several bits into text without perturbing average next-token predictions, they largely extend design principles from the zero-bit setting, such as encoding a single bit per token. Notably, the information-theoretic capacity of multi-bit watermarking -- the maximum number of bits per token that can be inserted and detected without changing average next-token predictions -- has remained unknown. We address this gap by deriving the first capacity characterization of multi-bit watermarks. Our results inform the design of ArcMark: a new watermark construction based on coding-theoretic principles that, under certain assumptions, achieves the capacity of the multi-bit watermark channel. In practice, ArcMark outperforms competing multi-bit watermarks in terms of bit rate per token and detection accuracy. Our work demonstrates that LM watermarking is fundamentally a channel coding problem, paving the way for principled coding-theoretic approaches to watermark design.

Paper Structure (17 sections, 3 theorems, 53 equations, 6 figures, 1 table)

This paper contains 17 sections, 3 theorems, 53 equations, 6 figures, 1 table.

Key Result

Theorem 3.1

The watermarking capacity is given by [expression not recovered from the source], where $(W,Q) \sim P_W(w)P_Q(q)$, $X=x(W,Q)\in\mathcal{X}$, $P_W$ is a distribution on an arbitrary alphabet $\mathcal{W}$, and $Q$ is the random variable representing the token distribution, which takes values in the probability simplex.
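
For orientation, the distortion-free requirement described in the abstract (the watermark must not change average next-token predictions) can be written in the theorem's notation. This is a sketch of the constraint only, assuming the average is taken over $W$ with $W$ independent of $Q$, as the product form $P_W(w)P_Q(q)$ suggests. It is not the capacity expression of Theorem 3.1, and the paper may state the constraint differently.

$$
\Pr[X = a \mid Q = q] \;=\; \mathbb{E}_{W \sim P_W}\big[\mathbb{1}\{x(W,q) = a\}\big] \;=\; q(a) \qquad \text{for all } a \in \mathcal{X} \text{ and all } q .
$$

In words: for every token distribution $q$ produced by the model, the token emitted by the map $x(\cdot,\cdot)$ is distributed exactly according to $q$ once $W$ is averaged out.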

Figures (6)

  • Figure 1: Message accuracy on Llama3-8B for different watermark embedding lengths. The top row shows results for 3-bit (left) and 8-bit (right) watermarks, while the bottom panel shows results for 16-bit watermarks. ArcMark outperforms state-of-the-art multi-bit watermarking methods, with the performance gap increasing for longer embeddings, where exact recovery of all message bits becomes more challenging.
  • Figure 2: Overview of ArcMark. The $k$-bit watermark message (for example, 001 in the figure) is mapped to a block of $n$ tokens using a random linear code, defined by the generator matrix $G$, that generates a collection of codewords $\{C_t\}_{t=1}^n$. Each $C_t$ is mapped to one of $p$ equally spaced points on the circle. Additionally, a shared random point $v_t$, which takes one of $r$ equally spaced values around the circle, is generated, and a linear combination of $C_t$ and $v_t$ forms the input angle $z_t$. Each token in the vocabulary is randomly assigned a point on the circle, as shown in the figure, using the permutation $\Pi_t$. The generated token $x_t$ is then selected by solving an optimal transport problem that minimizes the distance between the point on the circle representing the message and the point representing the token, while matching the token distribution from the LLM.
  • Figure 3: Message accuracy on Llama2-7B for different watermark embedding lengths. The top row shows results for 3-bit (left) and 8-bit (right) watermarks, while the bottom panel shows results for 16-bit watermarks. As the embedding length increases, the gap in message accuracy between ArcMark and prior methods becomes more pronounced, reflecting ArcMark’s emphasis on message-level recovery, which is particularly important for longer embedded messages where even a single bit error leads to message failure.
  • Figure 4: Bit accuracy results. The left column corresponds to Llama2-7B and the right column to Llama3-8B. The top, middle, and bottom rows represent watermark embedding lengths of 3, 8, and 16 bits, respectively. Across models and embedding lengths, ArcMark achieves competitive bit accuracy, indicating that optimizing for message-level recovery does not come at the expense of bit-level performance.
  • Figure 5: Perplexity of watermarked text generated using ArcMark on Llama2-7B (left) and Llama3-8B (right) for different watermark embedding lengths (3, 8, and 16 bits). Across all settings, perplexity remains comparable to unwatermarked generation, indicating minimal impact of ArcMark on the likelihood assigned to the text by the underlying language model.
  • ...and 1 more figure
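
The encoding pipeline described in the Figure 2 caption can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The combination rule $z_t = (C_t/p + v_t/r) \bmod 1$ with $p$ dividing $r$ is an assumed instance of the "linear combination" in the caption. Tokens are laid out as consecutive arcs rather than points, and the optimal transport solve is replaced by a simple monotone (overlap-based) coupling, which matches the token distribution exactly but is not guaranteed to minimize circular distance.

```python
import random


def encode_message(bits, G, p):
    """Random linear code (assumed form): C_t = sum_i bits[i] * G[i][t] mod p."""
    n = len(G[0])
    return [sum(b * G[i][t] for i, b in enumerate(bits)) % p for t in range(n)]


def input_cell(c_t, v_t, p, r):
    """Index j of the grid cell [j/r, (j+1)/r) that holds the input angle z_t.

    Assumed combination: z_t = (c_t / p + v_t / r) mod 1, with p dividing r,
    so z_t always lands on the r-point grid.
    """
    return (c_t * (r // p) + v_t) % r


def monotone_coupling(q, perm, r):
    """Return coupling[j][x] = overlap of grid cell j with the arc of token x.

    Tokens are placed as consecutive arcs of length q[x] in the order `perm`.
    Each row sums to 1/r and each column x sums to q[x], so this is a valid
    coupling between the uniform grid and the token distribution q.
    """
    coupling = [[0.0] * len(q) for _ in range(r)]
    start = 0.0
    for x in perm:
        end = start + q[x]
        for j in range(r):
            lo, hi = max(start, j / r), min(end, (j + 1) / r)
            if hi > lo:
                coupling[j][x] = hi - lo
        start = end
    return coupling


def sample_token(q, perm, j, r, rng):
    """Sample a token conditioned on the input angle falling in grid cell j."""
    row = monotone_coupling(q, perm, r)[j]
    return rng.choices(range(len(q)), weights=row)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    k, n, p, r = 3, 5, 4, 8                    # illustrative sizes only
    q = [0.5, 0.3, 0.2]                        # toy next-token distribution
    G = [[rng.randrange(p) for _ in range(n)] for _ in range(k)]
    codeword = encode_message([0, 0, 1], G, p)
    tokens = []
    for c_t in codeword:
        v_t = rng.randrange(r)                 # shared randomness (from a key)
        perm = rng.sample(range(len(q)), len(q))  # stand-in for Pi_t
        tokens.append(sample_token(q, perm, input_cell(c_t, v_t, p, r), r, rng))
    print(codeword, tokens)
```

Because `input_cell` is a bijection in `v_t` for any fixed `c_t`, a uniformly random `v_t` makes the cell index uniform, and the column sums of the coupling then give each token probability exactly `q[x]`. That is the distortion-free property in this toy setting. Decoding is not sketched here because the caption does not describe it.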

Theorems & Definitions (6)

  • Theorem 3.1
  • Corollary 3.2
  • Remark 3.3
  • Theorem 4.1
  • Claim B.1
  • Proof of Claim B.1