Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs

Xuandong Zhao, Lei Li, Yu-Xiang Wang

TL;DR

The paper introduces Permute-and-Flip (PF) decoding for LLMs. It proves that PF matches the stability guarantee of softmax sampling while achieving lower expected suboptimality, which improves perplexity without sacrificing diversity. It also devises a PF-specific watermarking scheme within a Report-Noisy-Max framework; the watermark is detectable and controllable, and watermarked text is computationally indistinguishable from non-watermarked PF output. Open-generation experiments on C4 and Alpaca with Llama-2-7B (and TinyLlama) show that PF decoding reduces perplexity and that the PF watermark achieves strong detection performance with low false-positive rates. Together, the stability theory and the watermarking guarantees make PF decoding a candidate for practical LLM deployment.

Abstract

In this paper, we propose a new decoding method called Permute-and-Flip (PF) decoder. It enjoys stability properties similar to the standard sampling decoder, but is provably up to 2x better in its quality-stability tradeoff than sampling and never worse than any other decoder. We also design a cryptographic watermarking scheme analogous to Aaronson (2023)'s Gumbel watermark, but naturally tailored for PF decoder. The watermarking scheme does not change the distribution to sample, while allowing arbitrarily low false positive rate and high recall whenever the generated text has high entropy. Our experiments show that the PF decoder (and its watermarked counterpart) significantly outperform(s) naive sampling (and its Gumbel watermarked counterpart) in terms of perplexity, while retaining the same stability (and detectability), hence making it a promising new approach for LLM decoding. The code is available at https://github.com/XuandongZhao/pf-decoding
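
The abstract names the decoder without spelling out its sampling step. A minimal sketch of the standard permute-and-flip mechanism applied to a logit vector is given below, assuming the acceptance probability $\exp((u(y) - u^*)/T)$; the function names `pf_sample` and `softmax_sample` are illustrative and are not taken from the authors' repository.

```python
import math
import random


def pf_sample(logits, temperature=1.0, rng=random):
    """Draw one token index with permute-and-flip sampling (illustrative sketch)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    u_star = max(logits)
    candidates = list(range(len(logits)))
    rng.shuffle(candidates)  # visit the vocabulary in a uniformly random order
    for y in candidates:
        # Flip a coin that lands heads with probability exp((u(y) - u*) / T).
        accept_prob = math.exp((logits[y] - u_star) / temperature)
        if rng.random() < accept_prob:
            return y
    # Not reached: the argmax token has accept_prob == 1.0 and is always accepted.
    return max(candidates, key=lambda y: logits[y])


def softmax_sample(logits, temperature=1.0, rng=random):
    """Baseline: draw one token index from softmax(logits / T)."""
    u_star = max(logits)  # subtract the max for numerical stability
    weights = [math.exp((u - u_star) / temperature) for u in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

For two tokens with logit gap $\Delta$, this sketch picks the suboptimal token with probability $e^{-\Delta/T}/2$, whereas softmax sampling picks it with probability $e^{-\Delta/T}/(1 + e^{-\Delta/T})$. The ratio of the two approaches 2 as $\Delta/T$ grows, which is consistent with the "up to 2x" claim in the abstract.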

Paper Structure

This paper contains 34 sections, 5 theorems, 45 equations, 4 figures, 8 tables, 3 algorithms.

Key Result

Theorem 3.1

Let the logits function be $u$ and $u^* = \max_{y\in\mathcal{V}} u(y)$. Let $\mathrm{PF}(u)$ be the distribution of PF sampling and $\mathrm{Softmax}(u)$ be the softmax sampling distribution (the paper's equation labeled eq:sampling), both with temperature parameter $T$. The following statements are true.

Figures (4)

  • Figure 1: Comparing PF decoder vs Softmax decoder using the two-token example (ex:twotoken).
  • Figure 2: Comparing the detectability of PF watermark vs Gumbel watermark using the two-token watermark example (ex:twotoken_watermark).
  • Figure 3: Comparison of PF and Gumbel watermarks on real data.
  • Figure 4: Comparison of empirical and theoretical false positive rates with different watermark keys. We can see that the second statement of Theorem 4.3 correctly controls the Type I error in practice.
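
The Figure 4 caption concerns control of the Type I error. As a hedged illustration of how a key-based detector can have an exactly computable false positive rate, the sketch below assumes a test score of the form $\sum_t -\log r_t$, where each $r_t$ is the pseudorandom number recomputed from the key for the observed token. On unwatermarked text the $r_t$ would be i.i.d. Uniform(0, 1), so the score follows a Gamma$(n, 1)$ distribution and its tail gives a p-value. The score form and the names `erlang_sf` and `watermark_p_value` are assumptions for illustration, not a restatement of Theorem 4.3.

```python
import math


def erlang_sf(x, n):
    """P(S >= x) for S ~ Gamma(shape=n, scale=1), with integer n >= 1."""
    if x <= 0:
        return 1.0
    # Closed form: exp(-x) * sum_{k=0}^{n-1} x^k / k!, evaluated in log space
    # so that long sequences do not overflow or underflow.
    log_terms = [k * math.log(x) - math.lgamma(k + 1) - x for k in range(n)]
    m = max(log_terms)
    return min(1.0, math.exp(m) * sum(math.exp(t - m) for t in log_terms))


def watermark_p_value(r_values):
    """p-value under H0 that the r values are i.i.d. Uniform(0, 1).

    Assumes every r lies strictly between 0 and 1 (hypothetical detector input).
    """
    score = sum(-math.log(r) for r in r_values)
    return erlang_sf(score, len(r_values))
```

Under these assumptions, rejecting the null whenever `watermark_p_value` falls below a chosen level $\alpha$ yields a false positive rate of exactly $\alpha$, regardless of the text.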

Theorems & Definitions (23)

  • Definition 2.1: Stability
  • proof
  • Theorem 3.1
  • proof
  • Example 3.2
  • Theorem 4.3
  • Example 4.4
  • Example 4.5
  • Remark B.1: Stability implies diversity
  • proof: Proof of the two-token example (ex:twotoken)
  • ...and 13 more