Improved Pseudorandom Codes from Permuted Puzzles

Miranda Christ, Noah Golowich, Sam Gunn, Ankur Moitra, Daniel Wichs

TL;DR

The paper advances watermarking for AI-generated content by constructing pseudorandom codes (PRCs) that are simultaneously plausibly subexponentially secure, robust to worst-case edits over a binary alphabet, and robust even against adversaries who know the watermarking detection key. Pseudorandomness rests on the permuted codes conjecture, which the paper shows is implied by the permuted puzzles conjecture and supports with statistical evidence and security against simple distinguishers such as read-once branching programs. A key technical step is using folded Reed-Solomon codes to achieve edit-robustness while keeping the watermark at a low per-token entropy, enabling practical LLM watermarking. The work further shows how to convert these PRCs into watermarking schemes with soundness, undetectability, and strong robustness against edit-bounded channels, and discusses concrete code families (AG codes, RS/FRS) as viable bases under the conjecture. This combination of cryptographic hardness assumptions, list-recovery-based decoding, and folding techniques yields a new route to robust, undetectable watermarking for language models and related models.

Abstract

Watermarks are an essential tool for identifying AI-generated content. Recently, Christ and Gunn (CRYPTO '24) introduced pseudorandom error-correcting codes (PRCs), which are equivalent to watermarks with strong robustness and quality guarantees. A PRC is a pseudorandom encryption scheme whose decryption algorithm tolerates a high rate of errors. Pseudorandomness ensures quality preservation of the watermark, and error tolerance of decryption translates to the watermark's ability to withstand modification of the content. In the short time since the introduction of PRCs, several works (NeurIPS '24, RANDOM '25, STOC '25) have proposed new constructions. Curiously, all of these constructions are vulnerable to quasipolynomial-time distinguishing attacks. Furthermore, all lack robustness to edits over a constant-sized alphabet, which is necessary for a meaningfully robust LLM watermark. Lastly, they lack robustness to adversaries who know the watermarking detection key. Until now, it was not clear whether any of these properties was achievable individually, let alone together. We construct pseudorandom codes that achieve all of the above: plausible subexponential pseudorandomness security, robustness to worst-case edits over a binary alphabet, and robustness against even computationally unbounded adversaries that have the detection key. Pseudorandomness rests on a new assumption that we formalize, the permuted codes conjecture, which states that a distribution of permuted noisy codewords is pseudorandom. We show that this conjecture is implied by the permuted puzzles conjecture used previously to construct doubly efficient private information retrieval. To give further evidence, we show that the conjecture holds against a broad class of simple distinguishers, including read-once branching programs.
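To make the phrase "a distribution of permuted noisy codewords" concrete, the following toy sketch samples such objects. It is an illustration under stated assumptions, not the paper's construction: the choice of Reed-Solomon codewords over a tiny prime field, the symbol-replacement noise model, the use of one secret coordinate permutation shared across all samples, and every name and parameter (`P`, `rs_encode`, `add_noise`, `sample_permuted_noisy_codewords`, `error_rate`) are hypothetical choices made here for readability.

```python
import random

P = 17  # tiny prime field size, chosen only for illustration


def rs_encode(coeffs, points, p=P):
    """Evaluate the polynomial with coefficients `coeffs` at each point, mod p."""
    out = []
    for x in points:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % p
        out.append(acc)
    return out


def add_noise(word, error_rate, rng, p=P):
    """With probability error_rate, replace a symbol by a uniform field element."""
    return [rng.randrange(p) if rng.random() < error_rate else s for s in word]


def permute(word, perm):
    """Output position i holds the symbol from input position perm[i]."""
    return [word[j] for j in perm]


def unpermute(word, perm):
    """Invert `permute` for the same permutation."""
    out = [0] * len(word)
    for i, j in enumerate(perm):
        out[j] = word[i]
    return out


def sample_permuted_noisy_codewords(k, count, error_rate, seed=0, p=P):
    """Return (secret permutation, list of permuted noisy codewords).

    Assumption: the same secret permutation is reused for every sample.
    """
    rng = random.Random(seed)
    points = list(range(p))  # evaluate at every field element
    perm = points[:]
    rng.shuffle(perm)        # secret permutation of the coordinates
    samples = []
    for _ in range(count):
        msg = [rng.randrange(p) for _ in range(k)]  # random degree-<k polynomial
        noisy = add_noise(rs_encode(msg, points, p), error_rate, rng, p)
        samples.append(permute(noisy, perm))
    return perm, samples


if __name__ == "__main__":
    perm, samples = sample_permuted_noisy_codewords(k=3, count=2, error_rate=0.1)
    for s in samples:
        print(s)
```

A holder of the permutation can undo it and run an ordinary decoder on the result, while the conjecture described in the abstract asserts that, for suitable parameters, samples of this general shape look random to anyone without it. The toy parameters here carry no security.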

Paper Structure

This paper contains 43 sections, 33 theorems, 62 equations, 1 table, 3 algorithms.

Key Result

Theorem 1

The "permuted puzzles" conjecture [boyle2021security, blackwell2021note] implies the permuted codes conjecture.

Theorems & Definitions (77)

  • Conjecture: Permuted codes conjecture (conj:permuted-codes)
  • Theorem: Evidence for the permuted codes conjecture (thm:puzzles-from-codes)
  • Theorem: Statistical uniformity of a few codewords (cor:stat-evidence)
  • Theorem: An improved PRC from permuted codes (thm:prc-main)
  • Theorem: An improved watermark (thm:watermarking-main)
  • Definition 2.1
  • Definition 3.1: Permuted codes assumption
  • Conjecture 3.1: General permuted codes conjecture
  • Conjecture 3.2: Permuted Reed-Solomon conjecture
  • Conjecture 3.3: Permuted puzzles, "General conjecture" [boyle2021security, blackwell2021note]
  • ...and 67 more