Table of Contents
Fetching ...

ChopChop: a Programmable Framework for Semantically Constraining the Output of Language Models

Shaan Nagy, Timothy Zhou, Nadia Polikarpova, Loris D'Antoni

TL;DR

ChopChop introduces a programmable framework for semantic constrained decoding, enabling language models to generate code that satisfies user-defined semantic properties by constraining the abstract syntax trees rather than surface strings. It formalizes completability as realizability over spaces of ASTs represented as regular coterms, and uses coinductive, derivative-based parsers to compute prefix spaces, which are then pruned by user-defined semantic pruners. The approach is instantiated in two challenging domains—equivalence-guided decoding using e-graphs and type-safe decoding for a TypeScript subset—demonstrating improved correctness and success rates with manageable overhead. Together, ChopChop bridges formal methods and language-model outputs, offering a flexible, programmable path to reliable code generation and outlining avenues for efficiency and broader semantic constraints.

Abstract

Language models (LMs) can generate code but cannot guarantee its correctness$\unicode{x2014}$often producing outputs that violate type safety, program invariants, or other semantic properties. Constrained decoding offers a solution by restricting generation to only produce programs that satisfy user-defined properties. However, existing methods are either limited to syntactic constraints or rely on brittle, ad hoc encodings of semantic properties over token sequences rather than program structure. We present ChopChop, the first programmable framework for constraining the output of LMs with respect to semantic properties. ChopChop introduces a principled way to construct constrained decoders based on analyzing the space of programs a prefix represents. It formulates this analysis as a realizability problem which is solved via coinduction, connecting token-level generation with structural reasoning over programs. We demonstrate ChopChop's generality by using it to enforce (1) equivalence to a reference program and (2) type safety. Across a range of models and tasks, ChopChop improves success rates while maintaining practical decoding latency.

ChopChop: a Programmable Framework for Semantically Constraining the Output of Language Models

TL;DR

ChopChop introduces a programmable framework for semantic constrained decoding, enabling language models to generate code that satisfies user-defined semantic properties by constraining the abstract syntax trees rather than surface strings. It formalizes completability as realizability over spaces of ASTs represented as regular coterms, and uses coinductive, derivative-based parsers to compute prefix spaces, which are then pruned by user-defined semantic pruners. The approach is instantiated in two challenging domains—equivalence-guided decoding using e-graphs and type-safe decoding for a TypeScript subset—demonstrating improved correctness and success rates with manageable overhead. Together, ChopChop bridges formal methods and language-model outputs, offering a flexible, programmable path to reliable code generation and outlining avenues for efficiency and broader semantic constraints.

Abstract

Language models (LMs) can generate code but cannot guarantee its correctnessoften producing outputs that violate type safety, program invariants, or other semantic properties. Constrained decoding offers a solution by restricting generation to only produce programs that satisfy user-defined properties. However, existing methods are either limited to syntactic constraints or rely on brittle, ad hoc encodings of semantic properties over token sequences rather than program structure. We present ChopChop, the first programmable framework for constraining the output of LMs with respect to semantic properties. ChopChop introduces a principled way to construct constrained decoders based on analyzing the space of programs a prefix represents. It formulates this analysis as a realizability problem which is solved via coinduction, connecting token-level generation with structural reasoning over programs. We demonstrate ChopChop's generality by using it to enforce (1) equivalence to a reference program and (2) type safety. Across a range of models and tasks, ChopChop improves success rates while maintaining practical decoding latency.

Paper Structure

This paper contains 66 sections, 5 theorems, 14 equations, 18 figures, 3 tables, 2 algorithms.

Key Result

lemma 1

Constrained decoding is sound if $\Phi(\omega)$ can be checked soundly.

Figures (18)

  • Figure 1: A left-recursive grammar for the language of integer sums. Right-hand annotations specify parse tree-to-AST translation (e.g., $\{\text{Sum \$1 \$3}\}$ applies the Sum constructor to the first and third arguments of the second production).
  • Figure 2: Flow of ChopChop on input prefix . The prefix is lexed into possible lexical sequences, parsed into a symbolic program space, semantically pruned, and checked for nonemptiness to determine realizability. If realizble, the prefix may be extended. If unrealizable, the prefix is discarded.
  • Figure 3: Example semantic pruners.
  • Figure 4: Constrained Decoding
  • Figure 5: Structure of the stream.
  • ...and 13 more figures

Theorems & Definitions (5)

  • lemma 1: Soundness
  • lemma 2: Completeness up to Top-$p$ and Top-$k$ Sampling
  • theorem 1: Soundness of Semantic Constrained Decoding
  • theorem 2: Completeness up to Top-$p$ and Top-$k$ Sampling of Semantic Constrained Decoding
  • theorem 3: Correctness of Partial Lexing