Suffix-Constrained Greedy Search Algorithms for Causal Language Models

Ayoub Hammal; Pierre Zweigenbaum; Caio Corro

TL;DR

This work introduces suffix-constrained generation, which aims to produce well-formed LLM responses whose final answers follow strict templates and are guaranteed to be trivially parseable.

Abstract

Large language models (LLMs) are powerful tools that have found applications beyond human-machine interfaces and chatbots. In particular, their ability to generate reasoning traces has motivated their use in many prediction tasks, such as math question answering. Unfortunately, extracting the final answer from an LLM's free-form output is difficult, as it is an information extraction problem in its own right. In this work, we introduce suffix-constrained generation, which aims to produce well-formed LLM responses in which final answers follow strict templates and are guaranteed to be trivially parseable. To this end, we introduce several algorithms based on greedy search procedures. We experiment on several datasets and show that our approach guarantees trivial, deterministic extraction of the final answer from an LLM output without degrading results, and even improves them.
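To illustrate what "trivially parseable" can mean in practice, here is a minimal sketch of deterministic answer extraction. The template `The final answer is: <number>` and the names `ANSWER_SUFFIX` and `extract_answer` are assumptions made for this example; they are not taken from the paper.

```python
import re
from typing import Optional

# Hypothetical answer template (an assumption for this sketch): the response
# must END with "The final answer is: <number>".
ANSWER_SUFFIX = re.compile(r"The final answer is: (-?\d+(?:\.\d+)?)\Z")


def extract_answer(output: str) -> Optional[str]:
    """Return the number in the templated suffix, or None if the suffix is absent."""
    match = ANSWER_SUFFIX.search(output)
    return match.group(1) if match else None


# A response whose suffix follows the template is parsed by a single anchored match.
templated = "12 * 3 = 36, so she has 36 apples. The final answer is: 36"
# A free-form response carries the same information but does not match the template.
free_form = "12 * 3 = 36, so the result should be 36 apples."

print(extract_answer(templated))  # the extracted answer string
print(extract_answer(free_form))  # None: no templated suffix to parse
```

When the suffix is guaranteed by the decoding procedure, the `None` branch cannot occur, which is what turns extraction into a deterministic lookup instead of an information extraction problem.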

Paper Structure

This paper contains 27 sections, 1 theorem, 15 equations, 3 figures, 4 tables, and 5 algorithms.

Introduction
Notations.
Related work
Constrained Generation.
Generation with Reasoning.
Background
Greedy Search
Constrained Generation
Suffix-Constrained Generation
Issue with Suffix-Constrained Grammars
Greedy Pipeline
Beam-Based Algorithms
Constrained Hypothesis Beam Search
Bifurcation Position Penalty
Scoring via Bifurcation Penalty.
...and 12 more sections

Key Result

Proposition 1

Let $\mathcal{G}$ be a grammar on vocabulary $V$. For any prefix ${\bm{y}} \in \operatorname{pref}({\to}\mathcal{G})$ of the suffix-constrained grammar, we have:

Figures (3)

Figure 1: Example of a derivation with a suffix-constrained grammar ${\to}\mathcal{G}$. The non-terminal R derives the reasoning sequence, whereas A derives the constrained answer.
Figure 2: Constrained hypothesis beam search with a greedy hypothesis (top) and a constrained hypothesis (bottom). At each generation step, the constrained hypothesis may be replaced with a new one. For example, after step 1, we choose to continue with the current constrained hypothesis (token ${a}_2$ is added after ${a}_1$). However, after step 2, the constrained hypothesis has been replaced with a new one, whose first token is denoted ${a}'_1$. At step 4, the greedy hypothesis is defined as ${\bm{y}} = \langle {y}_1, {y}_2, {y}_3, {y}_4 \rangle$, whereas the constrained one is ${\bm{r}} \odot \langle {a}'_1, {a}'_2\rangle$, where the reasoning part is composed of tokens from the greedy hypothesis before the replacement, that is, ${\bm{r}} = \langle {y}_1, {y}_2\rangle$.
Figure 3: Average min-entropy difference at each generation step. The difference is taken between the next-token distribution under the unconstrained hypothesis and the second token in a new constrained hypothesis, measured on OLMo 2 13B IT.
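The bookkeeping described in the Figure 2 caption can be replayed with a small sketch. It only tracks how the two hypotheses are stored. The rule that decides when to replace the constrained hypothesis is not stated here, so the replacement decisions are passed in as input. The function name `replay` and the token strings are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# One generation step: (greedy token, constrained-answer token, replaced?).
# `replaced` is True when the constrained hypothesis is restarted from the
# current greedy prefix. How that choice is made is NOT modelled here.
Step = Tuple[str, str, bool]


def replay(steps: Sequence[Step]) -> Tuple[List[str], List[str], List[str]]:
    """Return (greedy y, reasoning part r, answer part a) after all steps."""
    greedy: List[str] = []
    reasoning: List[str] = []  # r: copied from the greedy hypothesis
    answer: List[str] = []     # a: tokens of the constrained answer
    for greedy_token, answer_token, replaced in steps:
        if replaced:
            # New constrained hypothesis: r is the greedy hypothesis as it
            # stood before this step, and the answer part starts over.
            reasoning = list(greedy)
            answer = []
        greedy.append(greedy_token)
        answer.append(answer_token)
    return greedy, reasoning, answer


# The schedule from the Figure 2 caption: continue after step 1,
# replace after step 2, continue after step 3.
figure_2_steps = [
    ("y1", "a1", False),
    ("y2", "a2", False),
    ("y3", "a'1", True),
    ("y4", "a'2", False),
]
y, r, a = replay(figure_2_steps)
print("greedy:", y)
print("constrained:", r + a)  # r concatenated with a
```

With this schedule the constrained hypothesis ends as the reasoning part `y1, y2` followed by `a'1, a'2`, matching the state described for step 4.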

Theorems & Definitions (5)

Definition 1: Prefix
Definition 2: Transition operator
Definition 3: Suffix-constrained grammar
Proposition 1
Proof
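The definitions themselves are not reproduced on this page, so the following is only a sketch under an assumed reading: a sequence belongs to the suffix-constrained language when it can be split into a free reasoning part followed by an answer accepted by $\mathcal{G}$, as suggested by the R and A non-terminals in Figure 1. The toy answer language and all function names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Tokens = Tuple[str, ...]


def toy_answer_language(a: Tokens) -> bool:
    """Hypothetical L(G): 'Answer', ':', one or more digit tokens, '<eos>'."""
    return (
        len(a) >= 4
        and a[0] == "Answer"
        and a[1] == ":"
        and all(tok.isdigit() for tok in a[2:-1])
        and a[-1] == "<eos>"
    )


def in_suffix_constrained_language(
    y: Sequence[str], in_answer_language: Callable[[Tokens], bool]
) -> bool:
    """True if y splits as r followed by a, with a accepted by the answer language.

    Assumed reading: the reasoning part r is unconstrained and may be empty.
    """
    return any(in_answer_language(tuple(y[i:])) for i in range(len(y) + 1))


reasoning = ("2", "+", "2", "is", "4", ".")
answer = ("Answer", ":", "4", "<eos>")

print(in_suffix_constrained_language(reasoning + answer, toy_answer_language))
print(in_suffix_constrained_language(reasoning, toy_answer_language))
```

Under this assumed reading, any token sequence can be extended into a member of the language by appending a valid answer, so every sequence is a valid prefix as long as the answer language is non-empty.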
