Suffix-Constrained Greedy Search Algorithms for Causal Language Models

Ayoub Hammal, Pierre Zweigenbaum, Caio Corro

TL;DR

This work introduces suffix-constrained generation, which aims to produce well-formed LLM responses whose final answers follow strict templates and are guaranteed to be trivially parseable.

Abstract

Large language models (LLMs) are powerful tools that have found applications beyond human-machine interfaces and chatbots. In particular, their ability to generate reasoning traces has motivated their use in many prediction tasks, such as math question answering. Unfortunately, extracting the final answer from an LLM's free-form output is difficult, as it is an information extraction problem in its own right. In this work, we introduce suffix-constrained generation, which aims to produce well-formed LLM responses whose final answers follow strict templates and are guaranteed to be trivially parseable. To this end, we introduce several algorithms based on greedy search procedures. We experiment on several datasets and show that our approach guarantees trivial, deterministic extraction of the final answer from an LLM output without degrading results, and can even improve them.
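To make "trivially parseable" concrete, here is a minimal Python sketch of deterministic extraction once a response is known to end with a fixed template. The template `The final answer is <number>.` and the names `ANSWER_SUFFIX` and `extract_answer` are hypothetical, chosen for illustration only; they are not the templates or code used in the paper.

```python
import re
from typing import Optional

# Hypothetical answer template (an assumption for illustration):
# the response must END with "The final answer is <number>."
ANSWER_SUFFIX = re.compile(r"The final answer is (-?\d+(?:\.\d+)?)\.\s*\Z")


def extract_answer(response: str) -> Optional[str]:
    """Return the number in the templated suffix, or None if the suffix is absent."""
    match = ANSWER_SUFFIX.search(response)
    return match.group(1) if match else None


constrained = "We add 2 and 3 to get 5. The final answer is 5."
free_form = "Adding the two numbers gives 5, I think."

print(extract_answer(constrained))
print(extract_answer(free_form))
```

When the suffix is guaranteed by the decoding procedure, the `None` branch never fires and extraction reduces to a single anchored pattern match. Without that guarantee, a free-form response such as the second example needs a separate information extraction step.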

Paper Structure (27 sections, 1 theorem, 15 equations, 3 figures, 4 tables, 5 algorithms)

Key Result

Proposition 1

Let $\mathcal{G}$ be a grammar on vocabulary $V$. For any prefix ${\bm{y}} \in \operatorname{pref}({\to}\mathcal{G})$ of the suffix-constrained grammar, we have:

Figures (3)

  • Figure 1: Example of a derivation with a suffix-constrained grammar ${\to}\mathcal{G}$. The non-terminal R derives the reasoning sequence, whereas A derives the constrained answer.
  • Figure 2: Constrained hypothesis beam search with a greedy hypothesis (top) and a constrained hypothesis (bottom). At each generation step, the constrained hypothesis can either be kept or replaced with a new one. For example, after step 1, we choose to continue with the current constrained hypothesis (token ${a}_2$ is added after ${a}_1$). After step 2, however, the constrained hypothesis is replaced with a new one, whose first token is denoted ${a}'_1$. At step 4, the greedy hypothesis is ${\bm{y}} = \langle {y}_1, {y}_2, {y}_3, {y}_4 \rangle$, whereas the constrained one is ${\bm{r}} \odot \langle {a}'_1, {a}'_2\rangle$, where the reasoning part consists of the tokens of the greedy hypothesis before the replacement, that is, ${\bm{r}} = \langle {y}_1, {y}_2\rangle$.
  • Figure 3: Average min-entropy difference at each generation step. The difference is between the next-token distribution under the unconstrained hypothesis and that of the second token in a new constrained hypothesis, measured on OLMo 2 13B IT.
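
The bookkeeping described in the Figure 2 caption can be sketched in a few lines of Python. This is a toy illustration only: the `Beam` class and `step` function are hypothetical names, tokens are plain strings, and the decision to keep or replace the constrained hypothesis is passed in as a flag, because the criterion behind that decision is not given in this summary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Beam:
    """Two hypotheses tracked side by side, as in the Figure 2 caption."""
    greedy: List[str] = field(default_factory=list)     # y: unconstrained tokens
    reasoning: List[str] = field(default_factory=list)  # r: prefix copied from y
    answer: List[str] = field(default_factory=list)     # a: constrained answer tokens

    def constrained(self) -> List[str]:
        # The constrained hypothesis is the concatenation r ⊙ a.
        return self.reasoning + self.answer


def step(beam: Beam, greedy_token: str, answer_token: str, replace: bool) -> None:
    """Advance both hypotheses by one token.

    `replace` is an externally supplied decision (an assumption of this sketch).
    """
    if replace:
        # Start a new constrained hypothesis that branches off the greedy
        # hypothesis as it stood before this step.
        beam.reasoning = list(beam.greedy)
        beam.answer = [answer_token]
    else:
        beam.answer.append(answer_token)
    beam.greedy.append(greedy_token)


beam = Beam()
step(beam, "y1", "a1", replace=True)    # step 1: first constrained hypothesis
step(beam, "y2", "a2", replace=False)   # step 2: keep it, append a2 after a1
step(beam, "y3", "a'1", replace=True)   # step 3: replaced, so r = <y1, y2>
step(beam, "y4", "a'2", replace=False)  # step 4: keep the new one

print(beam.greedy)
print(beam.constrained())
```

After the four steps, `beam.greedy` holds the four greedy tokens and `beam.reasoning` holds the first two, matching ${\bm{r}} = \langle {y}_1, {y}_2\rangle$ in the caption. Token scoring and grammar checks are deliberately left out.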

Theorems & Definitions (5)

  • Definition 1: Prefix
  • Definition 2: Transition operator
  • Definition 3: Suffix-constrained grammar
  • Proposition 1
  • proof