Controllable Generation via Locally Constrained Resampling

Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

TL;DR

By disallowing a list of toxic expressions, the approach steers the model's outputs away from toxic generations, outperforming similar approaches to detoxification. It also achieves perfect accuracy on Sudoku.

Abstract

Autoregressive models have demonstrated an unprecedented ability to model the intricacies of natural language. However, they continue to struggle with generating complex outputs that adhere to logical constraints. Sampling from a fully independent distribution subject to a constraint is hard. Sampling from an autoregressive distribution subject to a constraint is doubly hard: we have to contend not only with the hardness of the constraint but also with the distribution's lack of structure. We propose a tractable probabilistic approach that performs Bayesian conditioning to draw samples subject to a constraint. Our approach considers the entire sequence, leading to a more globally optimal constrained generation than current greedy methods. Starting from a model sample, we induce a local, factorized distribution which we can tractably condition on the constraint. To generate samples that satisfy the constraint, we sample from the conditional distribution, correct for biases in the samples, and resample. The resulting samples closely approximate the target distribution and are guaranteed to satisfy the constraints. We evaluate our approach on several tasks, including LLM detoxification and solving Sudoku puzzles. We show that by disallowing a list of toxic expressions, our approach is able to steer the model's outputs away from toxic generations, outperforming similar approaches to detoxification. We conclude by showing that our approach achieves perfect accuracy on Sudoku, compared to less than 50% for GPT-4o and Gemini 1.5.
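The sample, condition, reweight, and resample loop described in the abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the bigram `model_prob` stands in for an LLM, the per-position `marginals` stand in for the local factorized distribution, conditioning is done by brute-force enumeration rather than a compiled constraint circuit, and the weight is the standard importance ratio, which may differ from the paper's exact correction. All names are hypothetical.

```python
import itertools
import math
import random

VOCAB = ["dogs", "loves", "hates"]  # toy vocabulary (hypothetical)
LENGTH = 3                          # toy sequence length (hypothetical)


def model_prob(seq):
    """Toy bigram model standing in for an autoregressive LLM (assumption)."""
    first = {"dogs": 0.5, "loves": 0.2, "hates": 0.3}
    trans = {
        "dogs": {"dogs": 0.1, "loves": 0.4, "hates": 0.5},
        "loves": {"dogs": 0.8, "loves": 0.1, "hates": 0.1},
        "hates": {"dogs": 0.7, "loves": 0.1, "hates": 0.2},
    }
    p = first[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p


def satisfies(seq):
    """Constraint alpha: the token 'hates' never appears."""
    return "hates" not in seq


def proposal_prob(seq, marginals):
    """Fully factorized probability: product of per-position marginals."""
    return math.prod(marginals[i][tok] for i, tok in enumerate(seq))


def constrained_resample(marginals, num_samples, rng):
    # Condition the factorized proposal on alpha. Enumeration is only
    # feasible at toy sizes; it replaces the constraint circuit here.
    support = [s for s in itertools.product(VOCAB, repeat=LENGTH) if satisfies(s)]
    unnorm = [proposal_prob(s, marginals) for s in support]
    z = sum(unnorm)
    q = [u / z for u in unnorm]

    # Draw candidates from the conditioned proposal.
    idx = rng.choices(range(len(support)), weights=q, k=num_samples)

    # Importance weights correct for the mismatch between model and proposal.
    weights = [model_prob(support[i]) / q[i] for i in idx]

    # Resample one candidate in proportion to its weight.
    chosen = rng.choices(idx, weights=weights, k=1)[0]
    return support[chosen]


rng = random.Random(0)
uniform = [{tok: 1.0 / len(VOCAB) for tok in VOCAB} for _ in range(LENGTH)]
print(constrained_resample(uniform, num_samples=16, rng=rng))
```

Every candidate is drawn from a distribution whose support contains only constraint-satisfying sequences, so the returned sequence satisfies the constraint regardless of the weights; the weights only affect how closely the output tracks the model's distribution.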

Paper Structure

This paper contains 21 sections, 2 theorems, 12 equations, 2 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Locally Constrained Resampling (the paper's algorithm labeled algo:conditional) returns a sample $\boldsymbol{y}$ such that $\boldsymbol{y} \models \alpha$.
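One informal way to see why a guarantee of this kind can hold follows from the abstract's description alone. The sketch below is not the paper's proof; the sample count $N$ and the superscript indexing are notation introduced here for illustration. If every candidate is drawn from the proposal conditioned on $\alpha$, and the final output is chosen among those candidates, then:

$$
\boldsymbol{y}^{(1)}, \dots, \boldsymbol{y}^{(N)} \sim p_{\Tilde{\boldsymbol{y}}}(\cdot \mid \alpha)
\;\Longrightarrow\;
\boldsymbol{y}^{(k)} \models \alpha \ \text{ for all } k,
\qquad
\boldsymbol{y} \in \{\boldsymbol{y}^{(1)}, \dots, \boldsymbol{y}^{(N)}\}
\;\Longrightarrow\;
\boldsymbol{y} \models \alpha .
$$

The importance weights used in resampling change which candidate is selected, but not the set it is selected from.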

Figures (2)

  • Figure 1: An illustration of our proposed approach. (left) An LLM induces a distribution over all possible sentences. Autoregressively sampling from the LLM distribution, we obtain a sentence $\Tilde{\boldsymbol{y}} =$ [He's, ␣ full, ␣ of, ␣ sh!t]. This sentence $\Tilde{\boldsymbol{y}}$ violates a constraint $\alpha$ that disallows toxic words, including the word "sh!t". The subset of sentences that satisfy the constraint $\alpha$ are denoted by $\vdash m_\alpha\dashv$. (center) The sentence $\Tilde{\boldsymbol{y}}$ induces a local, tractable approximation of the true distribution centered around $\Tilde{\boldsymbol{y}}$. (right) We can efficiently condition this tractable approximation on the constraint $\alpha$, trimming away portions of its support that do not satisfy the constraint. Sampling from the LLM distribution subject to the constraint $\alpha$ then corresponds to sampling from the conditional approximate distribution and adjusting the sample weights using importance weighting. This yields a sentence $\boldsymbol{y} =$ [He's, ␣ full, time, ␣ employed] that satisfies $\alpha$.
  • Figure 2: Constructing and sampling the proposal distribution. (Top left) We start by sampling a sentence $\Tilde{\boldsymbol{y}}$ from the model $p$. Our goal is to compute the full conditional probability of every word in the vocabulary, i.e., $\Tilde{p}(\Tilde{y}_{ij}) \coloneqq p(\Tilde{y}_{ij} \mid \Tilde{\boldsymbol{y}}_{-i})$. We start by expanding the sampled sentence $\Tilde{\boldsymbol{y}}$, including all sentences that are a Hamming distance of $1$ away from the sample $\Tilde{\boldsymbol{y}}$. We proceed by (batch) evaluating the samples through the model, obtaining the joint probability of each sample. We then normalize along each column, obtaining the conditionals $\Tilde{p}(\Tilde{y}_{ij})$. (Bottom left) We can easily specify a logical constraint that prevents the word " hate" from appearing as a simple Python function that gets compiled into the constraint circuit on the right. (Center) A logical circuit encoding the constraint $\lnot\text{hates}$, with a simplified vocabulary, is shown in the figure. To construct the distribution $p_{\Tilde{\boldsymbol{y}}}(\boldsymbol{y}\mid\alpha)$, we feed the computed contextual probabilities at the corresponding literals. We push the probabilities upwards, taking products at AND nodes and sums at OR nodes. This induces a distribution $p_{\Tilde{\boldsymbol{y}}}(\boldsymbol{y}\mid\alpha)$. (Right) To sample from this distribution, we start at the root of the circuit, sampling a child of every OR gate according to the logits of the distribution, and concatenating at every AND gate. In this case, we sample the sentence "He loves dogs", satisfying the constraint.
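The Hamming-distance-1 expansion in the Figure 2 caption can be sketched as follows. This is a minimal illustration, assuming a toy bigram `joint_prob` in place of an LLM and evaluating neighbors one at a time rather than in a batch; the function and variable names are hypothetical and do not come from the paper's code.

```python
VOCAB = ["dogs", "loves", "hates"]  # toy vocabulary (hypothetical)


def joint_prob(seq):
    """Toy bigram model standing in for the LLM's joint probability (assumption)."""
    first = {"dogs": 0.5, "loves": 0.2, "hates": 0.3}
    trans = {
        "dogs": {"dogs": 0.1, "loves": 0.4, "hates": 0.5},
        "loves": {"dogs": 0.8, "loves": 0.1, "hates": 0.1},
        "hates": {"dogs": 0.7, "loves": 0.1, "hates": 0.2},
    }
    p = first[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p


def local_conditionals(sample, vocab, joint):
    """For each position i, return p(y_i = tok | rest of the sample) for every tok."""
    sample = tuple(sample)
    table = []
    for i in range(len(sample)):
        # Sentences that agree with the sample everywhere except possibly at
        # position i: the sample itself plus its Hamming-distance-1 neighbors.
        neighbors = [sample[:i] + (tok,) + sample[i + 1:] for tok in vocab]
        scores = [joint(n) for n in neighbors]
        # Normalizing the joint over position i yields the full conditional.
        total = sum(scores)
        table.append({tok: s / total for tok, s in zip(vocab, scores)})
    return table


sample = ("dogs", "hates", "dogs")
for position, row in enumerate(local_conditionals(sample, VOCAB, joint_prob)):
    print(position, {tok: round(p, 3) for tok, p in row.items()})
```

Each row of the resulting table is a distribution over the vocabulary for one position. Taken together, the rows define a fully factorized distribution centered on the sample, which is the kind of object the caption describes feeding into the literals of the constraint circuit.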

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 1
  • proof