Controlled LLM Decoding via Discrete Auto-regressive Biasing

Patrick Pynadath, Ruqi Zhang

TL;DR

This work introduces Discrete Auto-Regressive Biasing (DAB), a decoding framework that performs gradient-guided biasing entirely in the discrete token space by modeling a joint distribution over generated sequences $Y$ and bias tokens $B$. By alternating between sampling $B|X,Y$ via a discrete Langevin step and generating $Y|X,B$ through biased autoregression, DAB achieves a better balance between constraint satisfaction and fluency, at lower decoding cost, than prior energy-based methods. Across sentiment control, toxicity avoidance, and keyword-guided generation, DAB yields stronger constraint satisfaction while keeping fluency metrics close to or better than those of the baselines, and it decodes about 2x faster because its gradient computations are simpler. The approach offers a flexible, inference-time mechanism for steering LLM outputs to meet external constraints without fine-tuning, with implications for safety, controllability, and efficiency in real-world text generation.

Abstract

Controlled text generation allows for enforcing user-defined constraints on large language model outputs, an increasingly important field as LLMs become more prevalent in everyday life. One common approach uses energy-based decoding, which defines a target distribution through an energy function that combines multiple constraints into a weighted average. However, these methods often struggle to balance fluency with constraint satisfaction, even with extensive tuning of the energy function's coefficients. In this paper, we identify that this suboptimal balance arises from sampling in continuous space rather than the natural discrete space of text tokens. To address this, we propose Discrete Auto-regressive Biasing, a controlled decoding algorithm that leverages gradients while operating entirely in the discrete text domain. Specifically, we introduce a new formulation for controlled text generation by defining a joint distribution over the generated sequence and an auxiliary bias sequence. To efficiently sample from this joint distribution, we propose a Langevin-within-Gibbs sampling algorithm using gradient-based discrete MCMC. Our method significantly improves constraint satisfaction while maintaining comparable or better fluency, all with even lower computational costs. We demonstrate the advantages of our controlled decoding method on sentiment control, language detoxification, and keyword-guided generation.
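
As a hedged illustration of the energy-based setup described above, a weighted combination of $k$ constraint functions and the target distribution it induces can be written as follows. The symbols $E$, $f_i$, $\lambda_i$, and $k$ are notation assumed here for illustration; they are not taken from the abstract.

$$E(Y) = \sum_{i=1}^{k} \lambda_i f_i(Y), \qquad p(Y) \propto \exp\big(-E(Y)\big)$$

Under this reading, the coefficients $\lambda_i$ are the quantities that need tuning, and sequences with lower energy receive higher probability.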

Paper Structure

This paper contains 71 sections, 18 equations, 4 figures, 9 tables, 1 algorithm.

Figures (4)

  • Figure 1: Visualization of our proposed controlled decoding algorithm, Discrete Auto-Regressive Biasing (DAB). Given an initial response that fails to satisfy some external constraint, DAB steers auto-regressive generation towards satisfactory generations using discrete bias tokens obtained via gradient-based discrete sampling from the constraint function.
  • Figure 2: Visualization of the proposed decoding algorithm, DAB. DAB alternates between sampling the response $Y$ and the bias $B$. To sample $B$ given $Y$, we use gradient-based discrete sampling on the constraint function $f$. To sample $Y$ given $B$, we compute a bias vector that penalizes words based on their distance to $B$ and then use this bias to guide the auto-regressive generation.
  • Figure 3: (a) Average hops, or token updates per sequence, against sampling steps. Both versions of BOLT suffer from decreasing hops while DAB remains stable. (b) Average number of unique tokens sampled for each sequence position throughout the entire sampling process. DAB discovers many more unique tokens for each position than either variant of BOLT. (c) Comparison of fluency with respect to sampling steps. DAB exhibits stable fluency over sampling steps in comparison to BOLT.
  • Figure 4: (a) Ablation over different weight values. Higher values increase control at the cost of fluency, reflecting the tradeoff between the two attributes. (b) Ablation over DLP proposal temperatures. Higher temperatures correspond to a flatter proposal distribution that favors exploration over exploitation, resulting in decreased control. (c) Ablation over top-k values. There is some optimal value that limits the search space sufficiently to enable effective exploration.
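
The $Y$-given-$B$ step described in the Figure 2 caption (penalize words by their distance to the bias token, then use that bias during auto-regressive generation) can be sketched on a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the five-word vocabulary, the 2-D embeddings, the logits, the squared Euclidean distance, and the helper names `bias_vector` and `biased_next_token_probs` are all hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bias_vector(embeddings, bias_token, weight):
    """Penalize each vocabulary item by its distance to the bias token.

    Assumption: squared Euclidean distance in embedding space, scaled by
    `weight`. The caption only says "distance", so this is one possible choice.
    """
    target = embeddings[bias_token]
    return [
        -weight * sum((a - b) ** 2 for a, b in zip(vec, target))
        for vec in embeddings
    ]

def biased_next_token_probs(logits, embeddings, bias_token, weight):
    """Add the bias vector to the model logits and renormalize."""
    bias = bias_vector(embeddings, bias_token, weight)
    return softmax([l + b for l, b in zip(logits, bias)])

# Hypothetical toy setup: a 5-word vocabulary laid out along a sentiment axis.
VOCAB = ["terrible", "bad", "okay", "good", "great"]
EMBEDDINGS = [[-2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
LOGITS = [1.0, 1.5, 0.5, 0.2, 0.0]  # stand-in for the language model's logits

unbiased = softmax(LOGITS)
biased = biased_next_token_probs(
    LOGITS, EMBEDDINGS, VOCAB.index("great"), weight=1.0
)

for word, p, q in zip(VOCAB, unbiased, biased):
    print(f"{word:>8}: unbiased={p:.3f}  biased={q:.3f}")
```

In this toy setup the `weight` argument plays the role of the weight ablated in Figure 4(a): larger values pull probability mass more strongly toward tokens near the bias token, and a value of 0 recovers the unbiased distribution.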