Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation

Hye Ryung Son, Jay-Yoon Lee

TL;DR

Locate&Edit (L&E) introduces an energy-based, text-editing approach to controlled text generation that operates solely on base LM outputs, enabling compatibility with black-box LMs. The method locates constraint-relevant spans and replaces them using an off-the-shelf masked language model, with final ranking guided by learned energy functions. Training energy models as regressors with continuous labels improves constraint satisfaction and timing, and a beam-search-based reranking strategy achieves strong performance with practical decoding speed. Across toxicity avoidance, sentiment control, and formality transfer, L&E consistently preserves base LM content while meeting constraints, highlighting its practical impact for safe and controllable generation in real-world deployments.

Abstract

Recent approaches to controlled text generation (CTG) often involve manipulating the weights or logits of base language models (LMs) at decoding time. However, these methods are inapplicable to the latest black-box LMs and ineffective at preserving the core semantics of the base LM's original generations. In this work, we propose Locate&Edit (L&E), an efficient and flexible energy-based approach to CTG, which edits text outputs from a base LM using off-the-shelf energy models. Given text outputs from the base LM, L&E first uses energy models to locate the spans most relevant to the constraints (e.g., toxicity), and then edits these spans by replacing them with more suitable alternatives. Importantly, our method is compatible with black-box LMs, as it requires only their text outputs. Also, since L&E does not mandate a specific architecture for its component models, it can work with diverse combinations of available off-the-shelf models. Moreover, L&E preserves the base LM's original generations by selectively modifying constraint-related aspects of the texts and leaving the rest unchanged. These targeted edits also ensure that L&E operates efficiently. Our experiments confirm that L&E achieves superior speed and semantic preservation of the base LM generations, while simultaneously obtaining competitive or improved constraint satisfaction. Furthermore, we analyze how the granularity of the energy distribution impacts CTG performance and find that fine-grained, regression-based energy models improve constraint satisfaction compared to conventional binary classifier energy models.
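
The locate, edit, and rerank steps described above can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the lexicon `TOXIC_WEIGHTS` stands in for a learned energy model, the table `CANDIDATES` stands in for masked-LM proposals, and the leave-one-out scoring in `locate` is one simple way to find constraint-relevant tokens, not necessarily the locating procedure used in the paper.

```python
from itertools import product

# Hypothetical toy lexicon standing in for a learned energy model.
TOXIC_WEIGHTS = {"stupid": 0.9, "idiot": 0.95, "hate": 0.7}

# Hypothetical substitution table standing in for masked-LM proposals.
CANDIDATES = {
    "stupid": ["unwise", "odd"],
    "idiot": ["person", "fool"],
    "hate": ["dislike", "avoid"],
}


def energy(tokens):
    """Toy energy: summed per-token weights (lower means less toxic)."""
    return sum(TOXIC_WEIGHTS.get(t.lower(), 0.0) for t in tokens)


def locate(tokens, max_spans=2, eps=1e-9):
    """Pick up to max_spans token indices whose removal lowers energy most."""
    base = energy(tokens)
    drops = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        drops.append((base - energy(reduced), i))
    drops.sort(reverse=True)
    return sorted(i for drop, i in drops[:max_spans] if drop > eps)


def edit(tokens, positions):
    """Replace only the located tokens, then keep the lowest-energy rewrite."""
    options = [CANDIDATES.get(tokens[i].lower(), [tokens[i]]) for i in positions]
    rewrites = []
    for combo in product(*options):
        new_tokens = list(tokens)
        for pos, replacement in zip(positions, combo):
            new_tokens[pos] = replacement
        rewrites.append(new_tokens)
    # Rerank: min() returns the first rewrite with the lowest energy.
    return min(rewrites, key=energy)


original = "I hate that stupid plan".split()
positions = locate(original)
edited = edit(original, positions)
print(positions, " ".join(edited))
```

Because only the located positions are rewritten, every other token of the base LM output is carried over unchanged, which is the property the abstract refers to as preserving the original generation. The sketch only ever reads the output text, so nothing in it depends on access to the base LM's weights or logits.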

Paper Structure (46 sections, 6 equations, 2 figures, 21 tables, 1 algorithm)

Figures (2)

  • Figure 1: Illustration of Locate&Edit (L&E). The text generated by an unconstrained LM is refined by locating relevant spans, generating candidate replacements, and reranking. L&E can control black-box LMs because it requires only their text outputs. Furthermore, the individual components of L&E, such as the energy models ($f_i$) and the MLM, are trained in isolation, independently of the base LM and of each other, allowing off-the-shelf models to be plugged in.
  • Figure 2: Ground truth label distributions of the energy model training data. Note that the Yelp reviews dataset used for training the sentiment energy model has labels that are essentially discrete rather than continuous.
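
To make the distinction between continuous and essentially discrete labels concrete, the following toy sketch compares how many pairs of training examples remain distinguishable under each labeling. The scores and the 0.5 threshold are made-up values for illustration; this is not the paper's training procedure, only a picture of what label granularity means.

```python
# Hypothetical continuous constraint scores in [0, 1] for four training texts.
scores = [0.05, 0.45, 0.55, 0.95]


def binary_label(score, threshold=0.5):
    """Collapse a continuous score into a binary class label."""
    return 1.0 if score >= threshold else 0.0


def distinguishable_pairs(labels):
    """Count example pairs whose labels differ, i.e. pairs the labels can rank."""
    n = len(labels)
    return sum(
        1 for i in range(n) for j in range(i + 1, n) if labels[i] != labels[j]
    )


regression_targets = scores
binary_targets = [binary_label(s) for s in scores]

print(distinguishable_pairs(regression_targets))
print(distinguishable_pairs(binary_targets))
```

With binary targets, texts scored 0.05 and 0.45 receive the same label, so the labels alone give a model no signal for ranking one above the other. Continuous regression targets keep that ordering, which is the kind of fine-grained supervision the abstract associates with improved constraint satisfaction.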