Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding

Lifu Tu; Semih Yavuz; Jin Qu; Jiacheng Xu; Rui Meng; Caiming Xiong; Yingbo Zhou

TL;DR

This work formalizes text generation as a future-constrained generation problem, with the aim of minimizing undesirable behaviors and enforcing faithfulness to instructions, and demonstrates the effectiveness of the approach on three distinct text generation tasks.

Abstract

Large Language Models (LLMs) have demonstrated a powerful ability for text generation. However, achieving optimal results with a given prompt or instruction can be challenging, especially for billion-sized models. Additionally, undesired behaviors such as toxicity or hallucinations can manifest. While much larger models (e.g., ChatGPT) may demonstrate strength in mitigating these issues, there is still no guarantee of complete prevention. In this work, we propose formalizing text generation as a future-constrained generation problem to minimize undesirable behaviors and enforce faithfulness to instructions. The estimation of future constraint satisfaction, accomplished using LLMs, guides the text generation process. Our extensive experiments demonstrate the effectiveness of the proposed approach across three distinct text generation tasks: keyword-constrained generation (Lin et al., 2020), toxicity reduction (Gehman et al., 2020), and factual correctness in question-answering (Gao et al., 2023).

Paper Structure (37 sections, 5 equations, 8 figures, 11 tables)

Introduction
Method
Estimation of Future Constraint Satisfaction
Inference
Experiments
Keyword-constrained Generation
Lexical-Constraint Satisfaction Evaluation.
Hyperparameter Selection.
Results.
Comparison with NeuroLogic-A*.
Toxicity Reduction
Toxicity-Constraint Satisfaction Evaluation
Results.
Factual Question Answering
Baselines.
...and 22 more sections

Figures (8)

Figure 1: An illustration of the proposed approach utilizing future constraint satisfaction to guide generation. In this example, although "summer" is a more likely next token, generating it will lead to a lower score in the future constraint, which includes the keyword "snow". Our method incorporates future constraint satisfaction, making "winter" a more preferable choice.
Figure 2: Accuracy of the estimation of lexical constraint satisfaction with different models. For the NLI-based model, the non-entailment probability is used for ranking.
Figure 3: Performance (y-axis) of Falcon-7B-Instruct in terms of BLEU-4 score and constraint coverage with different $\lambda$ (x-axis) on the CommonGen development set.
Figure 4: Speed (inference time per example) and performance (Coverage score) of different decoding methods, all with batch size 1 and beam size 5. Falcon-7B-Instruct is used in this experiment, run on a single A100 GPU with 40 GB of memory.
Figure 5: Accuracy of the estimation of constraint satisfaction with different pretrained LLMs.
...and 3 more figures
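
The idea illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal toy example, not the authors' implementation: the additive combination of log-probability and a weighted future-constraint score is an assumption suggested by the $\lambda$ in Figure 3, the function name `rescore` is hypothetical, and all numbers are invented for illustration.

```python
import math


def rescore(candidates, future_score, lam):
    """Rank candidate next tokens by log-probability plus lam times a
    future-constraint score (assumed additive form, for illustration)."""
    scored = [
        (token, math.log(prob) + lam * future_score(token))
        for token, prob in candidates.items()
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented next-token probabilities: "summer" is the more likely token.
candidates = {"summer": 0.6, "winter": 0.3}

# Invented future-constraint scores for a constraint containing the keyword
# "snow": continuing with "winter" is assumed more likely to satisfy it.
toy_future = {"summer": -2.0, "winter": -0.2}

# lam = 0 ignores the constraint; lam = 1 weights it equally with likelihood.
best_plain = rescore(candidates, toy_future.get, lam=0.0)[0][0]
best_guided = rescore(candidates, toy_future.get, lam=1.0)[0][0]
```

With these toy values, ranking by likelihood alone selects "summer", while adding the future-constraint term selects "winter". In a real decoder the future score would come from a model-based estimate rather than a lookup table.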
