Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs

Xun Liang, Hanyu Wang, Shichao Song, Mengting Hu, Xunzhi Wang, Zhiyu Li, Feiyu Xiong, Bo Tang

TL;DR

DATG tackles controlled text generation (CTG) for large language models by introducing Dynamic Attribute Graphs that identify and influence a small set of attribute-related tokens in semantic space. The framework follows a four-stage pipeline (Contextual Corpus Construction, Attribute Classifier Scoring, Dynamic Attribute Graphs Construction, and Regeneration with Dynamic Boundary Controlling) to move outputs toward target attributes with minimal disruption to fluency, achieving up to a $19.29\%$ improvement in toxicity control. Across two tasks (toxicity mitigation and sentiment transformation) and five base LLMs, DATG demonstrates superior control accuracy and competitive or improved perplexity, thanks to graph-based token prioritization and two targeted adjustments: Logits-Boost, applied to the logits, and Prefix-Prompt, applied to the prompt. The results highlight the practical potential of plug-and-play graph models for CTG, offering a flexible alternative to retraining or heavy decoding-time intervention, and suggest that preconstructed attribute graphs could speed up generation. Overall, DATG advances CTG by combining graph-based analysis with LLM decoding to balance attribute control, content fidelity, and fluency.

Abstract

Controlled Text Generation (CTG) aims to produce texts that exhibit specific desired attributes. In this study, we introduce a pluggable CTG framework for Large Language Models (LLMs) named Dynamic Attribute Graphs-based controlled text generation (DATG). This framework utilizes an attribute scorer to evaluate the attributes of sentences generated by LLMs and constructs dynamic attribute graphs. DATG modulates the occurrence of key attribute words and key anti-attribute words, achieving effective attribute control without compromising the original capabilities of the model. We conduct experiments across four datasets in two tasks: toxicity mitigation and sentiment transformation, employing five LLMs as foundational models. Our findings highlight a remarkable enhancement in control accuracy, achieving a peak improvement of 19.29% over baseline methods in the most favorable task across four datasets. Additionally, we observe a significant decrease in perplexity, markedly improving text fluency.

Paper Structure (45 sections, 13 equations, 4 figures, 10 tables)

Figures (4)

  • Figure 1: Illustration of the impact of key words on text attributes within the semantic space.
  • Figure 2: DATG unfolds in four stages: (1) Contextual Corpus Construction, using LLMs to generate text sequences from specified prompts; (2) Attribute Classifier Scoring, employing classifiers to evaluate texts against target attributes; (3) Dynamic Attribute Graphs Construction, forming attribute graphs based on classifier-informed token linkages, which capture the texts' compliance with and divergence from the target attribute in semantic space; (4) Regeneration with Dynamic Boundary Controlling, applying graph ranking to identify and adjust key nodes, guiding text toward the desired attribute boundary via Logits-Boost and Prefix-Prompt strategies.
  • Figure 3: Generation speed on the toxicity task, measured in seconds per item (s/item) on 2x Nvidia A100 GPUs.
  • Figure 4: Generation speed on the sentiment task, measured in seconds per item (s/item) on 2x Nvidia A100 GPUs.
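
The stages summarized for Figure 2 can be illustrated with a toy sketch. It is not the authors' implementation, and the page above does not specify these details, so everything here is an assumption made for illustration: the whitespace tokenizer, the edge weights accumulated from adjacent tokens, the use of PageRank as the graph ranking, the additive `alpha` bias standing in for Logits-Boost, and all function names. The classifier scores are hard-coded stand-ins for a real attribute classifier, and the Prefix-Prompt strategy is not shown.

```python
from collections import defaultdict


def build_graph(scored_texts, weight_fn):
    """Directed graph over adjacent tokens; each edge accumulates weight_fn(score)."""
    graph = defaultdict(lambda: defaultdict(float))
    for text, score in scored_texts:
        tokens = text.lower().split()  # assumption: naive whitespace tokenizer
        for src, dst in zip(tokens, tokens[1:]):
            graph[src][dst] += weight_fn(score)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumed ranking method)."""
    nodes = set(graph) | {dst for nbrs in graph.values() for dst in nbrs}
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for src in nodes:
            out_edges = graph.get(src, {})
            total = sum(out_edges.values())
            if total == 0.0:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[src] / n
                for node in nodes:
                    new_rank[node] += share
            else:
                for dst, weight in out_edges.items():
                    new_rank[dst] += damping * rank[src] * weight / total
        rank = new_rank
    return rank


def boost_logits(logits, boost_words, suppress_words, alpha=2.0):
    """Add alpha to boosted tokens and subtract alpha from suppressed tokens."""
    adjusted = dict(logits)
    for word in boost_words:
        if word in adjusted:
            adjusted[word] += alpha
    for word in suppress_words:
        if word in adjusted:
            adjusted[word] -= alpha
    return adjusted


# Stages 1-2 (stand-ins): generated texts with hypothetical classifier scores,
# where a score near 1.0 means the text matches the target (non-toxic) attribute.
scored_texts = [
    ("you are a kind person", 0.9),
    ("you are a stupid person", 0.1),
    ("what a kind gesture", 0.95),
]

# Stage 3: one graph weighted by attribute compliance, one by divergence.
pos_rank = pagerank(build_graph(scored_texts, lambda s: s))
neg_rank = pagerank(build_graph(scored_texts, lambda s: 1.0 - s))

# Stage 4: bias hypothetical next-token logits toward the target attribute.
logits = {"kind": 1.0, "stupid": 1.0, "the": 0.5}
adjusted = boost_logits(logits, boost_words=["kind"], suppress_words=["stupid"])
```

In this toy corpus, "kind" outranks "stupid" in the compliance graph and the order reverses in the divergence graph, which is the signal the regeneration step would act on. A naive ranking like this also scores function words such as "a" highly, so a practical version would need some filtering before choosing which words to boost or suppress.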