Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
Xun Liang, Hanyu Wang, Shichao Song, Mengting Hu, Xunzhi Wang, Zhiyu Li, Feiyu Xiong, Bo Tang
TL;DR
DATG addresses controlled text generation (CTG) for large language models by introducing Dynamic Attribute Graphs, which identify a small set of attribute-related tokens and adjust how often they occur. The framework follows a four-stage pipeline: Contextual Corpus Construction, Attribute Classifier Scoring, Dynamic Attribute Graphs Construction, and Regeneration with Dynamic Boundary Controlling. Together these stages steer outputs toward the target attribute with minimal disruption to fluent generation, yielding up to a 19.29% improvement in control accuracy over baseline methods. Across two tasks (toxicity mitigation and sentiment transformation) and five base LLMs, DATG reports higher control accuracy and competitive or improved perplexity, which the authors attribute to graph-based token prioritization combined with targeted logits and prompt adjustments (Logits-Boost and Prefix-Prompt). The results point to the practical potential of plug-and-play graph models for CTG as a flexible alternative to retraining or heavy decoding-time intervention, and suggest that preconstructed attribute graphs could speed up the method. Overall, DATG combines graph-based analysis with LLM decoding to balance attribute control, content fidelity, and fluency.
Abstract
Controlled Text Generation (CTG) aims to produce texts that exhibit specific desired attributes. In this study, we introduce a pluggable CTG framework for Large Language Models (LLMs) named Dynamic Attribute Graphs-based controlled text generation (DATG). This framework uses an attribute scorer to evaluate the attributes of sentences generated by LLMs and constructs dynamic attribute graphs from the scored sentences. DATG modulates the occurrence of key attribute words and key anti-attribute words, achieving effective attribute control without compromising the original capabilities of the model. We conduct experiments on four datasets covering two tasks, toxicity mitigation and sentiment transformation, with five LLMs as foundation models. Our findings show a marked improvement in control accuracy, with a peak improvement of 19.29% over baseline methods on the most favorable task. Additionally, we observe a significant decrease in perplexity, indicating improved text fluency.
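To make the idea of modulating key attribute words and key anti-attribute words concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes whitespace tokenization, a tiny hypothetical stopword list, word-adjacency edges weighted by the attribute score of each sentence (with 1 - score for the anti-attribute graph), weighted degree as a simple stand-in for the node ranking, and a fixed additive logit offset `delta`. All function names, example sentences, scores, and logits are illustrative assumptions.

```python
from collections import defaultdict

# Tiny hypothetical stopword list, used only to keep the toy graphs readable.
STOPWORDS = {"a", "are", "what", "you"}


def build_attribute_graph(sentences, scores):
    """Build a directed word-adjacency graph.

    Each edge (w_i -> w_{i+1}) accumulates the attribute score of every
    sentence in which the two words appear next to each other.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for sentence, score in zip(sentences, scores):
        tokens = [t for t in sentence.lower().split() if t not in STOPWORDS]
        for src, dst in zip(tokens, tokens[1:]):
            graph[src][dst] += score
    return graph


def rank_nodes(graph, top_k=3):
    """Return the top_k words by weighted degree (in-weight plus out-weight).

    Weighted degree is an assumption made for brevity; any graph
    centrality measure could be substituted here.
    """
    weight = defaultdict(float)
    for src, neighbours in graph.items():
        for dst, w in neighbours.items():
            weight[src] += w
            weight[dst] += w
    ranked = sorted(weight, key=lambda t: (-weight[t], t))
    return ranked[:top_k]


def adjust_logits(logits, boost_words, suppress_words, delta=2.0):
    """Add delta to logits of boost_words and subtract it for suppress_words."""
    adjusted = dict(logits)
    for word in boost_words:
        if word in adjusted:
            adjusted[word] += delta
    for word in suppress_words:
        if word in adjusted:
            adjusted[word] -= delta
    return adjusted


# Hypothetical generated sentences with scores from an attribute scorer
# (here: higher score = closer to the target attribute "non-toxic").
sentences = [
    "you are a kind person",
    "you are a stupid idiot",
    "what a kind gesture",
]
scores = [0.9, 0.1, 0.8]

pos_graph = build_attribute_graph(sentences, scores)
neg_graph = build_attribute_graph(sentences, [1 - s for s in scores])

pos_words = rank_nodes(pos_graph, top_k=1)  # key attribute words
neg_words = rank_nodes(neg_graph, top_k=2)  # key anti-attribute words

# Hypothetical next-token logits over a four-word vocabulary.
logits = {"kind": 1.0, "stupid": 1.5, "idiot": 0.5, "person": 0.0}
adjusted = adjust_logits(logits, pos_words, neg_words, delta=2.0)

print("boost:", pos_words, "suppress:", neg_words)
print("top token before:", max(logits, key=logits.get))
print("top token after:", max(adjusted, key=adjusted.get))
```

In this toy setup the highest-scoring next token shifts from an anti-attribute word to an attribute word, while tokens outside both word lists keep their original logits. That is the sense in which such an adjustment can steer generation without retraining the model.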
