Reinforcement Learning with Token-level Feedback for Controllable Text Generation

Wendi Li, Wei Wei, Kaihe Xu, Wenfeng Xie, Dangyang Chen, Yu Cheng

TL;DR

The paper tackles controllable text generation (CTG) by replacing coarse sentence-level reinforcement learning signals with token-level rewards. It introduces TOLE, a token-level RL framework that derives rewards from the probability shifts of attribute classifiers via Bayesian factorization and uses a "first quantize, then noise" strategy to make learning more robust. A lightweight weigher enables multi-attribute control without heavy computation, so TOLE handles both single- and multi-attribute tasks efficiently. Across sentiment control, detoxification, and multi-attribute settings, TOLE achieves better attribute control and fluency than the baselines. The authors acknowledge remaining gaps toward perfect controllability and broader model generalization.

Abstract

To meet the requirements of real-world applications, it is essential to control the generations of large language models (LLMs). Prior research has tried to introduce reinforcement learning (RL) into controllable text generation, since most existing methods suffer from overfitting (finetuning-based methods) or semantic collapse (post-processing methods). However, current RL methods are generally guided by coarse-grained (sentence- or paragraph-level) feedback, which may lead to suboptimal performance owing to semantic twists or progressions within sentences. To tackle this, we propose a novel reinforcement learning algorithm named TOLE, which formulates TOken-LEvel rewards for controllable text generation and employs a "first-quantize-then-noise" paradigm to enhance the robustness of the RL algorithm. Furthermore, TOLE can be flexibly extended to multiple constraints with little computational expense. Experimental results show that our algorithm achieves superior performance on both single-attribute and multi-attribute control tasks. We have released our code at https://github.com/WindyLee0822/CTG
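
The token-level reward idea summarized above can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, and the equal-frequency binning and bin-centre mapping are assumptions made for illustration, since this summary does not specify the exact quantization or noise schedule. The sketch assumes an attribute classifier has already produced a target-attribute probability for each prefix of the generated text. It treats the change in that probability when a token is appended as the token's reward, quantizes the rewards by rank, and then perturbs them with Gaussian noise.

```python
import random


def token_rewards(prefix_probs):
    """Per-token reward as the shift in classifier probability.

    prefix_probs[0] is the (assumed) classifier probability for the prompt
    alone; prefix_probs[t] is the probability after t generated tokens.
    """
    return [prefix_probs[t] - prefix_probs[t - 1]
            for t in range(1, len(prefix_probs))]


def quantize(rewards, num_bins):
    """Map each reward to an equal-frequency bin index in [0, num_bins - 1].

    Assumption: bins are assigned by rank, so only the ordering of the raw
    rewards matters, not their magnitudes.
    """
    order = sorted(range(len(rewards)), key=lambda i: rewards[i])
    bins = [0] * len(rewards)
    for rank, i in enumerate(order):
        bins[i] = rank * num_bins // len(rewards)
    return bins


def add_noise(bins, num_bins, scale, rng):
    """Map each bin to its centre in [0, 1], then add Gaussian noise."""
    return [(b + 0.5) / num_bins + rng.gauss(0.0, scale) for b in bins]


if __name__ == "__main__":
    # Hypothetical classifier probabilities for a prompt plus four tokens.
    probs = [0.5, 0.6, 0.4, 0.9, 0.95]
    raw = token_rewards(probs)
    binned = quantize(raw, num_bins=2)
    noisy = add_noise(binned, num_bins=2, scale=0.05, rng=random.Random(0))
    print(raw, binned, noisy)
```

In this toy input, the third token produces the largest probability shift and the second token a negative one, so a sentence-level score would hide exactly the distinction that the token-level signal exposes.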

Paper Structure (25 sections, 8 equations, 6 figures, 10 tables)


Figures (6)

  • Figure 1: Overall Framework of our algorithm.
  • Figure 2: Performance of sentiment control with respect to training steps. "none" denotes the variant with neither "quantize" nor "noise". "gauss." denotes the standard TOLE with Gaussian noise. "sent." denotes the variant with sentence-level feedback.
  • Figure 3: Performance comparison between model variants with and without the quantization procedure. The top two subplots are from the neutral-to-positive experiments. The bottom two are from detoxification.
  • Figure 4: Final scores of generated samples in explorations. The left is the average of two classifiers. The right is aggregated by "weigher".
  • Figure 5: Caption
  • ...and 1 more figure