RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework

Yifan Wang, Vera Demberg

TL;DR

This work introduces RSA-Control, a training-free controllable text generation framework grounded in pragmatics, which directs the generation process by recursively reasoning between imaginary speakers and listeners, enhancing the likelihood that target attributes are correctly interpreted by listeners amidst distractors.

Abstract

Despite significant advancements in natural language generation, controlling language models to produce texts with desired attributes remains a formidable challenge. In this work, we introduce RSA-Control, a training-free controllable text generation framework grounded in pragmatics. RSA-Control directs the generation process by recursively reasoning between imaginary speakers and listeners, enhancing the likelihood that target attributes are correctly interpreted by listeners amidst distractors. Additionally, we introduce a self-adjustable rationality parameter, which allows for automatic adjustment of control strength based on context. Our experiments, conducted with two task types and two types of language models, demonstrate that RSA-Control achieves strong attribute control while maintaining language fluency and content consistency. Our code is available at https://github.com/Ewanwong/RSA-Control.

Paper Structure

This paper contains 42 sections, 11 equations, 7 figures, and 20 tables.

Figures (7)

  • Figure 1: Illustration of RSA-Control for generating readable summaries. Since $S_0$ assigns higher/lower probability to "sick" than to "bedridden" when conditioned on readable/formal prompts, $L_1$ can infer that "sick" is more readable than "bedridden". $S_1$ then selects next tokens that are both readable and consistent with the article content. Specifically, it first decodes with basic rationality $\alpha_0$, and the outputs are fed back into the PLM and $L_1$ to compute a self-adjusted rationality parameter $\tilde{\alpha}_n$. The real decoding process is then performed with $\tilde{\alpha}_n$.
  • Figure 2: Continuations along with toxicity scores assigned by $L_1$ and Perspective API. Note that here toxicity scores from Perspective API are computed on the concatenation of prompt and continuation, while they pertain only to continuations elsewhere in this paper.
  • Figure 3: Toxicity reduction results of RSA-Control with fixed (w/o) and self-adjustable (w) rationality parameters.
  • Figure 4: Ablation of the conditional independence assumption. RSA (w) and RSA (w/o) indicate Prompt+RSA with control prompts with and without content components. Error bars represent 95% confidence intervals.
  • Figure 5: Abilities of pragmatic listener $L_1$ in identifying six toxicity attributes and average performance.
  • ...and 2 more figures
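
The speaker-listener reasoning described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is a simplified, single-step toy version under stated assumptions, not the authors' implementation: the function names `pragmatic_listener` and `pragmatic_speaker`, the uniform attribute prior, the fixed rationality value, and all probabilities are hypothetical and chosen only for illustration. The self-adjusted rationality $\tilde{\alpha}_n$ is not modeled here.

```python
def pragmatic_listener(s0_probs, prior=None):
    """L1: infer the attribute behind each token via Bayes' rule.

    s0_probs maps attribute -> {token: probability under the literal
    speaker S0 conditioned on that attribute's prompt}.
    Assumption: a uniform prior over attributes unless one is given.
    """
    attrs = list(s0_probs)
    if prior is None:
        prior = {a: 1.0 / len(attrs) for a in attrs}
    tokens = list(next(iter(s0_probs.values())))
    listener = {}
    for tok in tokens:
        joint = {a: s0_probs[a][tok] * prior[a] for a in attrs}
        z = sum(joint.values())
        listener[tok] = {a: joint[a] / z for a in attrs}
    return listener


def pragmatic_speaker(content_probs, listener, target, alpha):
    """S1: reweight content-conditioned token probabilities by how
    strongly L1 associates each token with the target attribute.

    alpha plays the role of the rationality parameter; alpha = 0
    leaves the content-conditioned distribution unchanged.
    """
    scores = {
        tok: p * listener[tok][target] ** alpha
        for tok, p in content_probs.items()
    }
    z = sum(scores.values())
    return {tok: s / z for tok, s in scores.items()}


# Toy numbers (hypothetical): S0 prefers "sick" under a readable
# prompt and "bedridden" under a formal prompt.
s0 = {
    "readable": {"sick": 0.7, "bedridden": 0.3},
    "formal": {"sick": 0.4, "bedridden": 0.6},
}
# Content-conditioned next-token distribution (hypothetical).
content = {"sick": 0.5, "bedridden": 0.5}

l1 = pragmatic_listener(s0)
s1 = pragmatic_speaker(content, l1, target="readable", alpha=1.0)
print(l1)
print(s1)
```

In this toy setup, $L_1$ rates "sick" as more likely to signal the readable attribute than "bedridden", so $S_1$ shifts probability mass toward "sick" when readability is the target. Raising `alpha` strengthens that shift, which is the quantity the self-adjustable rationality parameter is described as tuning per context.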