Palette of Language Models: A Solver for Controlled Text Generation

Zhe Yang, Yi Huang, Yaqin Chen, Xiaoting Wu, Junlan Feng, Chao Deng

TL;DR

The paper tackles controlled text generation with multiple attributes by addressing attribute overlap that plagues simple linear combinations. It introduces the Palette of Language Models, a probabilistic fusion framework rooted in the Law of Total Probability and Conditional Mutual Information Minimization to derive dynamic, attribute-aware combination coefficients. It also presents two theoretical properties—positive correlation between attribute strength and generation style, and attribute enhancement over linear fusion—to guide design. Empirical results across toxicity reduction, sentiment control, and multi-attribute scenarios show improved attribute expression and overlap mitigation across several base models, with a practical discussion of normalization and prompts. The approach offers a scalable, principled alternative to discriminative or purely prompt-based methods for controlled generation, though limitations remain in handling vocabularies across heterogeneous models.

Abstract

Recent advancements in large language models have revolutionized text generation with their remarkable capabilities. These models can produce controlled texts that closely adhere to specific requirements when prompted appropriately. However, designing an optimal prompt to control multiple attributes simultaneously can be challenging. A common approach is to linearly combine single-attribute models, but this strategy often overlooks attribute overlaps and can lead to conflicts. Therefore, we propose a novel combination strategy inspired by the Law of Total Probability and Conditional Mutual Information Minimization on generative language models. This method has been adapted to the single-attribute control scenario and is termed the Palette of Language Models due to its theoretical linkage between attribute strength and generation style, akin to blending colors on an artist's palette. Moreover, positive correlation and attribute enhancement are advanced as theoretical properties to guide the design of a rational combination strategy. We conduct experiments in both single-attribute and multi-attribute control settings and achieve superior results.
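
To make the baseline mentioned above concrete, here is a minimal sketch of linearly combining single-attribute models at the next-token level. It is not the Palette method itself. It assumes that "linear combination" means a weighted sum of per-model log-probabilities followed by renormalization, and the function names (`log_softmax`, `linear_combine`), the four-token vocabulary, and the logit values are all hypothetical. One way to read the overlap concern is that a token favored by both attribute models is counted twice, so its fused probability exceeds what either model assigns on its own.

```python
import math


def log_softmax(logits):
    """Convert raw scores into normalized log-probabilities."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def linear_combine(log_probs_per_model, weights):
    """Weighted sum of per-model log-probabilities, renormalized.

    Assumption: fixed weights, one per attribute model (the baseline,
    not the dynamic coefficients described in the paper).
    """
    vocab_size = len(log_probs_per_model[0])
    combined = [
        sum(w * lp[j] for w, lp in zip(weights, log_probs_per_model))
        for j in range(vocab_size)
    ]
    return log_softmax(combined)


# Hypothetical next-token distributions over a toy 4-token vocabulary.
positive = log_softmax([2.0, 0.5, 0.1, -1.0])   # "positive sentiment" model
formal = log_softmax([1.8, -0.5, 0.9, -1.0])    # "formal style" model

fused = linear_combine([positive, formal], weights=[1.0, 1.0])
probs = [math.exp(v) for v in fused]

# Token 0 is preferred by both models, so the fixed-weight fusion
# pushes its probability above either single-model probability.
print([round(p, 3) for p in probs])
```

With fixed weights there is no mechanism to discount this shared preference, which is the kind of behavior that attribute-aware coefficients are intended to address.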

Paper Structure

This paper contains 29 sections, 23 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Overview of Palette of Language Models. Each ellipse in the figure represents a generative language model with a specific attribute, and $S$ represents the strength of the corresponding model. Employing Equation \ref{equ:comb_norm}, the final generation under multiple constraints is derived.
  • Figure 2: Positive correlation between attribute strength and sentiment score (Left: $s < 1$, Right: $s > 1$).
  • Figure 3: Evaluation of the coefficient $t$ of $\log p(A_{i} \neq x)$ in the Positive Sentiment Control scenario.
  • Figure 4: Evaluation of the coefficient $t$ of $\log p(A_{i} \neq x)$ in the Negative Sentiment Control scenario.