Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning

Chujie Zheng, Pei Ke, Zheng Zhang, Minlie Huang

TL;DR

Click is a training-based approach to controllable text generation that requires no architecture changes: it adds a max-margin, sequence-level contrastive loss on top of the standard language modeling objective. A novel likelihood ranking-based strategy constructs contrastive sample pairs from the model's own generations, so that optimization concentrates on cases where sequence likelihood disagrees with the controlled attribute. Across language detoxification, sentiment steering, and repetition reduction, Click outperforms strong baselines, and ablation studies validate the efficacy of the sampling strategy and hyperparameters. The method enables out-of-the-box use of pretrained models with improved safety and controllability for open-ended text generation.

Abstract

It has always been an important yet challenging problem to control language models to avoid generating texts with undesirable attributes, such as toxic language and unnatural repetition. We introduce Click for controllable text generation, which needs no modification to the model architecture and facilitates out-of-the-box use of trained models. It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples (i.e., generations with undesirable attributes). It also adopts a novel likelihood ranking-based strategy to construct contrastive samples from model generations. On the tasks of language detoxification, sentiment steering, and repetition reduction, we show that Click outperforms strong baselines of controllable text generation and demonstrate the superiority of Click's sample construction strategy.
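
As a rough illustration of what a max-margin contrastive loss on sequence likelihood can look like, the sketch below applies a hinge to the log-likelihood gap between a negative and a positive continuation of the same prompt. The hinge form, the function names, and the toy numbers are assumptions made for illustration, not the authors' implementation. The names `gamma` and `alpha` mirror the hyperparameters varied in Figure 2; treating `gamma` as the margin and `alpha` as the weight of the contrastive term is likewise an assumption here.

```python
from typing import List, Sequence, Tuple


def sequence_log_likelihood(token_log_probs: Sequence[float]) -> float:
    """Sum per-token log-probabilities into one sequence log-likelihood."""
    return sum(token_log_probs)


def max_margin_contrastive_loss(
    pairs: List[Tuple[float, float]], gamma: float
) -> float:
    """Mean hinge loss over (positive, negative) sequence log-likelihood pairs.

    A pair contributes zero once the positive sequence is at least `gamma`
    more likely (in log space) than the negative one. Assumed form only.
    """
    if not pairs:
        return 0.0
    losses = [max(0.0, gamma + neg - pos) for pos, neg in pairs]
    return sum(losses) / len(losses)


def total_loss(lm_loss: float, contrastive_loss: float, alpha: float) -> float:
    """Language modeling loss plus the weighted contrastive term (assumed)."""
    return lm_loss + alpha * contrastive_loss


# Toy example: in the first pair the negative is more likely than the
# positive, so it is penalized; the second pair already satisfies the margin.
toy_pairs = [(-10.0, -8.0), (-5.0, -30.0)]
cl = max_margin_contrastive_loss(toy_pairs, gamma=5.0)
loss = total_loss(lm_loss=2.0, contrastive_loss=cl, alpha=0.5)
```

Only pairs that violate the margin produce a gradient under this form, which is one way a loss on sequence likelihood can lower the generation probability of negative samples without touching the model architecture.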

Paper Structure

This paper contains 48 sections, 5 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: Overview of Click. It contains three steps: (1) Generating multiple continuations for a prompt, each labeled positive or negative by a label function. (2) Constructing contrastive samples by pairing each negative sample with the positive sample that has the highest likelihood among those ranked below the negative (§ \ref{subsec:construction}). (3) Training the language model with the additional contrastive loss (§ \ref{subsec:contrastive}).
  • Figure 2: Performance of Click (y-axis) on the BAD validation set with varying $\alpha$ and $\gamma$ (x-axis).
  • Figure 3: Performance of Click (y-axis) on the BAD validation set with varying pair number (x-axis) of contrastive samples per prompt $x$.
  • Figure 4: Screenshot of the Amazon Mechanical Turk interface of human evaluation for the language detoxification task (§ \ref{subsec:toxicity}).
  • Figure 5: Screenshot of the Amazon Mechanical Turk interface of human evaluation for the sentiment steering task (§ \ref{subsec:sentiment}).
  • ...and 4 more figures
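
Step (2) of the Figure 1 caption can be sketched as follows. This is a minimal sketch, assuming each generated continuation already carries its sequence log-likelihood and a positive/negative label from some label function; the `Sample` layout and the name `construct_pairs` are hypothetical. The caption does not say what happens when no positive sample ranks below a negative one, so skipping that negative is an assumption made here.

```python
from typing import List, Tuple

# (text, sequence log-likelihood, is_positive); hypothetical layout
Sample = Tuple[str, float, bool]


def construct_pairs(samples: List[Sample]) -> List[Tuple[str, str]]:
    """Return (positive_text, negative_text) pairs for one prompt.

    Each negative sample is paired with the positive sample that has the
    highest likelihood among the positives ranked below that negative.
    """
    positives = [s for s in samples if s[2]]
    negatives = [s for s in samples if not s[2]]
    pairs = []
    for neg in negatives:
        # Positives whose likelihood is strictly lower than this negative's.
        lower = [p for p in positives if p[1] < neg[1]]
        if not lower:
            # Assumption: no positive ranks below this negative, so skip it.
            continue
        best = max(lower, key=lambda p: p[1])
        pairs.append((best[0], neg[0]))
    return pairs


# Toy continuations for a single prompt, with made-up log-likelihoods.
toy_samples = [
    ("a", -1.0, True),
    ("b", -2.0, False),
    ("c", -3.0, True),
    ("d", -4.0, True),
    ("e", -5.0, False),
]
toy_result = construct_pairs(toy_samples)
```

In the toy data, negative "b" is paired with positive "c", the closest positive ranked below it, while negative "e" has no lower-ranked positive and is skipped under the stated assumption.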