SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation

Song Duong, Florian Le Bronnec, Alexandre Allauzen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

TL;DR

This paper tackles hallucinations in conditional text generation, focusing on faithfulness to the input context. It introduces Scope, a two-stage, self-supervised framework: a pre-trained model is first fine-tuned on a subset of the training data, then preference-tuned using synthetic unfaithful samples generated by a noisy decoding process that mixes the fine-tuned (grounded) model with a context-free language model. The training objective (a DPO-like loss) encourages the model to prefer grounded outputs over ungrounded ones, producing more faithful generations on data-to-text and summarization tasks. Experiments on seven datasets, multiple architectures, and a suite of faithfulness metrics, including GPT-4 and human evaluations, show that Scope yields significant improvements in faithfulness and remains robust across domains. The work also highlights that the quality of the negative samples must be balanced carefully, and it analyzes the dynamics of self-supervised preference learning for faithful generation.
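The noisy decoding step can be illustrated with a minimal sketch. It assumes that, at each position, the next token is drawn from the context-free model with probability alpha and from the grounded model otherwise; this per-token rule is an assumption made for illustration, and the paper's exact mixing scheme may differ. The function names (`noisy_decode`, `grounded_toy`, `context_free_toy`) and the toy distributions are hypothetical stand-ins for real language models.

```python
import random

EOS = "<eos>"

def sample_token(dist, rng):
    """Draw one token from a {token: probability} dictionary."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def noisy_decode(grounded_step, context_free_step, context, alpha, max_len, rng):
    """Generate a sequence token by token.

    With probability alpha the next token comes from the context-free model
    (which never sees the input context); otherwise it comes from the
    grounded model. This mixing rule is an assumption for illustration.
    """
    output = []
    for _ in range(max_len):
        if rng.random() < alpha:
            dist = context_free_step(output)        # ignores the context
        else:
            dist = grounded_step(context, output)   # conditions on the context
        token = sample_token(dist, rng)
        if token == EOS:
            break
        output.append(token)
    return output

# Hypothetical toy models: the grounded one copies the context,
# the context-free one always emits an unrelated token "N".
def grounded_toy(context, prefix):
    i = len(prefix)
    return {context[i]: 1.0} if i < len(context) else {EOS: 1.0}

def context_free_toy(prefix):
    return {"N": 1.0}

rng = random.Random(0)
faithful = noisy_decode(grounded_toy, context_free_toy, ["a", "b", "c"], 0.0, 10, rng)
unfaithful = noisy_decode(grounded_toy, context_free_toy, ["a", "b", "c"], 1.0, 5, rng)
```

With alpha set to 0 the toy output simply copies the context, and with alpha set to 1 it contains only context-free tokens; intermediate values yield fluent-looking but partly ungrounded samples, which is the role the unfaithful samples play in preference tuning.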

Abstract

Large Language Models (LLMs), when used for conditional text generation, often produce hallucinations, i.e., information that is unfaithful or not grounded in the input context. This issue arises in typical conditional text generation tasks, such as text summarization and data-to-text generation, where the goal is to produce fluent text based on contextual input. When fine-tuned on specific domains, LLMs struggle to provide faithful answers to a given context, often adding information or generating errors. One underlying cause of this issue is that LLMs rely on statistical patterns learned from their training data. This reliance can interfere with the model's ability to stay faithful to a provided context, leading to the generation of ungrounded information. We build upon this observation and introduce a novel self-supervised method for generating a training set of unfaithful samples. We then refine the model using a training process that encourages the generation of grounded outputs over unfaithful ones, drawing on preference-based training. Our approach leads to significantly more grounded text generation, outperforming existing self-supervised techniques in faithfulness, as evaluated through automatic metrics, LLM-based assessments, and human evaluations.
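To make the preference-based training step concrete, the following sketch computes the standard DPO loss for a single (grounded, unfaithful) pair. It is offered as an illustration of a DPO-like objective, not as the paper's exact loss: the function name `dpo_like_loss`, the default `beta`, and the choice of reference model are assumptions. Inputs are sequence log probabilities under the model being trained and under a frozen reference model.

```python
import math

def dpo_like_loss(logp_chosen, logp_rejected,
                  ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    chosen   = grounded reference output
    rejected = synthetic unfaithful sample
    beta is a hypothetical default, not a value taken from the paper.
    """
    # Implicit reward margin: how much more the trained model favors the
    # chosen output over the rejected one, relative to the reference model.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Before any update the trained model equals the reference: loss is log(2).
initial = dpo_like_loss(-5.0, -5.0, -5.0, -5.0)
# Raising the grounded output's log probability and lowering the
# unfaithful one's reduces the loss.
improved = dpo_like_loss(-4.0, -6.0, -5.0, -5.0)
```

The loss falls as the model assigns relatively more probability to the grounded output than to the unfaithful one, which is the behavior the abstract describes.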

Paper Structure (64 sections, 2 equations, 6 figures, 27 tables, 2 algorithms)

Figures (6)

  • Figure 1: Scope training framework. A pre-trained model $p_{\text{LM}}$ is first fine-tuned on a subset $\mathcal{D}_1$ of $\mathcal{D}$ and produces a model $p_{\theta_0}$. A mixture of $p_{\text{LM}}$ and $p_{\theta_0}$ is then used to generate a synthetic preference dataset, which finally serves for preference fine-tuning.
  • Figure 2: Preference training dynamics with Llama-2-7b as the noise level $\alpha$ increases on the ToTTo dataset, illustrating the three different regimes during preference training. The blue (resp. red) curve corresponds to the log probability of the reference labels (resp. of the synthetic unfaithful samples).
  • Figure 3: Evolution of NLI Score and BLEU with $\alpha$ on ToTTo validation set with Llama-2-7b.
  • Figure 4: Preference training dynamics with Llama-2-7b as the noise level $\alpha$ increases on the SAMSum dataset. We observe the same three regimes during preference training as for data-to-text generation.
  • Figure 5: Evolution of AlignScore and Rouge-L with $\alpha$ on SAMSum validation set with Llama-2-7b.
  • ...and 1 more figure