Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data

Katja Filippova

TL;DR

The paper tackles data-induced hallucinations in neural text generation by introducing a low-overhead, architecture-agnostic Hallucination knob that prefixes inputs with a hallucination level to control output faithfulness. It defines two scalable hallucination-detection schemes, $hal_{WO}$ and $hal_{LM}$, to label training examples with noise levels, which are then used to condition generation. In experiments on WikiBio, controlled models achieve substantially higher faithfulness with preserved fluency and comparable coverage, and LM-based detection often yields better human-evaluated quality than overlap-based detection. The work demonstrates that faithful, fluent, and comprehensive outputs can be achieved without modifying model architectures, suggesting practical applicability to noisy data regimes and broader controlled-generation tasks.

Abstract

Neural text generation (data- or text-to-text) demonstrates remarkable performance when training data is abundant, which for many applications is not the case. To collect a large corpus of parallel data, heuristic rules are often used, but they inevitably let noise into the data, such as phrases in the output which cannot be explained by the input. Consequently, models pick up on the noise and may hallucinate, that is, generate fluent but unsupported text. Our contribution is a simple but powerful technique to treat such hallucinations as a controllable aspect of the generated text, without dismissing any input and without modifying the model architecture. On the WikiBio corpus (Lebret et al., 2016), a particularly noisy dataset, we demonstrate the efficacy of the technique both in an automatic and in a human evaluation.

Paper Structure

This paper contains 14 sections, 2 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Two WikiBio sources and targets with example attributes: tense and length can be read off the target directly. When added to the input, these attributes give the model a knob to control length and tense. We propose to estimate the degree of noise by comparing the source with the target, thus obtaining a hallucination knob.
  • Figure 2: Example outputs from our $hal_{LM}$ model on the same input table with three hallucination degrees.
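
The hallucination knob described in Figure 1 can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that an overlap-based score is the fraction of target tokens that never appear in the source, that scores are bucketed into three levels, and that the level is passed to the model as a prefix token of the form `<hal_N>`. The function names, the bucketing rule, the token format, and the example record are all hypothetical.

```python
from typing import List


def overlap_hallucination_score(source_tokens: List[str], target_tokens: List[str]) -> float:
    """Fraction of target tokens not found in the source (assumed scoring rule)."""
    if not target_tokens:
        return 0.0
    source_vocab = {tok.lower() for tok in source_tokens}
    unsupported = sum(1 for tok in target_tokens if tok.lower() not in source_vocab)
    return unsupported / len(target_tokens)


def to_level(score: float, num_levels: int = 3) -> int:
    """Map a score in [0, 1] to a discrete level 0..num_levels-1 (assumed bucketing)."""
    return min(int(score * num_levels), num_levels - 1)


def add_knob(source_tokens: List[str], level: int) -> List[str]:
    """Prefix the source with a control token carrying the hallucination level."""
    return [f"<hal_{level}>"] + list(source_tokens)


# Hypothetical WikiBio-style record: a flattened table and its reference sentence.
source = "name john smith birth_date 1950 occupation painter".split()
target = "john smith born 1950 is an english painter".split()

score = overlap_hallucination_score(source, target)
level = to_level(score)

# Training input: the source is labeled with the noise level observed in its target.
train_input = add_knob(source, level)

# Inference input: the lowest level is requested to ask for a faithful output.
faithful_input = add_knob(source, 0)
```

At training time each example carries the level estimated from its own source and target pair, so no example is discarded. At inference time the prefix is chosen by the user, which is what makes the level a controllable knob rather than a filter.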