On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization

Lorenzo Jaime Yu Flores; Arman Cohan

TL;DR

This work evaluates Loss Truncation (LT) as a remedy for hallucinations in abstractive summarization and simplification, revealing that LT’s sentence-level NLL-based filtering often fails when noisy targets do not exhibit higher NLL. By analyzing token- and entity-level losses, the authors show that entity-token NLL provides a stronger signal for noise, motivating a fine-grained LT that operates on entity tokens. They further propose simple data-cleaning strategies (Drop Sentence/Drop Example) to handle noisier, web-scraped data, achieving notable reductions in entity-level hallucinations on some datasets (e.g., Cochrane and ASSET) and competitive fluency metrics. The findings suggest that combining fine-grained loss signals with targeted data cleaning can meaningfully improve factuality in certain contexts, though gains are dataset-dependent and challenges remain for noisier corpora.

Abstract

Text summarization and simplification are among the most widely used applications of AI. However, models developed for such tasks are often prone to hallucination, which can result from training on unaligned data. One efficient approach to address this issue is Loss Truncation (LT) (Kang and Hashimoto, 2020), which modifies the standard log loss to adaptively remove noisy examples during training. However, we find that LT alone yields a considerable number of hallucinated entities on various datasets. We study the behavior of the underlying losses between factual and non-factual examples to understand and refine the performance of LT. We demonstrate that LT's performance is limited when the underlying assumption that noisy targets have higher NLL loss is not satisfied, and find that word-level NLL among entities provides a better signal for distinguishing factuality. We then leverage this to propose a fine-grained NLL loss and fine-grained data cleaning strategies, and observe improvements in hallucination reduction across some datasets. Our work is available at https://github.com/yale-nlp/fine-grained-lt.
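To make the contrast in the abstract concrete, the following is a minimal sketch of loss truncation on a single toy batch. It is not the authors' implementation: the function names, the fixed per-batch `drop_fraction`, and the toy NLL values and entity masks are all assumptions made for illustration. The sketch ranks examples either by mean NLL over all tokens (sentence-level) or by mean NLL over entity tokens only (one plausible reading of the fine-grained variant), drops the highest-scoring fraction, and averages the loss over the rest.

```python
from typing import List, Sequence


def sentence_score(token_nll: Sequence[float]) -> float:
    """Mean NLL over all tokens of one target (sentence-level signal)."""
    return sum(token_nll) / len(token_nll)


def entity_score(token_nll: Sequence[float], entity_mask: Sequence[int]) -> float:
    """Mean NLL over entity tokens only; 0.0 if the target has no entity tokens."""
    entity_nll = [nll for nll, is_entity in zip(token_nll, entity_mask) if is_entity]
    return sum(entity_nll) / len(entity_nll) if entity_nll else 0.0


def kept_indices(scores: Sequence[float], drop_fraction: float) -> List[int]:
    """Indices of examples that survive after dropping the highest-scoring fraction."""
    n_drop = int(len(scores) * drop_fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # lowest score first
    return sorted(order[: len(scores) - n_drop])


def truncated_loss(scores: Sequence[float], losses: Sequence[float],
                   drop_fraction: float) -> float:
    """Average the losses of the examples that are kept."""
    kept = kept_indices(scores, drop_fraction)
    if not kept:
        raise ValueError("drop_fraction removes every example in the batch")
    return sum(losses[i] for i in kept) / len(kept)


# Hypothetical toy batch: per-token NLLs and 0/1 masks marking entity tokens.
batch_nll = [
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.5, 4.0, 0.5],  # low average NLL, but one very unlikely entity token
    [2.0, 2.0, 2.0, 2.0],
    [0.5, 0.5, 0.5, 0.5],
]
batch_mask = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

sent_scores = [sentence_score(t) for t in batch_nll]
ent_scores = [entity_score(t, m) for t, m in zip(batch_nll, batch_mask)]

# The training loss is the sentence-level NLL in both cases; only the
# criterion for choosing which examples to drop differs.
sentence_level = truncated_loss(sent_scores, sent_scores, drop_fraction=0.25)
fine_grained = truncated_loss(ent_scores, sent_scores, drop_fraction=0.25)
```

In this toy batch, the sentence-level criterion drops the third example, whose NLL is uniformly high, while the entity-level criterion drops the second, whose overall NLL is moderate but whose entity token is very unlikely. This mirrors the case described above, where a noisy target need not have a higher sentence-level NLL.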

Paper Structure (32 sections, 2 equations, 2 figures, 10 tables)

Introduction
Methodology
Loss Truncation
Datasets
Models
Finetuning
Entity-Based Hallucination
Metrics
Experimental setup
Findings
Noise in summarization can come from adding unsupported information in the reference
LT reduces entity-level hallucination from noisy targets, but not completely
We hypothesize that LT's performance suffers because of the unmet assumption of higher NLL in noisy data
Word-level NLL may better distinguish between factual vs non-factual examples
We propose a fine-grained LT, which reduces hallucination on moderately noisy datasets
...and 17 more sections

Figures (2)

Figure 1: NLL distributions of factual (orange) and non-factual (blue) targets show no difference before finetuning and a slight difference after one epoch of finetuning, with non-factual entities having slightly higher NLL
Figure 2: Loss curves from finetuned BART-XSum; 0.8 smoothing used in top row
