Entity-level Factual Adaptiveness of Fine-tuning based Abstractive Summarization Models

Jongyoon Song; Nohil Park; Bongkyu Hwang; Jaewoong Yun; Seongho Joe; Youngjune L. Gwon; Sungroh Yoon

TL;DR

This paper defines factual adaptiveness as robustness to entity-level knowledge conflict in fine-tuning based abstractive summarization and introduces two metrics, $M_{CL}$ and $M_{FC}$, to quantify it. It then proposes a controllable counterfactual data augmentation framework that uses parametric knowledge from pretrained language models to generate and utilize counterfactual entity replacements, with configurable augmentation ratio $\rho$ and candidate-group strategies. Across PEGASUS and BART on XSum and CNN/DailyMail, the method substantially improves factual adaptiveness while largely preserving factual consistency on original data, illustrating an orthogonal relationship between the two notions. Qualitative analyses show reduced entity-level hallucinations and demonstrate how augmentation group choices influence generalization. Overall, the work provides a practical approach to diagnosing and mitigating knowledge-conflict hallucinations in abstractive summarization, with potential for integration into contrastive learning pipelines and broader knowledge-conflict settings.

Abstract

Abstractive summarization models often generate factually inconsistent content, particularly when the parametric knowledge of the model conflicts with the knowledge in the input document. In this paper, we analyze the robustness of fine-tuning based summarization models to such knowledge conflict, which we call factual adaptiveness. We utilize pre-trained language models to construct evaluation sets and find that factual adaptiveness is not strongly correlated with factual consistency on the original datasets. Furthermore, we introduce a controllable counterfactual data augmentation method in which the degree of knowledge conflict within the augmented data can be adjusted. Our experimental results on two pre-trained language models (PEGASUS and BART) and two fine-tuning datasets (XSum and CNN/DailyMail) demonstrate that our method enhances factual adaptiveness while achieving factual consistency on the original datasets on par with the contrastive learning baseline.
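
To make the idea of entity-level counterfactual augmentation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an entity is swapped consistently in both the document and its reference summary, and that the augmentation ratio $\rho$ acts as the probability that an eligible sample receives a counterfactual copy. The names `replace_entity`, `augment`, and the `candidates` map are hypothetical; in the paper the candidates are derived from the parametric knowledge of a pre-trained language model, which this sketch replaces with a fixed dictionary.

```python
import random
import re


def _entity_pattern(entity):
    # Whole-word match for an entity string (assumes it starts and ends with a word character).
    return re.compile(r"\b" + re.escape(entity) + r"\b")


def replace_entity(text, original, counterfactual):
    """Replace whole-word occurrences of `original` with `counterfactual`."""
    return _entity_pattern(original).sub(lambda _m: counterfactual, text)


def augment(samples, candidates, rho, seed=0):
    """Return the original samples followed by counterfactual copies.

    samples:    list of (document, summary) pairs.
    candidates: hypothetical map from an original entity to replacement entities.
    rho:        assumed meaning: probability that an eligible sample is augmented.
    """
    rng = random.Random(seed)
    augmented = list(samples)
    for document, summary in samples:
        # Keep only entities that occur in both texts, so the pair stays consistent.
        shared = [
            e for e in candidates
            if _entity_pattern(e).search(document) and _entity_pattern(e).search(summary)
        ]
        if not shared or rng.random() >= rho:
            continue
        original = rng.choice(shared)
        counterfactual = rng.choice(candidates[original])
        augmented.append((
            replace_entity(document, original, counterfactual),
            replace_entity(summary, original, counterfactual),
        ))
    return augmented
```

Under these assumptions, `rho=0.0` leaves the training set unchanged and `rho=1.0` adds one counterfactual copy for every sample that shares a candidate entity between document and summary.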

Paper Structure (38 sections, 2 equations, 3 figures, 11 tables, 2 algorithms)

Introduction
Factual Adaptiveness
Formulation
Evaluation Set Construction
Counterfactual Entity Candidate Pool
Original Entity Candidates
Counterfactual Entity Candidates
Original and Counterfactual Entity Validation
Entity Replacement
Analysis on Models for Improving Factual Consistency
Setup
Evaluation Set
Results
Controllable Counterfactual Data Augmentation
Training Set Construction
...and 23 more sections

Figures (3)

Figure 1: Overview of the counterfactual sample construction process. The example is sampled from the XSum validation set.
Figure 2: The ratio of summaries generated from the counterfactual documents of XSum and CNN/DailyMail (Mid, S2) which include the counterfactual entity but do not include the original entity.
Figure 3: ChatGPT preference test results on (a) XSum and (b) CNN/DailyMail test sets.
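
The quantity described in the Figure 2 caption can be sketched as follows. This is an illustrative computation rather than the authors' evaluation code: the function name `adapted_ratio` is hypothetical, and simple case-sensitive substring matching is assumed as the test for whether a summary "includes" an entity.

```python
def adapted_ratio(summaries, entity_pairs):
    """Fraction of summaries that mention the counterfactual entity
    and do not mention the original entity.

    summaries:    summaries generated from counterfactual documents.
    entity_pairs: one (original_entity, counterfactual_entity) pair per summary.
    """
    if len(summaries) != len(entity_pairs):
        raise ValueError("summaries and entity_pairs must have the same length")
    if not summaries:
        return 0.0
    hits = sum(
        1
        for summary, (original, counterfactual) in zip(summaries, entity_pairs)
        # Substring matching is an assumption made for this sketch.
        if counterfactual in summary and original not in summary
    )
    return hits / len(summaries)
```

A higher value indicates that the model follows the entity given in the input document instead of falling back on the original entity.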
