DomainSum: A Hierarchical Benchmark for Fine-Grained Domain Shift in Abstractive Text Summarization

Haohan Yuan, Haopeng Zhang

TL;DR

This paper introduces DomainSum, a hierarchical benchmark designed to capture fine-grained domain shifts in abstractive summarization, and evaluates the domain generalization capabilities of commonly used pre-trained language models and large language models in in-domain and cross-domain settings.

Abstract

Most research on abstractive summarization focuses on single-domain applications, often neglecting how domain shifts between documents affect performance and the generalization ability of summarization models. To address this issue, we introduce DomainSum, a hierarchical benchmark designed to capture fine-grained domain shifts in abstractive summarization. We categorize these shifts into three levels: genre, style, and topic, and demonstrate through comprehensive benchmark analysis that they follow a hierarchical structure. Furthermore, we evaluate the domain generalization capabilities of commonly used pre-trained language models (PLMs) and large language models (LLMs) in in-domain and cross-domain settings.

Paper Structure

This paper contains 40 sections, 6 equations, 9 figures, and 6 tables.

Figures (9)

  • Figure 1: An illustration of varying domain shifts between content styles, with smaller shifts between news articles (CNN vs. Fox) and larger shifts between news articles and Reddit posts.
  • Figure 2: The overall hierarchical structure of DomainSum, featuring three granular levels of shifts (genre, style, and topic) and five distinct domains at each level.
  • Figure 3: Genre Shift
  • Figure 4: Style Shift
  • Figure 5: Topic Shift
  • ...and 4 more figures