Which Information Matters? Dissecting Human-written Multi-document Summaries with Partial Information Decomposition

Laura Mascarell, Yan L'Homme, Majed El Helou

TL;DR

The paper investigates what information makes human-written multi-document summaries high quality by applying Partial Information Decomposition (PID). Treating sentences as sources and summaries as targets, it uses a multivariate PID framework to decompose the information that the sources provide into union, redundancy, unique, and synergistic components. Empirical analysis across several multi-document summarization (MDS) datasets shows that redundancy decreases and unique information increases as the number of sources grows. Source order also matters: the first three documents often contribute the majority of the unique information. Surprisingly, synergy is negligible in typical MDS datasets but can be dominant in tasks like MultiRC when reframed for synergy analysis, which suggests that synergy may signal joint information requirements or hallucination. The authors release Spider, a tool to quantify these information components, enabling more interpretable MDS research and dataset construction aligned with human quality.

Abstract

Understanding the nature of high-quality summaries is crucial to further improve the performance of multi-document summarization. We propose an approach to characterize human-written summaries using partial information decomposition, which decomposes the mutual information provided by all source documents into union, redundancy, synergy, and unique information. Our empirical analysis on different MDS datasets shows that there is a direct dependency between the number of sources and their contribution to the summary.

Paper Structure

This paper contains 14 sections, 8 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Relationship between the PID components union, redundancy, synergy, and unique information that two sources $X_1$ and $X_2$ provide about a target $Y$. $I(X_1,X_2; Y)$ represents the information that both sources provide jointly about $Y$, whereas $I(X_1;Y)$ and $I(X_2;Y)$ represent the information that each source provides individually.
  • Figure 2: Redundancy (left) and unique (right) information scores across datasets and number of sources. As the number of sources grows, redundancy contributes less and unique information contributes more to the summary. WCEP scores differ the most from the other datasets. Note that WCEP is extended with additional sources not considered in the summaries.
  • Figure 3: Frequency of each source contributing the most to the summary with their unique information across datasets and total number of sources. The first three sources (blue, orange, and green) contribute the most for any number of sources in all datasets.
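
The relationship that Figure 1 describes can be written out for the two-source case. The following is a sketch using the standard two-source PID bookkeeping, not necessarily the paper's exact multivariate definitions. The symbols $R$ (redundancy), $U_1$ and $U_2$ (unique information of each source), $S$ (synergy), and $I_\cup$ (union) are notation assumed here for illustration, and treating union as the sum of the redundant and unique parts is likewise an assumption.

$$
\begin{aligned}
I(X_1;Y) &= R + U_1 \\
I(X_2;Y) &= R + U_2 \\
I(X_1,X_2;Y) &= R + U_1 + U_2 + S \\
I_\cup &= R + U_1 + U_2 \quad\Longrightarrow\quad S = I(X_1,X_2;Y) - I_\cup
\end{aligned}
$$

Under these assumptions, synergy is the part of the joint information that no single source accounts for. For example, if $Y$ is the XOR of two independent uniform bits $X_1$ and $X_2$, then $I(X_1;Y) = I(X_2;Y) = 0$ while $I(X_1,X_2;Y) = 1$ bit, so all of the information is synergistic.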