GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews

Maxime Darrin, Ines Arous, Pablo Piantanida, Jackie CK Cheung

TL;DR

GLIMPSE reframes multi-document summarization of scholarly reviews as a reference-game problem using Rational Speech Act theory to extract both common and unique opinions. It introduces two RSA-based scores—informativeness (pragmatic speaker) and uniqueness (listener divergence from uniform)—to rank candidate utterances and assemble per-review glimpses. Evaluations on ICLR OpenReview data show GLIMPSE delivers more discriminative and concise summaries than traditional baselines while maintaining competitive automatic metrics, though fluency may lag due to targeted, rarer utterances. The approach provides a practical tool for area chairs to quickly understand review consensus and diversity without over-relying on consensus-driven summaries.
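
The two scores described above can be illustrated with a small toy sketch. This is not the authors' implementation: the `likelihood` table is hand-picked and stands in for whatever model scores each candidate utterance against each review, the names `normalize` and `rsa_scores` are hypothetical, the number of iterations is arbitrary, and measuring uniqueness as the KL divergence from the listener distribution to the uniform distribution is an assumption about the direction of the divergence.

```python
import math


def normalize(values):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def rsa_scores(likelihood, iterations=3):
    """Toy RSA recursion (illustrative assumption, not the paper's code).

    likelihood[d][u] is a positive, hand-picked score for how well
    utterance u summarizes review d.
    Returns (speaker, uniqueness):
      speaker[d][u]  - pragmatic speaker probability of u given review d
      uniqueness[u]  - KL(listener(. | u) || uniform) over reviews
    """
    n_docs = len(likelihood)
    n_utts = len(likelihood[0])

    def listener_from(table):
        # Normalize each utterance's column over the reviews.
        result = [[0.0] * n_utts for _ in range(n_docs)]
        for u in range(n_utts):
            column = normalize([table[d][u] for d in range(n_docs)])
            for d in range(n_docs):
                result[d][u] = column[d]
        return result

    # Literal listener: which review does each utterance point to?
    listener = listener_from(likelihood)
    speaker = None
    for _ in range(iterations):
        # Pragmatic speaker: normalize each review's row over utterances.
        speaker = [normalize(row) for row in listener]
        # Pragmatic listener: renormalize the speaker over reviews.
        listener = listener_from(speaker)

    # Uniqueness: how far the listener is from being undecided (uniform).
    uniqueness = []
    for u in range(n_utts):
        kl = sum(
            listener[d][u] * math.log(listener[d][u] * n_docs)
            for d in range(n_docs)
            if listener[d][u] > 0
        )
        uniqueness.append(kl)
    return speaker, uniqueness


# Three reviews (rows) and three candidate utterances (columns).
# Utterance 0 fits every review equally (a common opinion);
# utterances 1 and 2 mostly fit reviews 1 and 2 respectively.
likelihood = [
    [0.5, 0.1, 0.1],
    [0.5, 0.8, 0.1],
    [0.5, 0.1, 0.8],
]

speaker, uniqueness = rsa_scores(likelihood)
best_per_review = [row.index(max(row)) for row in speaker]
print("most informative utterance per review:", best_per_review)
print("uniqueness per utterance:", [round(x, 3) for x in uniqueness])
```

In this toy setting the shared utterance receives a lower uniqueness score than the two review-specific ones, which is the behavior the summary attributes to the uniqueness score; the speaker probabilities play the role of the informativeness score used to pick an utterance for each review.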

Abstract

Scientific peer review is essential for the quality of academic publications. However, the increasing number of paper submissions to conferences has strained the reviewing process. This surge poses a burden on area chairs who have to carefully read an ever-growing volume of reviews and discern each reviewer's main arguments as part of their decision process. In this paper, we introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews. We introduce novel uniqueness scores based on the Rational Speech Act framework to identify relevant sentences in the reviews. Our method aims to provide a pragmatic glimpse into all reviews, offering a balanced perspective on their opinions. Our experimental results with both automatic metrics and human evaluation show that GLIMPSE generates more discriminative summaries than baseline methods in terms of human evaluation while achieving comparable performance with these methods in terms of automatic metrics.

Paper Structure (22 sections, 6 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Illustration of our proposed RSA-based scores applied to real-world scholarly reviews. We consider each sentence in a review as a candidate summary. The most common opinions in the reviews are highlighted in blue whereas the unique ones are highlighted in red using our RSA-based scores.
  • Figure 2: Discriminativeness for all the baselines and our methods in extractive mode (GLIMPSE-Unique (Extr.), GLIMPSE-Speaker (Extr.)), and a strong abstractive method (Llama 7b Instruct).
  • Figure 3: Trade-off between discriminativeness of the generated summaries and their fluency (measured as the log-likelihood of the summaries under the generative model). The Pareto frontier shows the best trade-off between the two metrics.
  • Figure 4: Instructions given to the annotators to perform the discriminative summarization task.
  • Figure 5: Example of task
  • ...and 2 more figures

Theorems & Definitions (7)

  • Definition 1: The discriminative multi-document summarization problem
  • Definition 2: D-MDS as a reference game
  • Definition 3: Literal listener
  • Definition 4: Pragmatic speaker
  • Definition 5: Pragmatic listener
  • Example 1
  • Example 2