Does This Summary Answer My Question? Modeling Query-Focused Summary Readers with Rational Speech Acts

Cesare Spinoso-Di Piano, Jackie Chi Kit Cheung

TL;DR

This paper introduces the answer reconstruction objective, which approximates a reader's understanding of a summary by their ability to use it to reconstruct the answer to their initial query. The objective is used to re-rank candidate summaries generated by existing QFS systems and to select summaries that better align with their corresponding query and reference summary.

Abstract

Query-focused summarization (QFS) is the task of generating a summary in response to a user-written query. Despite its user-oriented nature, there has been limited work in QFS that explicitly considers a user's understanding of a generated summary, potentially causing QFS systems to underperform at inference time. In this paper, we adapt the Rational Speech Act (RSA) framework, a model of human communication, to explicitly model a reader's understanding of a query-focused summary and integrate it within the generation method of existing QFS systems. In particular, we introduce the answer reconstruction objective, which approximates a reader's understanding of a summary by their ability to use it to reconstruct the answer to their initial query. Using this objective, we are able to re-rank candidate summaries generated by existing QFS systems and select summaries that better align with their corresponding query and reference summary. More generally, our study suggests that a simple and effective way of improving a language generation system designed for a user-centered task may be to explicitly incorporate its user requirements into the system's generation procedure.
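As a rough illustration of the re-ranking idea described in the abstract, the sketch below orders candidate summaries by a weighted mix of two scores: how likely the generator found the summary, and how well a reader model could reconstruct the answer from it. This is a minimal sketch under stated assumptions, not the paper's formulation: the `Candidate` fields, the `rerank` function, the weight `lam`, and the convex combination itself are all hypothetical, and the scores are assumed to be precomputed log-probabilities from models not shown here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    summary: str
    # Hypothetical: log-likelihood of the summary under the QFS generator.
    log_p_summary: float
    # Hypothetical: log-probability that a reader model reconstructs the
    # answer to the query when given only this summary.
    log_p_answer: float


def rerank(candidates, lam=0.5):
    """Sort candidates from best to worst by a weighted mix of both scores.

    lam = 0.0 uses only the generator score; lam = 1.0 uses only the
    answer reconstruction score. The interpolation is an assumption made
    for illustration.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    def score(c):
        return lam * c.log_p_answer + (1.0 - lam) * c.log_p_summary

    return sorted(candidates, key=score, reverse=True)


# Toy scores, invented for the example.
candidates = [
    Candidate("fluent but off-topic summary", log_p_summary=-1.0, log_p_answer=-5.0),
    Candidate("less fluent summary that answers the query", log_p_summary=-2.0, log_p_answer=-0.5),
]

best = rerank(candidates, lam=0.5)[0]
print(best.summary)
```

With these toy numbers, a small `lam` favors the candidate the generator prefers, while a larger `lam` favors the candidate from which the answer is easier to reconstruct.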

Paper Structure

This paper contains 40 sections, 10 equations, 8 figures, 27 tables.

Figures (8)

  • Figure 1: An example of a query-focused summary (truncated) generated by Llama 3 for this article on the movie "A Wrinkle in Time" from the MultiOpEd dataset. The only question which was asked about this article ([QUESTIONS]) was "Is 'A Wrinkle in Time' worth watching?". The full prompt we give to Llama 3 can be found in Appendix \ref{sec:llm_prompts}.
  • Figure 2: Tradeoff between summarization quality and text quality as controlled by $\lambda$ for the MultiOpEd dataset using Llama 3 with standard sampling (temperature of 2). Solid lines represent summarization quality metrics (ROUGE-1, METEOR, and BERTScore), while dashed lines represent text quality metrics (Comprehensibility, Repetition, and Grammar). Note that the left and right y-axes have different scales for summarization quality and text quality, respectively.
  • Figure 3: Tradeoff between summarization quality and text quality as controlled by $\lambda$ for the MultiOpEd dataset using Llama 3 with standard sampling (temperature of 2). Solid lines represent summarization quality metrics (ROUGE-1, METEOR, and BERTScore), while dashed lines represent text quality metrics (Comprehensibility, Repetition, Grammar and Conciseness). Note that the left and right y-axes have different scales for summarization quality and text quality, respectively.
  • Figure 4: Tradeoff between summarization quality and text quality as controlled by $\lambda$ for the QMSum dataset using Llama 3 with standard sampling (temperature of 2). Solid lines represent summarization quality metrics (ROUGE-1, METEOR, and BERTScore), while dashed lines represent text quality metrics (Comprehensibility, Repetition, Grammar and Conciseness). Note that the left and right y-axes have different scales for summarization quality and text quality, respectively.
  • Figure 5: Tradeoff between summarization quality and text quality as controlled by $\lambda$ for the SQuALITY dataset using Llama 3 with standard sampling (temperature of 2). Solid lines represent summarization quality metrics (ROUGE-1, METEOR, and BERTScore), while dashed lines represent text quality metrics (Comprehensibility, Repetition, Grammar and Conciseness). Note that the left and right y-axes have different scales for summarization quality and text quality, respectively.
  • ...and 3 more figures