MODS: Moderating a Mixture of Document Speakers to Summarize Debatable Queries in Document Collections

Nishant Balepur, Alexa Siu, Nedim Lipka, Franck Dernoncourt, Tong Sun, Jordan Boyd-Graber, Puneet Mathur

TL;DR

The paper introduces Debatable QFS (DQFS), a task that addresses a limitation of traditional query-focused summarization: it does not handle questions with opposing viewpoints. Its framework, MODS, uses a panel-like multi-LLM design in which each document acts as a Speaker LLM and a Moderator LLM orchestrates topic-specific responses, guided by a rich outline that tracks perspectives and stances. The authors introduce the DebateQFS dataset and also evaluate on ConflictingQA, measuring coverage, balance, and faithfulness via pre-hoc citation metrics; MODS achieves higher topic-paragraph coverage and more balanced representation than strong baselines. Results on both datasets, including human evaluations and ablation studies, indicate that content planning through outlines and speaker-controlled interactions substantially improves summaries of debatable queries, with practical implications for balanced information synthesis. Limitations include computational cost and the need for broader human validation; the ethical considerations emphasize careful handling of misinformation and user-guided balance on controversial topics.

Abstract

Query-focused summarization (QFS) gives a summary of documents to answer a query. Past QFS work assumes queries have one answer, ignoring debatable ones (e.g., "Is law school worth it?"). We introduce Debatable QFS (DQFS), a task to create summaries that answer debatable queries via documents with opposing perspectives; summaries must comprehensively cover all sources and balance perspectives, favoring no side. These goals elude LLM QFS systems, which: 1) lack structured content plans, failing to guide LLMs to write balanced summaries, and 2) use the same query to retrieve contexts across documents, failing to cover all perspectives specific to each document's content. To overcome this, we design MODS, a multi-LLM framework mirroring human panel discussions. MODS treats documents as individual Speaker LLMs and has a Moderator LLM that picks speakers to respond to tailored queries for planned topics. Speakers use tailored queries to retrieve relevant contexts from their documents and supply perspectives, which are tracked in a rich outline, yielding a content plan to guide the final summary. Experiments on ConflictingQA with controversial web queries and DebateQFS, our new dataset of debate queries from Debatepedia, show MODS beats SOTA by 38-59% in topic paragraph coverage and balance, based on new citation metrics. Users also find MODS's summaries to be readable and more balanced.
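
The moderator and speaker loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the LLM calls are replaced by simple word-overlap heuristics, and every name here (`Speaker`, `tailor_query`, `moderate`, the example documents) is a hypothetical stand-in chosen for illustration. It only shows the control flow: for each planned topic, the moderator builds a per-document query, relevant speakers retrieve from their own document, and their replies are tracked in an outline. The final summary-writing step is omitted.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for real retrieval features."""
    return set(re.findall(r"[a-z]+", text.lower()))


@dataclass
class Speaker:
    """One document acting as a speaker (hypothetical stand-in for a Speaker LLM)."""
    doc_id: str
    sentences: list[str]

    def respond(self, tailored_query: str) -> str | None:
        # Retrieve the sentence from this document that best overlaps the query.
        query = tokens(tailored_query)
        best = max(self.sentences, key=lambda s: len(query & tokens(s)))
        return best if query & tokens(best) else None


def tailor_query(topic: str, speaker: Speaker) -> str:
    # Assumption: "tailoring" keeps only the topic words this document mentions.
    doc_words = tokens(" ".join(speaker.sentences))
    return " ".join(sorted(tokens(topic) & doc_words))


def moderate(topics: list[str], speakers: list[Speaker]) -> dict[str, list[tuple[str, str]]]:
    """Moderator loop: per topic, pick relevant speakers and record their replies."""
    outline: dict[str, list[tuple[str, str]]] = {}
    for topic in topics:
        outline[topic] = []
        for speaker in speakers:
            query = tailor_query(topic, speaker)
            if not query:  # the moderator skips speakers with nothing on this topic
                continue
            perspective = speaker.respond(query)
            if perspective is not None:
                outline[topic].append((speaker.doc_id, perspective))
    return outline


# Made-up documents for the query "Is law school worth it?"
speakers = [
    Speaker("pro", ["Law graduates earn a high salary over a career.",
                    "Scholarships can reduce the cost."]),
    Speaker("con", ["Tuition debt is a heavy cost for students.",
                    "Many graduates regret the debt."]),
]
outline = moderate(["cost debt", "career salary"], speakers)
for topic, entries in outline.items():
    print(topic, entries)
```

In this sketch the outline maps each topic to (document, perspective) pairs, so a downstream summarizer could check that every topic paragraph draws on each document that spoke to it. A real system would replace the overlap heuristics with LLM prompts and a proper retriever.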

Paper Structure

This paper contains 30 sections, 2 figures, 21 tables.

Figures (2)

  • Figure 3: Example outline subset from MODS, which clearly tracks topics, documents, perspectives (facts and stances), and follow-up queries for the user to explore.
  • Figure 4: Distribution of Readability and Balance for Full Summaries and Topic Paragraphs from Prolific.