Multi-LLM Text Summarization

Jiangnan Fang, Cheng-Tse Liu, Jieun Kim, Yash Bhedaru, Ethan Liu, Nikhil Singh, Nedim Lipka, Puneet Mathur, Nesreen K. Ahmed, Franck Dernoncourt, Ryan A. Rossi, Hanieh Deilamsalehy

TL;DR

This work presents a dual-topology Multi-LLM summarization framework that combines generation and evaluation rounds to produce high-quality summaries of long documents. It introduces centralized and decentralized interaction schemes, each using chunk-based processing and iterative refinement to integrate diverse model strengths. Across ArXiv and GovReport datasets, the multi-LLM approaches outperform single-LLM baselines by up to 3x, with centralized variants offering robust gains at manageable costs. The findings highlight the practical value of coordinated LLM collaboration for summarization and point to promising opportunities in prompt design and topology exploration.
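The chunk-based processing mentioned above can be pictured with a minimal two-stage sketch: summarize each chunk, then summarize the concatenated chunk summaries. The function names, the word-based splitting, and the `max_words` default are illustrative assumptions, not the paper's implementation; `summarize` stands in for whichever single- or multi-LLM procedure is used.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: chunking by whitespace-separated words; a real system
    would more likely chunk by tokens or by document sections.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long_document(
    text: str,
    summarize: Callable[[str], str],  # hypothetical summarizer callable
    max_words: int = 1000,            # assumed chunk size, for illustration
) -> str:
    """Two-stage summarization: chunk-level summaries, then a final summary."""
    # Stage 1: summarize each chunk independently.
    chunk_summaries = [summarize(c) for c in split_into_chunks(text, max_words)]
    # Stage 2: summarize the concatenation of the intermediate summaries.
    return summarize("\n".join(chunk_summaries))
```

Passing the summarizer in as a callable keeps the chunking logic independent of how each summary is produced, so the same wrapper could sit around either topology.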

Abstract

In this work, we propose a Multi-LLM summarization framework and investigate two multi-LLM strategies: centralized and decentralized. Our framework has two fundamentally important steps at each round of conversation: generation and evaluation. These steps differ depending on whether the decentralized or the centralized strategy is used. In both strategies, k different LLMs generate diverse summaries of the text. During evaluation, however, the centralized approach leverages a single LLM to evaluate the summaries and select the best one, whereas the decentralized approach uses k LLMs. Overall, we find that our multi-LLM summarization approaches significantly outperform the baselines that leverage only a single LLM by up to 3x. These results indicate the effectiveness of multi-LLM approaches for summarization.
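One round of each strategy described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' code: LLMs are modeled as plain Python callables, the `Generator` and `Evaluator` types are hypothetical, and the strict-majority rule with a fallback to the first candidate is an assumed way of combining the k evaluators' choices.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical interfaces: a generator maps text to a summary; an evaluator
# maps a list of candidate summaries to the index of the one it prefers.
Generator = Callable[[str], str]
Evaluator = Callable[[List[str]], int]


def generate(text: str, generators: List[Generator]) -> List[str]:
    """Generation step: each of the k LLMs proposes its own summary."""
    return [g(text) for g in generators]


def centralized_round(
    text: str, generators: List[Generator], central: Evaluator
) -> str:
    """Centralized evaluation: a single LLM picks the best candidate."""
    candidates = generate(text, generators)
    return candidates[central(candidates)]


def decentralized_round(
    text: str, generators: List[Generator], evaluators: List[Evaluator]
) -> str:
    """Decentralized evaluation: all k LLMs vote on the candidates."""
    candidates = generate(text, generators)
    votes = Counter(e(candidates) for e in evaluators)
    winner, count = votes.most_common(1)[0]
    # Assumption: require a strict majority; otherwise fall back to the
    # first candidate. The actual tie-breaking rule is not specified here.
    if count * 2 <= len(evaluators):
        winner = 0
    return candidates[winner]


# Toy stand-ins for LLMs, for demonstration only.
def truncating_llm(n: int) -> Generator:
    """A fake 'LLM' that returns the first n words of the text."""
    return lambda text: " ".join(text.split()[:n])


def prefers_shortest(cands: List[str]) -> int:
    return min(range(len(cands)), key=lambda i: len(cands[i]))


def prefers_longest(cands: List[str]) -> int:
    return max(range(len(cands)), key=lambda i: len(cands[i]))
```

The only structural difference between the two functions is the evaluation step: the generation step is shared, which mirrors the abstract's statement that both strategies use k generating LLMs but differ in how many LLMs evaluate.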

Paper Structure

This paper contains 45 sections, 9 equations, 10 figures, 10 tables, 2 algorithms.

Figures (10)

  • Figure 1: Centralized and Decentralized approaches using a 5-LLM example. Similar topologies can be applied to any number ($k$) of LLMs. In centralized interactions, all models communicate with a central model; in decentralized interactions, each model communicates with every other model and also with itself.
  • Figure 2: Prompt for generating the initial summary in the first round.
  • Figure 3: Generation prompt that is used after the initial round of conversation among the multiple LLMs. Note that the above prompt is for generating the final summary; for chunk-level generation, the input text would simply be the actual chunk.
  • Figure 4: Evaluation prompt for evaluating the summaries generated by different LLMs using our conversational (decentralized) multi-LLM framework. "k" is a parameter reflecting the number of LLMs that generate summaries.
  • Figure 5: Evaluation prompt for evaluating the summaries generated using our conversational (centralized) multi-LLM framework. More specifically, the centralized approach adds an instruction so that, in addition to providing the best summary, the evaluator also outputs a confidence level between 0 and 10. "k" is a parameter reflecting the number of summary-generating LLMs.
  • ...and 5 more figures