Chain of Summaries: Summarization Through Iterative Questioning
William Brach, Lukas Galke Poech
TL;DR
CoS addresses the challenge of making web content accessible to LLMs by generating information-dense, plain-text summaries through iterative, question-guided refinement inspired by Hegel's dialectic. The method starts from an initial summary (thesis), probes its gaps with synthetic questions (antithesis), and produces a revised summary (synthesis) that best supports downstream QA, iterating to maximize generalization while minimizing token usage. Empirical results on TriviaQA, TruthfulQA, and SQuAD show that CoS outperforms zero-shot baselines and specialized summarizers, often surpassing full-source QA performance while using far fewer tokens. The approach is model-agnostic and suitable for server-side deployment as a content cache, with human oversight and with synthetic QA generation when labeled data are scarce.
Abstract
Large Language Models (LLMs) increasingly rely on external web content. However, much of this content is not easily digestible by LLMs because of LLM-unfriendly formats and context-length limits. To address this issue, we propose a method for generating general-purpose, information-dense summaries that act as plain-text repositories of web content. Inspired by Hegel's dialectical method, our approach, Chain of Summaries (CoS), iteratively refines an initial summary (thesis) by identifying its limitations through questioning (antithesis), leading to a general-purpose summary (synthesis) that can satisfy current information needs and anticipate future ones. Experiments on the TriviaQA, TruthfulQA, and SQuAD datasets demonstrate that CoS outperforms zero-shot LLM baselines by up to 66% and specialized summarization methods such as BRIO and PEGASUS by up to 27%. CoS-generated summaries yield higher Q&A performance than the source content while requiring substantially fewer tokens and being agnostic to the specific downstream LLM. CoS is thus an appealing option for website maintainers to make their content more accessible to LLMs while retaining possibilities for human oversight.
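The thesis, antithesis, synthesis loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `chain_of_summaries`, the callables `summarize`, `refine`, and `answer`, the substring-match scoring, and the stopping rule are all assumptions made for illustration, and the toy stand-ins below replace what would be LLM calls in practice. The sketch also ignores the token-budget aspect of the method.

```python
from typing import Callable, List, Tuple

QA = Tuple[str, str]  # (question, reference answer)


def chain_of_summaries(
    source: str,
    qa_pairs: List[QA],
    summarize: Callable[[str], str],
    refine: Callable[[str, str, List[QA]], str],
    answer: Callable[[str, str], str],
    max_iters: int = 3,
) -> str:
    """Hypothetical CoS-style loop: keep a revised summary only if it answers more questions."""

    def is_hit(summary: str, question: str, reference: str) -> bool:
        # Assumed metric: the reference answer appears in the predicted answer.
        return reference.lower() in answer(summary, question).lower()

    def score(summary: str) -> float:
        if not qa_pairs:
            return 0.0
        return sum(is_hit(summary, q, ref) for q, ref in qa_pairs) / len(qa_pairs)

    best = summarize(source)  # thesis: initial summary
    best_score = score(best)
    for _ in range(max_iters):
        # Antithesis: questions the current summary cannot answer.
        failed = [(q, ref) for q, ref in qa_pairs if not is_hit(best, q, ref)]
        if not failed:
            break
        # Synthesis: revise the summary in light of the failed questions.
        candidate = refine(source, best, failed)
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            break  # no improvement, stop iterating
        best, best_score = candidate, candidate_score
    return best


# --- Toy stand-ins for LLM calls (illustrative only) ---

def sentences(text: str) -> List[str]:
    return [s.strip().rstrip(".") + "." for s in text.split(". ") if s.strip()]


def toy_summarize(source: str) -> str:
    return sentences(source)[0]  # first sentence as the initial summary


def toy_refine(source: str, summary: str, failed: List[QA]) -> str:
    # Add source sentences that mention a reference answer the summary missed.
    extra = [
        s for s in sentences(source)
        if s not in summary and any(ref.lower() in s.lower() for _, ref in failed)
    ]
    return " ".join([summary] + extra)


def toy_answer(summary: str, question: str) -> str:
    return summary  # a real system would query an LLM with the summary as context


SOURCE = (
    "Paris is the capital of France. The Seine flows through Paris. "
    "The Eiffel Tower opened in 1889."
)
QA_PAIRS = [
    ("What is the capital of France?", "Paris"),
    ("Which river flows through Paris?", "Seine"),
    ("When did the Eiffel Tower open?", "1889"),
]
```

Calling `chain_of_summaries(SOURCE, QA_PAIRS, toy_summarize, toy_refine, toy_answer)` starts from the first sentence only and, after one refinement round, returns a summary that also covers the river and the opening year. Passing the three steps as callables keeps the loop independent of any particular model, in the spirit of the model-agnostic claim above.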
