'Keep it Together': Enforcing Cohesion in Extractive Summaries by Simulating Human Memory
Ronald Cardenas, Matthias Galle, Shay B. Cohen
TL;DR
This work tackles the tension between informativeness, redundancy, and cohesion in extractive summarization with a two-stage control framework. The first stage reduces input redundancy during block-level processing. The second stage balances informativeness and cohesion at sentence selection using KvD-Select, a memory-inspired selector that simulates how humans maintain lexical chains while reading. The approach yields summaries with stronger lexical cohesion and smoother topic transitions while maintaining or improving informativeness across diverse domains, as shown by both automatic metrics and human evaluations. The method offers a practical, parameterizable way to generate cohesive multi-sentence extracts, with potential benefits for technical domains where readability and traceability of content are critical.
Abstract
Extractive summaries are usually presented as lists of sentences with no expected cohesion between them. In this paper, we aim to enforce cohesion whilst controlling for informativeness and redundancy in summaries, in cases where the input exhibits high redundancy. The pipeline controls for redundancy in long inputs as they are consumed, and balances informativeness and cohesion during sentence selection. Our sentence selector simulates human memory to keep track of topics (modeled as lexical chains), enforcing cohesive ties between noun phrases. Across a variety of domains, our experiments revealed that it is possible to extract highly cohesive summaries that nevertheless read as informative to humans as summaries extracted by only accounting for informativeness or redundancy. The extracted summaries exhibit smooth topic transitions between sentences as signaled by lexical chains, with chains spanning adjacent or near-adjacent sentences.
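
To make the selection idea concrete, the following is a minimal illustrative sketch and not the KvD-Select procedure itself. It assumes each sentence is already reduced to a set of noun phrases and has a precomputed informativeness score. The function name `select_cohesive`, the linear trade-off weight `lam`, and the fixed-size first-in-first-out memory are all hypothetical choices made for illustration.

```python
from collections import deque


def select_cohesive(sentences, scores, budget=3, memory_size=5, lam=0.5):
    """Greedily pick sentences, trading informativeness against cohesion.

    sentences: list of sets of noun-phrase strings, one set per sentence
    scores:    informativeness score per sentence (assumed precomputed)
    lam:       hypothetical weight; 1.0 ignores cohesion entirely
    """
    # Bounded "working memory" of recently seen noun phrases (an assumption;
    # the oldest phrase is forgotten when the memory is full).
    memory = deque(maxlen=memory_size)
    selected = []
    remaining = set(range(len(sentences)))

    while remaining and len(selected) < budget:

        def gain(i):
            # Cohesion: fraction of the sentence's noun phrases that are
            # still held in memory, i.e. that continue an active chain.
            overlap = len(sentences[i] & set(memory))
            cohesion = overlap / max(len(sentences[i]), 1)
            return lam * scores[i] + (1 - lam) * cohesion

        # Iterate in index order so ties resolve to the earliest sentence.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)

        # Refresh memory: re-mentioned phrases move to the most recent slot.
        for phrase in sorted(sentences[best]):
            if phrase in memory:
                memory.remove(phrase)
            memory.append(phrase)

    # Present the extract in document order.
    return sorted(selected)


# Toy input: four sentences described only by their noun phrases.
toy_sentences = [
    {"summary", "cohesion"},
    {"cohesion", "memory"},
    {"weather"},
    {"memory", "chain"},
]
toy_scores = [0.9, 0.5, 0.6, 0.4]

cohesive = select_cohesive(toy_sentences, toy_scores, lam=0.5)
informative_only = select_cohesive(toy_sentences, toy_scores, lam=1.0)
```

On this toy input, the balanced setting skips the isolated "weather" sentence in favor of sentences that continue the "cohesion" and "memory" chains, whereas `lam=1.0` selects purely by score. The redundancy control applied while the input is consumed is not modeled here.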
