Label-Free Topic-Focused Summarization Using Query Augmentation
Wenchuan Mu, Kwan Hui Lim
TL;DR
Topic-focused summarization often requires large labeled datasets and extensive computation. This work introduces Augmented-Query Summarization (AQS), a label-free pipeline that combines paraphrase generation, question answering, hierarchical clustering, and generic abstractive summarization to produce topic-focused summaries from a query and its context. The authors analyze how query and context variations affect QA transferability and demonstrate a training-free method that adapts to new topics without topic-specific training. On real-world data (Debatepedia, QMSum, and ECF), AQS achieves competitive or superior summary quality with favorable efficiency, highlighting its potential for scalable, cost-effective personalized content extraction in data-rich settings.
Abstract
In today's data- and information-rich world, summarization techniques are essential for distilling large volumes of text into key information, supporting better decision-making and efficiency. Topic-focused summarization is particularly important because it tailors content to specific aspects of a long text. However, it usually requires extensive labelled datasets and considerable computational power. This study introduces a novel method, Augmented-Query Summarization (AQS), for topic-focused summarization that avoids the need for extensive labelled datasets by leveraging query augmentation and hierarchical clustering. The approach lets existing machine learning models transfer to the summarization task without topic-specific training. In tests on real-world data, our method generates relevant and accurate summaries, showing its potential as a cost-effective solution in data-rich environments. It thereby broadens the applicability and accessibility of topic-focused summarization, offering a scalable, efficient method for personalized content extraction.
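To make the flow of stages concrete (query augmentation, question answering, hierarchical clustering, then summarization), the following is a minimal toy sketch and not the authors' implementation. Every function name here is hypothetical. The paraphrase generator is replaced by fixed templates, the QA model by a word-overlap sentence picker, the clustering by a simple single-linkage merge on Jaccard distance, and the abstractive summarizer by a majority vote over the largest cluster. The distance threshold of 0.6 is likewise an arbitrary assumption.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased word set; a crude stand-in for a real text representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return 1.0 - len(ta & tb) / len(union) if union else 0.0


def augment_query(topic):
    # Assumption: fixed templates stand in for a learned paraphrase generator.
    templates = ["{}", "What is said about {}?", "Summarize the discussion of {}."]
    return [t.format(topic) for t in templates]


def answer_query(query, sentences):
    # Assumption: word overlap stands in for a pretrained QA model.
    if not sentences:
        return None
    q = tokens(query)
    best = max(sentences, key=lambda s: len(q & tokens(s)))
    return best if q & tokens(best) else None


def cluster_answers(answers, max_distance=0.6):
    """Agglomerative (single-linkage) clustering; stops merging once the
    closest pair of clusters is farther apart than max_distance."""
    clusters = [[a] for a in answers]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(jaccard_distance(x, y)
                        for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > max_distance:
            break
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters


def summarize(cluster):
    # Assumption: majority vote stands in for a generic abstractive summarizer.
    return Counter(cluster).most_common(1)[0][0]


def aqs_sketch(topic, context):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    answers = []
    for query in augment_query(topic):
        answer = answer_query(query, sentences)
        if answer is not None:
            answers.append(answer)
    if not answers:
        return ""
    largest = max(cluster_answers(answers), key=len)
    return summarize(largest)


if __name__ == "__main__":
    context = ("The phone has a bright screen. "
               "Battery life lasts two days on a single charge. "
               "The camera struggles in low light.")
    print(aqs_sketch("battery life", context))
```

In this sketch no stage is trained on topic labels, which mirrors the label-free framing; in a realistic setting each stand-in would be swapped for a pretrained model.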
