Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation
Yichao Feng, Shuai Zhao, Yueqiu Li, Luwei Xiao, Xiaobao Wu, Anh Tuan Luu
TL;DR
This paper tackles aspect-based summarization and targets three weaknesses of standard LLM-based methods: heavy reliance on prompt engineering, restricted input length, and hallucination, especially under in-context learning. It introduces Self-Aspect Retrieval Enhanced Summary Generation (SARESG), a dense-embedding retrieval and pruning framework that extracts the content relevant to a given aspect, prunes unrelated text to stay within token limits, and reranks the retrieved context before generation. Across the USB, OAsum, and Ma-news datasets, SARESG consistently improves aspect-aligned summarization metrics (METEOR, ROUGE, BERTScore), and the token space it frees makes in-context learning (ICL) practical. The work highlights the benefits of chunk-level retrieval, analyzes chunk-size effects, and shows that retrieval-guided ICL can yield robust performance across model sizes, while noting limitations in ICL stability and resource demands that warrant future optimization.
Abstract
Aspect-based summarization aims to generate summaries tailored to specific aspects, addressing the resource constraints and limited generalizability of traditional summarization approaches. Recently, large language models have shown promise in this task without the need for training. However, they rely excessively on prompt engineering and face token limits and hallucination challenges, especially with in-context learning. To address these challenges, we propose a novel framework for aspect-based summarization: Self-Aspect Retrieval Enhanced Summary Generation. Rather than relying solely on in-context learning, given an aspect, we employ an embedding-driven retrieval mechanism to identify the text segments relevant to it. This approach extracts the pertinent content while avoiding unnecessary details, thereby mitigating the challenge of token limits. Moreover, our framework optimizes token usage by deleting unrelated parts of the text and ensuring that the model generates output strictly based on the given aspect. With extensive experiments on benchmark datasets, we demonstrate that our framework not only achieves superior performance but also effectively mitigates the token limitation problem.
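The retrieve-then-prune idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`chunk_sentences`, `embed`, `retrieve_for_aspect`), the sentence-based chunking, and the word-count token budget are all hypothetical choices made for illustration. In particular, `embed` is a term-count vector standing in for a real dense encoder so that the example runs with the standard library only, and the reranking step mentioned in the TL;DR is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude proxy for a model tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_sentences(document, sentences_per_chunk=1):
    """Split the document into chunks of consecutive sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text):
    """ASSUMPTION: term counts stand in for a dense embedding model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_for_aspect(document, aspect, token_budget=40, sentences_per_chunk=1):
    """Keep the chunks most similar to the aspect that fit the token budget."""
    chunks = chunk_sentences(document, sentences_per_chunk)
    aspect_vec = embed(aspect)
    scored = [(cosine(embed(c), aspect_vec), i, c) for i, c in enumerate(chunks)]
    # Highest similarity first; ties broken by original position.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))

    kept, used = [], 0
    for score, idx, chunk in ranked:
        cost = len(tokenize(chunk))
        # Prune chunks unrelated to the aspect or too large for the budget.
        if score <= 0.0 or used + cost > token_budget:
            continue
        kept.append((idx, chunk))
        used += cost
    # Restore document order so the pruned context reads coherently.
    return [chunk for _, chunk in sorted(kept)]


if __name__ == "__main__":
    doc = (
        "The team won the championship in 1998. "
        "The stadium was built with public funding. "
        "Funding for the new stadium came from a city bond. "
        "The coach retired in 2001."
    )
    for chunk in retrieve_for_aspect(doc, "stadium funding"):
        print(chunk)
```

In this sketch the pruned chunks, rather than the full document, would be passed to the language model together with the aspect, and whatever remains of the token budget could hold in-context examples. Swapping `embed` for a real sentence encoder changes only that one function.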
