Topic-Controllable Summarization: Topic-Aware Evaluation and Transformer Methods
Tatiana Passali, Grigorios Tsoumakas
TL;DR
This work tackles topic-controllable summarization by (1) introducing STAS, a topic-aware evaluation metric based on cosine similarity between topic and summary representations, normalised by the dominant topic, and (2) adapting topic control to Transformer architectures via two strategies: topic embeddings and control tokens. It demonstrates that control tokens, especially when combined with prepending and tagging, yield higher topic alignment and faster inference than embedding-based methods, including in zero-shot settings. A synthetic topic-oriented CNN/DailyMail dataset is released to train and evaluate models, and STAS is validated against human judgments with high correlation. The results show substantial improvements in topic focus (STAS) while maintaining competitive ROUGE scores, indicating practical impact for generating topic-focused summaries in real-world applications. The paper also points toward future work on broader controllable attributes, arbitrary-topic tagging, and richer contextual embeddings to further enhance topic-aligned summarization. STAS offers a scalable, interpretable automatic evaluation aligned with user-topic requirements, which is valuable for developers, search engines, and AI chat systems seeking contextually focused summaries.
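As a rough illustration of the STAS idea described above, the following sketch computes the cosine similarity between a summary vector and each topic vector, then divides the similarity for the desired topic by the highest similarity across all topics (the dominant topic). The function names, the plain-list vector representation, and the exact form of the normalisation are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def stas_sketch(summary_vec, topic_vecs, target_topic):
    """Hypothetical STAS-style score.

    Assumption: the score is the similarity to the target topic divided by
    the similarity to the dominant (most similar) topic.
    """
    sims = {topic: cosine(summary_vec, vec) for topic, vec in topic_vecs.items()}
    dominant = max(sims.values())
    if dominant <= 0.0:
        return 0.0  # the summary has no positive affinity with any topic
    return sims[target_topic] / dominant


# Toy two-dimensional topic vectors, invented for this example.
topics = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}
summary = [3.0, 4.0]
score_sports = stas_sketch(summary, topics, "sports")
score_politics = stas_sketch(summary, topics, "politics")
```

Under this assumed normalisation, a score of 1.0 means the desired topic is also the dominant topic of the summary, and lower values mean another topic dominates.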
Abstract
Topic-controllable summarization is an emerging research area with a wide range of potential applications. However, existing approaches suffer from significant limitations. For example, the majority of existing methods are built upon recurrent architectures, which can significantly limit their performance compared to more recent Transformer-based architectures, and they also require modifications to the model's architecture to control the topic. At the same time, there is currently no established evaluation metric designed specifically for topic-controllable summarization. This work proposes a new topic-oriented evaluation measure that automatically evaluates generated summaries based on the topic affinity between the generated summary and the desired topic. The reliability of the proposed measure is demonstrated through an appropriately designed human evaluation. In addition, we adapt topic embeddings to work with powerful Transformer architectures and propose a novel and efficient approach for guiding summary generation through control tokens. Experimental results reveal that control tokens can achieve better performance than more complicated embedding-based approaches while also being significantly faster.
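To make the control-token idea concrete, the sketch below shows one way a topic could be injected into the input text without changing the model architecture: a topic token is prepended to the document, and words related to the topic are optionally marked with a tag token. The token formats (`[SPORTS]`, `[TAG]`), the function names, and the simple whitespace tokenisation are hypothetical choices for illustration, not the paper's exact preprocessing.

```python
def prepend_topic(document, topic):
    """Prepend a topic control token to the document (token format is an assumption)."""
    return f"[{topic.upper()}] {document}"


def tag_topic_words(document, topic_words, tag="[TAG]"):
    """Insert a tag token before every word that belongs to the topic vocabulary.

    Punctuation is stripped only for matching; the original word is kept as-is.
    """
    tagged = []
    for word in document.split():
        core = word.strip(".,;:!?\"'()").lower()
        if core in topic_words:
            tagged.append(tag)
        tagged.append(word)
    return " ".join(tagged)


# Toy document and topic vocabulary, invented for this example.
doc = "The team won the match."
sports_words = {"team", "match"}
prepended = prepend_topic(doc, "sports")
tagged = tag_topic_words(doc, sports_words)
combined = prepend_topic(tagged, "sports")
```

Because the control signal lives entirely in the input string, a standard pretrained Transformer summarizer can in principle be fine-tuned on such inputs unchanged, which is consistent with the abstract's point that no architectural modification is required.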
