BiSparse-AAS: Bilinear Sparse Attention and Adaptive Spans Framework for Scalable and Efficient Text Summarization
Desta Haileselassie Hagos, Legand L. Burge, Anietie Andy, Anis Yazidi, Vladimir Vlassov
TL;DR
BiSparse-AAS is a unified, efficient framework for long-sequence text summarization that integrates bilinear attention with sparse attention and adaptive attention spans. By replacing standard self-attention with a learnable bilinear form and dynamically masking and adjusting attention ranges, the approach achieves near-linear scalability while preserving contextual coherence. Experiments on CNN/DailyMail, XSum, OpenWebText, and Gigaword show strong ROUGE and semantic-metric scores, with notable parameter and compute reductions compared to GPT-2-like baselines. The framework serves as a drop-in, resource-friendly solution for extractive and abstractive summarization and offers a scalable path for broader long-sequence NLP tasks.
Abstract
Transformer-based architectures have advanced text summarization, yet their quadratic complexity limits scalability on long documents. This paper introduces BiSparse-AAS (Bilinear Sparse Attention with Adaptive Spans), a novel framework that combines sparse attention, adaptive spans, and bilinear attention to address these limitations. Sparse attention reduces computational costs by focusing on the most relevant parts of the input, while adaptive spans dynamically adjust the attention ranges. Bilinear attention complements both by modeling complex token interactions within this refined context. BiSparse-AAS consistently outperforms state-of-the-art baselines in both extractive and abstractive summarization tasks, achieving average ROUGE improvements of 68.1% on CNN/DailyMail and 52.6% on XSum, while maintaining strong performance on OpenWebText and Gigaword datasets. By addressing efficiency, scalability, and long-sequence modeling, BiSparse-AAS provides a unified, practical solution for real-world text summarization applications.
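To make the three ingredients concrete, the following is a minimal sketch of how a bilinear score, an adaptive-span mask, and a sparse top-k selection could be combined for a single attention head. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `top_k` selection rule, the `ramp` parameter, and the toy inputs are all hypothetical, and the soft span mask follows the ramp form commonly used in the adaptive attention span literature, which may differ from what BiSparse-AAS uses.

```python
import math

def bilinear_scores(X, W):
    """score[i][j] = x_i^T W x_j for token vectors X (n x d) and a d x d matrix W."""
    n, d = len(X), len(W)
    # XW[i] holds the row vector x_i^T W.
    XW = [[sum(X[i][a] * W[a][b] for a in range(d)) for b in range(d)]
          for i in range(n)]
    return [[sum(XW[i][b] * X[j][b] for b in range(d)) for j in range(n)]
            for i in range(n)]

def span_mask(distance, span, ramp):
    """Soft mask in [0, 1]: 1 inside the span, linear decay over `ramp` beyond it.

    Assumed form: clamp((ramp + span - distance) / ramp, 0, 1), with ramp > 0.
    """
    return min(1.0, max(0.0, (ramp + span - distance) / ramp))

def bisparse_attention(X, W, span, ramp=1.0, top_k=2):
    """Return an n x n matrix of attention weights (hypothetical combination).

    For each query i: score all keys bilinearly, drop keys whose span mask is
    zero, keep the top_k highest-scoring survivors, then apply a masked softmax.
    """
    n = len(X)
    scores = bilinear_scores(X, W)
    weights_all = []
    for i in range(n):
        masks = [span_mask(abs(i - j), span, ramp) for j in range(n)]
        candidates = [j for j in range(n) if masks[j] > 0.0]
        # Sparse step: keep only the top_k keys by score (illustrative rule).
        kept = sorted(candidates, key=lambda j: scores[i][j], reverse=True)[:top_k]
        peak = max(scores[i][j] for j in kept)  # subtract for numerical stability
        weights = [0.0] * n
        for j in kept:
            weights[j] = masks[j] * math.exp(scores[i][j] - peak)
        total = sum(weights)
        weights_all.append([w / total for w in weights])
    return weights_all

# Toy input: 4 tokens with 2-dimensional embeddings (made-up values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity, so scores reduce to dot products
A = bisparse_attention(X, W, span=1, ramp=1.0, top_k=2)
```

In this sketch each query attends to at most `top_k` keys inside its span, so the number of nonzero weights grows linearly with sequence length rather than quadratically. The sketch still computes the full score matrix for simplicity; an efficient version would score only the keys inside each span. In a trained model, `W` and `span` would be learned parameters rather than fixed values.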
