IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization

Jie Cao, Dian Jiao, Qiang Yan, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

TL;DR

IDEAL introduces a two-module framework for query-focused summarization (QFS) that combines a Query-aware HyperExpert with a Query-focused Infini-attention mechanism to achieve fine-grained, query-conditioned adaptation and long-context processing. The HyperExpert uses a HyperNetwork to generate instance-specific PEFT adapters (LoRA, Prompt, PAdapter) conditioned on the query, while Infini-attention provides memory-compressed, query-guided attention for extremely long documents. Across CovidET, QMSum, and SQuALITY, IDEAL outperforms the corresponding PEFT methods and other competitive baselines, with ablations confirming the value of query-conditioned parameter generation, HyperExpert configurations, and memory-efficient long-context processing. The work shows practical potential for scalable, controllable QFS in real-world applications, while noting that validation on book-length datasets is left to future work.
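
To make the idea of query-conditioned parameter generation concrete, the following is a minimal sketch of a hypernetwork that emits LoRA factors from a query embedding. It is not the authors' implementation: the class name `QueryHyperLoRA`, the single linear hypernetwork heads `H_A` and `H_B`, and the random initialization are all assumptions made for illustration, and the sketch covers only one layer rather than the layer split described in the paper.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; stands in for trained weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class QueryHyperLoRA:
    """Hypothetical single layer whose LoRA factors depend on the query."""

    def __init__(self, d_query, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.W = rand_matrix(d_out, d_in, rng)  # frozen base weight
        # Hypernetwork heads (assumed linear): query embedding -> flattened A, B.
        self.H_A = rand_matrix(rank * d_in, d_query, rng)
        self.H_B = rand_matrix(d_out * rank, d_query, rng)

    def generate(self, q):
        """Return instance-specific LoRA factors A (rank x d_in), B (d_out x rank)."""
        flat_a = matvec(self.H_A, q)
        flat_b = matvec(self.H_B, q)
        A = [flat_a[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        B = [flat_b[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return A, B

    def forward(self, x, q):
        """Compute W x + B (A x), with A and B generated from the query q."""
        A, B = self.generate(q)
        base = matvec(self.W, x)
        delta = matvec(B, matvec(A, x))
        return [b + d for b, d in zip(base, delta)]


layer = QueryHyperLoRA(d_query=4, d_in=3, d_out=2, rank=2)
x = [1.0, 2.0, 3.0]
y_first = layer.forward(x, [1.0, 0.0, 0.0, 0.0])   # one query embedding
y_second = layer.forward(x, [0.0, 1.0, 0.0, 0.0])  # a different query embedding
```

The same input `x` yields different outputs for different queries because the low-rank update is regenerated per instance, whereas a standard LoRA adapter would apply one fixed update to every input.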

Abstract

Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization. Large language models (LLMs) have shown an impressive capability for textual understanding through large-scale pretraining, which implies great potential for extractive snippet generation. In this paper, we systematically investigate two indispensable characteristics that LLM-based QFS models should harness: Lengthy Document Summarization and Efficiently Fine-grained Query-LLM Alignment. Correspondingly, we propose two modules, Query-aware HyperExpert and Query-focused Infini-attention, to address these characteristics. These innovations pave the way for broader application and accessibility in the field of QFS technology. Extensive experiments conducted on existing QFS benchmarks indicate the effectiveness and generalizability of the proposed approach. Our code is publicly available at https://github.com/DCDmllm/IDEAL_Summary.

Paper Structure (44 sections, 13 equations, 7 figures, 11 tables)

Figures (7)

  • Figure 1: Our Query-aware HyperExperts outperform the corresponding PEFT methods on QFS tasks using a comparable amount of trainable parameters.
  • Figure 2: Overview of IDEAL. We place a regular (non-generated) PEFT Adapter layer in the first $l$ layers, and then use the hidden states of the query instruction to generate the Adapter's parameters for the last $N-l$ layers.
  • Figure 3: Query-focused Infini-attention has a long-term context memory and a query-focused memory with linear attention for processing infinitely long contexts. $KV_{s-1}$ and $KV_s$ are attention keys and values for previous and current input segments, respectively. $Q$ represents the attention queries for the current input segment, while $Q_{ins}$ refers to the attention queries for the input query instruction. PE signifies position embeddings.
  • Figure 4: t-SNE Visualization of Query-based Parameters' Dynamic Characterizations.
  • Figure 5: A comparison of LoRA and IDEAL$_{LoRA}$ under different training sequence lengths on the SQuALITY dataset.
  • ...and 2 more figures
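
As a rough illustration of the linear-attention memory mentioned in the Figure 3 caption, the sketch below implements a generic compressive memory: each segment's keys and values are folded into a fixed-size matrix, and later queries read from it. It assumes the commonly used feature map $\sigma(x) = \mathrm{ELU}(x) + 1$ and a normalizer vector `z`; the class name `CompressiveMemory` and the `eps` constant are hypothetical. The query-focused memory that uses $Q_{ins}$ is not reproduced here, because its exact form is not given in this section.

```python
import math


def elu_plus_one(x):
    """Feature map sigma(x) = ELU(x) + 1, which is strictly positive."""
    return x + 1.0 if x > 0 else math.exp(x)


class CompressiveMemory:
    """Fixed-size memory updated per segment (generic sketch, assumed form)."""

    def __init__(self, d_key, d_value):
        self.M = [[0.0] * d_value for _ in range(d_key)]  # sum of sigma(k) v^T
        self.z = [0.0] * d_key                            # sum of sigma(k)

    def update(self, keys, values):
        """Fold one segment's keys and values into the memory."""
        for k, v in zip(keys, values):
            sk = [elu_plus_one(x) for x in k]
            for i, ski in enumerate(sk):
                self.z[i] += ski
                for j, vj in enumerate(v):
                    self.M[i][j] += ski * vj

    def retrieve(self, queries, eps=1e-6):
        """Read from memory: sigma(q) M / (sigma(q) . z) for each query."""
        d_value = len(self.M[0])
        out = []
        for q in queries:
            sq = [elu_plus_one(x) for x in q]
            denom = sum(a * b for a, b in zip(sq, self.z)) + eps
            out.append([
                sum(sq[i] * self.M[i][j] for i in range(len(sq))) / denom
                for j in range(d_value)
            ])
        return out


# Process a long input segment by segment; memory size never grows.
memory = CompressiveMemory(d_key=2, d_value=3)
segments = [
    ([[0.5, -0.2]], [[1.0, 0.0, 2.0]]),
    ([[-1.0, 0.3]], [[0.0, 4.0, 1.0]]),
]
for seg_keys, seg_values in segments:
    memory.update(seg_keys, seg_values)
read_out = memory.retrieve([[0.1, 0.9]])
```

The memory holds `d_key * d_value + d_key` numbers regardless of how many segments have been processed, which is what allows the context length to grow without a matching growth in attention state.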