Task-agnostic Prompt Compression with Context-aware Sentence Embedding and Reward-guided Task Descriptor

Barys Liskavets, Shuvendu Roy, Maxim Ushakov, Mark Klibanov, Ali Etemad, Shane Luke

TL;DR

This work addresses prompt compression for large language models by introducing Task-agnostic Prompt Compression (TPC), a general framework that does not rely on input questions or handcrafted templates. It combines a Context-relevant Task Descriptor (CTD), which generates a task description from the prompt, with a context-aware sentence encoder (CSE), which scores the relevance of each sentence to that description, and uses reinforcement learning to fine-tune the descriptor for informative compression. The authors curate two datasets (CTD and MCQR) to train the descriptor and encoder, and evaluate three model sizes (Base, Large, Huge) on LongBench and ZeroSCROLLS in both prompt-aware and prompt-agnostic settings: the largest model outperforms existing state-of-the-art methods, while the smallest achieves competitive results. The approach offers improved generalization across tasks and domains, and the authors plan to release datasets and code to support reproducibility and further development.

Abstract

The rise of Large Language Models (LLMs) has led to significant interest in prompt compression, a technique aimed at reducing the length of input prompts while preserving critical information. However, the prominent approaches in prompt compression often require explicit questions or handcrafted templates for compression, limiting their generalizability. We propose Task-agnostic Prompt Compression (TPC), a novel framework that generalizes compression across tasks and domains without requiring input questions or templates. TPC generates a context-relevant task description using a task descriptor trained on a curated dataset of context and query pairs, and fine-tuned via reinforcement learning with a reward function designed to capture the most relevant information. The task descriptor is then utilized to compute the relevance of each sentence in the prompt to generate the compressed prompt. We introduce three model sizes (Base, Large, and Huge), where the largest model outperforms the existing state-of-the-art methods on LongBench and ZeroSCROLLS benchmarks, and our smallest model performs comparably to the existing solutions while being considerably smaller.
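
The sentence-selection step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `toy_embed` is a bag-of-words stand-in for the learned sentence encoder, cosine similarity stands in for the relevance score, and the greedy word-count budget in `compress` is a hypothetical selection rule.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for a learned sentence encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compress(sentences, task_description, token_budget):
    """Keep the most relevant sentences that fit the budget, in original order.

    token_budget is counted in whitespace-separated words (an assumption;
    a real system would count tokenizer tokens).
    """
    task_vec = toy_embed(task_description)
    scored = [(cosine(toy_embed(s), task_vec), i) for i, s in enumerate(sentences)]
    kept, used = [], 0
    # Greedy pass from most to least relevant; ties go to the earlier sentence.
    for _score, i in sorted(scored, key=lambda pair: (-pair[0], pair[1])):
        cost = len(sentences[i].split())
        if used + cost <= token_budget:
            kept.append(i)
            used += cost
    # Restore document order so the compressed prompt stays readable.
    return [sentences[i] for i in sorted(kept)]
```

Calling `compress(sentences, task_description, token_budget)` with a task description produced by any descriptor returns the retained sentences in their original order; swapping `toy_embed` for a trained encoder would not change the surrounding logic.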

Paper Structure

This paper contains 25 sections, 6 equations, 6 figures, 3 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Comparison of model size versus performance for different prompt compression methods in both prompt-aware and prompt-agnostic setups. Our largest model, TPC-Huge, outperforms all existing methods while maintaining a comparable size to existing solutions. On the other hand, our smallest model, TPC-Base, achieves a competitive performance despite being significantly smaller in size.
  • Figure 2: Illustration of our proposed prompt compression method. The CTD module generates a task description that is relevant to the input context. This description is then utilized by the Context-Aware Sentence Encoder to evaluate the relevance of each sentence in the input prompt, ultimately generating the compressed prompt.
  • Figure 3: Overview of our proposed reward system for refining CTD with RL. (left) A response is generated by the pre-trained LLM with the complete long input prompt. (right) The CTD and CSE modules generate the compressed prompt. KL divergence between the conditional distribution of the generated response from the long prompt and the compressed prompt is used as the reward signal for the RL.
  • Figure 4: A qualitative example of prompt compression with TPC.
  • Figure 5: Sensitivity study on CTD training epochs (left) and the number of RL iterations (right). Here, an RL iteration of 0 indicates no reward-guided refinement training.
  • ...and 1 more figure
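
The reward signal described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the caption does not specify the direction of the KL divergence, how it is aggregated over response tokens, or its sign, so the sketch assumes KL(full || compressed), a mean over tokens, and a negated value so that a higher reward means the compressed prompt preserves the LLM's behaviour more closely. The function names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    # Terms with p_i == 0 contribute nothing; q_i is floored to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def compression_reward(full_dists, compressed_dists):
    """Negative mean per-token KL between next-token distributions.

    full_dists[t] is the vocabulary distribution for response token t when the
    LLM is conditioned on the full prompt; compressed_dists[t] is the same
    position conditioned on the compressed prompt.
    """
    if len(full_dists) != len(compressed_dists):
        raise ValueError("both inputs must cover the same response tokens")
    if not full_dists:
        return 0.0
    total = sum(kl_divergence(p, q) for p, q in zip(full_dists, compressed_dists))
    return -total / len(full_dists)
```

Under these assumptions the reward is 0 when both prompts induce identical distributions and becomes more negative as the compressed prompt shifts the LLM's predictions away from those obtained with the full prompt.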