Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey

Sourav Verma

TL;DR

This survey explores the evolution of Contextual Compression paradigms, provides an in-depth examination of the field, outlines the current challenges, and suggests potential research and development directions.

Abstract

Large Language Models (LLMs) showcase remarkable abilities, yet they struggle with limitations such as hallucinations, outdated knowledge, opacity, and inexplicable reasoning. To address these challenges, Retrieval-Augmented Generation (RAG) has proven to be a viable solution: it leverages external databases to improve the consistency and coherence of generated content, which is especially valuable for complex, knowledge-rich tasks, and it facilitates continuous improvement by drawing on domain-specific insights. By combining the intrinsic knowledge of LLMs with the vast, dynamic repositories of external databases, RAG achieves a synergistic effect. However, RAG is not without its limitations, including a limited context window, irrelevant information, and high processing overhead for extensive contextual data. In this survey, we explore the evolution of Contextual Compression paradigms and provide an in-depth examination of the field. Finally, we outline the current challenges and suggest potential research and development directions for future work in this area.

Paper Structure (33 sections, 10 figures)

Introduction
Methods
Semantic Compression
Context Distillation
Prompting
Efficient Attention Operations
Extrapolation and Interpolation
Context Window Extension
Pre-Trained Language Models (PLMs)
AutoCompressors
LongNET
In-Context Auto-Encoders
RECOMP
Retrievers
LLMChainExtractor
...and 18 more sections

Figures (10)

Figure 1: Taxonomy of Contextual Compression Methods for Large Language Models.
Figure 2: Internalization of step-by-step reasoning via context distillation [snell2022learning].
Figure 3: Gisting. Each vertical rectangle represents a stack of Transformer activations [mu2024learning].
Figure 4: Prompt tuning reduces the number of task-specific parameters from 11 billion for a tuned model to just 20,480 for a tuned prompt, a reduction of over 5 orders of magnitude [lester-etal-2021-power].
Figure 5: Overview of LLMLingua-2 [wu2024llmlingua2].
...and 5 more figures
