Familiarity-Aware Evidence Compression for Retrieval-Augmented Generation

Dongwon Jung; Qin Liu; Tenghao Huang; Ben Zhou; Muhao Chen

TL;DR

FaviComp addresses the difficulty of integrating multiple pieces of retrieved evidence in retrieval-augmented generation with a training-free, inference-time compression method that makes the evidence more familiar to the target LM. It uses ensemble decoding, which blends the token probabilities of the compression model and the target model, lowering the target model's perplexity on the compressed context while drawing on its parametric knowledge. Across five open-domain QA datasets, FaviComp outperforms most baselines and even surpasses Gold Compression on at least one multi-document dataset, with the best performance near an ensemble weight of $\alpha=0.5$. The method is model- and prompt-agnostic, applies to various RAG pipelines, and improves accuracy substantially at high compression rates, which makes it practical for knowledge-intensive tasks.
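To make the ensemble decoding idea concrete, here is a minimal sketch of a single greedy decoding step on a toy three-token vocabulary. It rests on assumptions not stated in this summary: the blend is taken as a weighted sum of log-probabilities, $\alpha$ is applied to the compression model and $1-\alpha$ to the target model, and the function names (`ensemble_next_token`, `perplexity`) and all logit values are hypothetical. It is an illustration, not the authors' implementation.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of raw scores.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def ensemble_next_token(comp_logits, target_logits, alpha=0.5):
    # Assumption: alpha weights the compression model and (1 - alpha)
    # weights the target model; the combination is in log space.
    comp_lp = log_softmax(comp_logits)
    tgt_lp = log_softmax(target_logits)
    scores = [alpha * c + (1 - alpha) * t for c, t in zip(comp_lp, tgt_lp)]
    # Greedy choice: index of the highest blended score.
    return max(range(len(scores)), key=scores.__getitem__)

def perplexity(token_logprobs):
    # Perplexity = exp of the mean negative log-probability per token.
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical logits for one step over a toy vocabulary.
vocab = ["Paris", "Lutetia", "London"]
comp_logits = [2.0, 2.5, 0.1]    # compression model slightly prefers "Lutetia"
target_logits = [3.0, 0.0, 0.5]  # target model strongly prefers "Paris"

compression_only = vocab[ensemble_next_token(comp_logits, target_logits, alpha=1.0)]
blended = vocab[ensemble_next_token(comp_logits, target_logits, alpha=0.5)]
```

In this toy case the compression model alone picks a token the target model finds unlikely, while the blended score selects a token both models rate highly. That is the sense in which the compressed text becomes more familiar to the target model, and `perplexity` shows how that familiarity could be measured from the target model's per-token log-probabilities.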

Abstract

Retrieval-augmented generation (RAG) improves large language models (LMs) by incorporating non-parametric knowledge through evidence retrieved from external sources. However, it often struggles to cope with inconsistent and irrelevant information that can distract the LM from its tasks, especially when multiple evidence pieces are required. Compressing the retrieved evidence with a compression model aims to address this issue, but the compressed evidence may still be unfamiliar to the target model used for downstream tasks, which may then fail to use the evidence effectively. We propose FaviComp (Familiarity-Aware Evidence Compression), a novel training-free evidence compression technique that makes retrieved evidence more familiar to the target model while seamlessly integrating parametric knowledge from the model. Experimental results show that FaviComp consistently outperforms most recent evidence compression baselines across multiple open-domain QA datasets, improving accuracy by up to 28.1% while achieving high compression rates. Additionally, we demonstrate the effective integration of both parametric and non-parametric knowledge during evidence compression.

Paper Structure (30 sections, 4 equations, 6 figures, 11 tables)

Introduction
Method
Motivation and Method Overview
RAG with Evidence Compression
Ensemble Decoding for FaviComp
Experimental Settings
Datasets
Implementation Details
Baselines
Experimental Results
Main Results
Impact of Ensemble Coefficient on Performance and Perplexity
Integration of Parametric and Non-parametric Knowledge
Compression Rate Comparisons
Case Study
...and 15 more sections

Figures (6)

Figure 1: An overview of FaviComp. Instead of relying solely on compressed evidence from the compression model (upper), FaviComp familiarizes the compressed evidence to the target model while integrating parametric knowledge through ensemble decoding, resulting in improved downstream performance (lower).
Figure 2: Impact of coefficient $\alpha$ on performance and perplexity when using Llama3.2-3B-Instruct and Llama3-8B-Instruct compression-target pairs.
Figure 3: Accuracy of baseline methods on the $\mathrm{Hits=0}$ and $\mathrm{Hits=1}$ subsets of multi-document QA datasets.
Figure 4: Accuracy of FaviComp with various $\alpha$ values on the $\mathrm{Hits=0}$ and $\mathrm{Hits=1}$ subsets of multi-document QA datasets.
Figure 5: Evaluation Prompt Template.
...and 1 more figure
