Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation

Kaize Shi, Xueyao Sun, Qing Li, Guandong Xu

TL;DR

The paper addresses hallucinations in LLMs when handling long-tail queries by enriching Retrieval Augmented Generation (RAG) with a semantic compression step. It proposes an AMR-based concept distillation algorithm that converts noisy retrieved context into a compact set of informative concepts via DFS traversal of AMR graphs, and integrates these concepts into a faithfulness-focused prompt for LLM inference. The method is evaluated on open-domain QA datasets PopQA and EntityQuestions, showing improved accuracy and stronger integration as more supporting documents are used, with robustness across backbone LLMs. The work demonstrates that semantic-based context compression can significantly reduce interference from irrelevant information, enabling both larger and smaller LLMs to perform more reliably in long-context RAG settings.

Abstract

Large Language Models (LLMs) have made significant strides in information acquisition. However, their overreliance on potentially flawed parametric knowledge leads to hallucinations and inaccuracies, particularly when handling long-tail, domain-specific queries. Retrieval Augmented Generation (RAG) addresses this limitation by incorporating external, non-parametric knowledge. Nevertheless, the retrieved long-context documents often contain noisy, irrelevant information alongside vital knowledge, negatively diluting LLMs' attention. Inspired by the supportive role of essential concepts in individuals' reading comprehension, we propose a novel concept-based RAG framework with the Abstract Meaning Representation (AMR)-based concept distillation algorithm. The proposed algorithm compresses the cluttered raw retrieved documents into a compact set of crucial concepts distilled from the informative nodes of AMR by referring to reliable linguistic features. The concepts explicitly constrain LLMs to focus solely on vital information in the inference process. We conduct extensive experiments on open-domain question-answering datasets to empirically evaluate the proposed method's effectiveness. The results indicate that the concept-based RAG framework outperforms other baseline methods, particularly as the number of supporting documents increases, while also exhibiting robustness across various backbone LLMs. This emphasizes the distilled concepts are informative for augmenting the RAG process by filtering out interference information. To the best of our knowledge, this is the first work introducing AMR to enhance the RAG, presenting a potential solution to augment inference performance with semantic-based context compression.
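
To make the distillation idea concrete, the following is a minimal sketch of a depth-first traversal that collects concept labels from an AMR-style graph. It is not the paper's algorithm: the toy graph, the `distill_concepts` helper, and the rule of skipping the structural `name` node are all assumptions chosen for illustration, and the paper's actual selection rules based on linguistic features are not reproduced here.

```python
from typing import Dict, FrozenSet, List, Tuple

# Toy AMR-like graph: node id -> (concept label, [(role, child id or constant)]).
# Hand-written for "Marie Curie was born in Warsaw"; NOT real parser output.
Graph = Dict[str, Tuple[str, List[Tuple[str, str]]]]

TOY_GRAPH: Graph = {
    "b": ("bear-02", [(":ARG1", "p"), (":location", "c")]),
    "p": ("person", [(":name", "n1")]),
    "n1": ("name", [(":op1", "Marie"), (":op2", "Curie")]),
    "c": ("city", [(":name", "n2")]),
    "n2": ("name", [(":op1", "Warsaw")]),
}


def distill_concepts(
    graph: Graph,
    root: str,
    skip: FrozenSet[str] = frozenset({"name"}),  # assumed filter, not the paper's
) -> List[str]:
    """Collect concept labels in DFS (pre-order), without duplicates."""
    concepts: List[str] = []
    visited = set()

    def visit(node: str) -> None:
        if node not in graph:
            # Anything that is not a node id is treated as a constant leaf.
            if node not in concepts:
                concepts.append(node)
            return
        if node in visited:
            # AMR graphs can be re-entrant; expand each node only once.
            return
        visited.add(node)
        label, edges = graph[node]
        # Drop a numeric sense suffix such as "-02" from predicate labels.
        head, sep, tail = label.rpartition("-")
        base = head if sep and tail.isdigit() else label
        if base not in skip and base not in concepts:
            concepts.append(base)
        for _role, child in edges:
            visit(child)

    visit(root)
    return concepts


if __name__ == "__main__":
    print(distill_concepts(TOY_GRAPH, "b"))
```

In a pipeline of the kind the abstract describes, a list like this would replace the raw retrieved passage in the prompt, so the model sees a short set of concepts instead of the full noisy context.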

Paper Structure

This paper contains 21 sections, 2 equations, 8 figures, 11 tables, 1 algorithm.

Figures (8)

  • Figure 1: The examples of concept-based RAG.
  • Figure 2: The overview of the concept-based RAG framework, which consists of three main components: (a) information retrieval, (b) concept distillation, and (c) concept-based inference.
  • Figure 3: The evaluation results of the $Acc. \uparrow$ trends and $Intg. \uparrow$ on the PopQA dataset. The vertical axis represents $Acc.$, and the horizontal axis represents the number of supporting documents, $\mathcal{K}$. The polyline shows how $Acc.$ changes with $\mathcal{K}$, and the area under it is $Intg.$
  • Figure 4: The evaluation results of the $Acc. \uparrow$ trends and $Intg. \uparrow$ on the EntityQuestions dataset. The axes and symbols are defined as in Fig. 3.
  • Figure : Concept Distillation
  • ...and 3 more figures
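
The captions for Figures 3 and 4 describe $Intg.$ as the area under the $Acc.$ polyline over $\mathcal{K}$. A minimal sketch of that computation follows, assuming the area is taken with the trapezoidal rule over the sampled $\mathcal{K}$ values; the function name `integration_score` and the numbers in the usage example are made up for illustration and are not results from the paper.

```python
from typing import Sequence


def integration_score(ks: Sequence[float], accs: Sequence[float]) -> float:
    """Area under the accuracy polyline, via the trapezoidal rule.

    ks   : numbers of supporting documents, strictly increasing
    accs : accuracy measured at each value in ks
    """
    if len(ks) != len(accs):
        raise ValueError("ks and accs must have the same length")
    if len(ks) < 2:
        raise ValueError("need at least two points to form an area")

    area = 0.0
    for i in range(1, len(ks)):
        width = ks[i] - ks[i - 1]
        if width <= 0:
            raise ValueError("ks must be strictly increasing")
        # Each segment of the polyline contributes one trapezoid.
        area += width * (accs[i] + accs[i - 1]) / 2.0
    return area


if __name__ == "__main__":
    # Illustrative values only.
    print(integration_score([1, 2, 4], [0.5, 0.6, 0.8]))
```

Under this reading, a method with a higher $Intg.$ keeps its accuracy higher across the whole range of $\mathcal{K}$, rather than at a single setting.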