LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

Qingfei Zhao, Ruobing Wang, Yukuo Cen, Daren Zha, Shicheng Tan, Yuxiao Dong, Jie Tang

TL;DR

LongRAG is a general, dual-perspective, and robust LLM-based RAG system paradigm for LCQA that enhances RAG’s understanding of complex long-context knowledge. It is designed as a plug-and-play paradigm, facilitating adaptation to various domains and LLMs.

Abstract

Long-Context Question Answering (LCQA), a challenging task, aims to reason over long-context documents to yield accurate answers to questions. Existing long-context Large Language Models (LLMs) for LCQA often struggle with the "lost in the middle" issue. Retrieval-Augmented Generation (RAG) mitigates this issue by providing external factual evidence. However, its chunking strategy disrupts the global long-context information, and its low-quality retrieval in long contexts hinders LLMs from identifying effective factual details due to substantial noise. To this end, we propose LongRAG, a general, dual-perspective, and robust LLM-based RAG system paradigm for LCQA to enhance RAG's understanding of complex long-context knowledge (i.e., global information and factual details). We design LongRAG as a plug-and-play paradigm, facilitating adaptation to various domains and LLMs. Extensive experiments on three multi-hop datasets demonstrate that LongRAG significantly outperforms long-context LLMs (up by 6.94%), advanced RAG (up by 6.16%), and Vanilla RAG (up by 17.25%). Furthermore, we conduct quantitative ablation studies and multi-dimensional analyses, highlighting the effectiveness of the system's components and fine-tuning strategies. Data and code are available at https://github.com/QingFei1/LongRAG.

Paper Structure

This paper contains 37 sections, 5 equations, 4 figures, 26 tables.

Figures (4)

  • Figure 1: Examples of Different Methods. Long-Context LLMs and Vanilla RAG face "lost in the middle" and "incomplete key information" issues, while LongRAG addresses them, yielding a perfect answer.
  • Figure 2: An overview of LongRAG. Our system involves four sub-components: the Hybrid Retriever receives a question and retrieves the top-$k$ most relevant chunks $p_c$; the CoT-guided Filter generates global key clues and uses them to analyze the relevance of the chunks one by one, obtaining a set of "True" chunks as $I_d$; meanwhile, the LLM-augmented Information Extractor sequentially maps $p_c$ to the source long-context paragraph $p$ to extract effective global information $I_g$; the LLM-augmented Generator promotes knowledge interaction between $I_g$ and $I_d$ to generate the final answer.
  • Figure 3: Trends of token lengths fed into the Generator $\mathcal{G}$ of five component strategies on three datasets.
  • Figure 4: Analysis of the transferability of Extractor&Filter on dataset MusiQue.
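
The four-component flow described in the Figure 2 caption can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the bracketed prompt tags, the word-overlap retriever (a toy stand-in for the Hybrid Retriever), and every function name are hypothetical. The choice to build the CoT clues from the same source paragraphs the extractor sees is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


@dataclass
class Chunk:
    text: str
    paragraph_id: int  # index of the source paragraph p this chunk was cut from


def retrieve(question: str, chunks: List[Chunk], k: int) -> List[Chunk]:
    # Toy stand-in for the Hybrid Retriever: rank chunks by word overlap.
    q_words = set(question.lower().split())
    return sorted(
        chunks,
        key=lambda c: len(q_words & set(c.text.lower().split())),
        reverse=True,
    )[:k]


def source_context(top_chunks: List[Chunk], paragraphs: List[str]) -> str:
    # Map each chunk p_c back to its source paragraph p, dropping repeats.
    ids = list(dict.fromkeys(c.paragraph_id for c in top_chunks))
    return "\n\n".join(paragraphs[i] for i in ids)


def extract(question: str, context: str, llm: LLM) -> str:
    # Information Extractor: ask the LLM for global information I_g.
    return llm(f"[extract]\nQuestion: {question}\nContext: {context}")


def cot_filter(question: str, top_chunks: List[Chunk], context: str, llm: LLM) -> List[Chunk]:
    # CoT-guided Filter: generate key clues once, then judge chunks one by one.
    clues = llm(f"[clues]\nQuestion: {question}\nContext: {context}")
    kept = []
    for c in top_chunks:
        verdict = llm(f"[judge]\nClues: {clues}\nQuestion: {question}\nChunk: {c.text}")
        if verdict.strip().lower().startswith("true"):
            kept.append(c)  # only "True" chunks become part of I_d
    return kept


def generate(question: str, i_g: str, i_d: List[Chunk], llm: LLM) -> str:
    # Generator: answer from both perspectives, I_g and I_d.
    details = "\n".join(c.text for c in i_d)
    return llm(f"[answer]\nQuestion: {question}\nGlobal information: {i_g}\nDetails: {details}")


def long_rag_sketch(question: str, chunks: List[Chunk], paragraphs: List[str],
                    llm: LLM, k: int = 2) -> Dict[str, object]:
    top = retrieve(question, chunks, k)
    context = source_context(top, paragraphs)
    i_g = extract(question, context, llm)
    i_d = cot_filter(question, top, context, llm)
    return {"I_g": i_g, "I_d": i_d, "answer": generate(question, i_g, i_d, llm)}
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular model or client library, which loosely mirrors the plug-and-play framing in the abstract.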