Don't Forget to Connect! Improving RAG with Graph-based Reranking
Jialin Dong, Bahare Fatemi, Bryan Perozzi, Lin F. Yang, Anton Tsitsulin
TL;DR
The paper addresses a limitation of Retrieval-Augmented Generation (RAG) in open-domain question answering (ODQA): connections across retrieved documents are underused. It introduces G-RAG, a graph-based reranker that builds document graphs informed by Abstract Meaning Representation (AMR) and applies a graph neural network to rank the retrieved documents. The model is trained with a pairwise ranking loss and evaluated with a new set of tie-aware metrics. Empirical results on Natural Questions and TriviaQA show that G-RAG, especially when trained with the ranking loss (RL), outperforms baselines while requiring less computational overhead. Zero-shot PaLM 2 used as a reranker underperforms, which underscores the importance of dedicated reranking architectures. By integrating cross-document structure and semantic AMR information, G-RAG improves the grounding and relevance of the documents fed to the reader, with practical implications for more efficient and accurate ODQA systems.
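To make the two training and evaluation ideas above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's exact formulation: the loss is a generic margin-based pairwise hinge loss, and the tie handling assigns every document in a tied group the mean of the ranks that group occupies. The function names, the `margin` parameter, and the binary `labels` format are all hypothetical.

```python
from typing import List


def pairwise_ranking_loss(scores: List[float], labels: List[int],
                          margin: float = 1.0) -> float:
    """Mean hinge loss over all (positive, negative) document pairs.

    Assumption: labels are 1 for documents containing the answer, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # no pair to compare
    # A pair is penalised unless the positive beats the negative by `margin`.
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def tie_aware_reciprocal_rank(scores: List[float], labels: List[int]) -> float:
    """Reciprocal rank of the best-scoring positive document.

    Assumption: documents with equal scores share the mean of the ranks
    their tied group spans, so a tie never counts as a clean first place.
    """
    pos_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not pos_scores:
        return 0.0
    best = max(pos_scores)
    higher = sum(1 for s in scores if s > best)   # documents strictly ahead
    tied = sum(1 for s in scores if s == best)    # includes the positive itself
    mean_rank = higher + (tied + 1) / 2
    return 1.0 / mean_rank


if __name__ == "__main__":
    scores = [0.9, 0.9, 0.1]
    labels = [0, 1, 0]
    print(pairwise_ranking_loss(scores, labels))
    print(tie_aware_reciprocal_rank(scores, labels))
```

In the example, the positive document ties with a negative one for the top score, so it receives a mean rank of 1.5 rather than 1. This is the kind of case where a metric that ignores ties would overstate the reranker's quality.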
Abstract
Retrieval Augmented Generation (RAG) has greatly improved the performance of Large Language Model (LLM) responses by grounding generation with context from existing documents. These systems work well when documents are clearly relevant to a question context. But what about when a document has partial information, or less obvious connections to the context? And how should we reason about connections between documents? In this work, we seek to answer these two core questions about RAG generation. We introduce G-RAG, a reranker based on graph neural networks (GNNs) that sits between the retriever and the reader in RAG. Our method combines connections between documents with semantic information (via Abstract Meaning Representation graphs) to provide a context-informed ranker for RAG. G-RAG outperforms state-of-the-art approaches while having a smaller computational footprint. Additionally, we assess the performance of PaLM 2 as a reranker and find that it significantly underperforms G-RAG. This result emphasizes the importance of reranking for RAG even when using Large Language Models.
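The reranking idea described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes an edge connects two documents whenever their concept sets overlap, uses a single untrained mean-aggregation step in place of a learned GNN layer, and scores documents by a dot product with the query embedding. All names (`build_document_graph`, `mean_aggregate`, `rerank`) and the toy data are hypothetical.

```python
from typing import Dict, List, Set

Vector = List[float]


def build_document_graph(concepts: List[Set[str]]) -> Dict[int, List[int]]:
    """Connect documents i and j when their concept sets overlap (assumption)."""
    graph: Dict[int, List[int]] = {i: [] for i in range(len(concepts))}
    for i in range(len(concepts)):
        for j in range(i + 1, len(concepts)):
            if concepts[i] & concepts[j]:
                graph[i].append(j)
                graph[j].append(i)
    return graph


def mean_aggregate(features: List[Vector],
                   graph: Dict[int, List[int]]) -> List[Vector]:
    """One message-passing round: average each node with its neighbours."""
    updated = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in graph[node]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


def rerank(query: Vector, features: List[Vector],
           graph: Dict[int, List[int]]) -> List[int]:
    """Return document indices sorted by graph-smoothed relevance score."""
    hidden = mean_aggregate(features, graph)
    scores = [sum(q * h for q, h in zip(query, vec)) for vec in hidden]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy concept sets standing in for concepts extracted from AMR graphs.
    concepts = [{"paris", "capital"}, {"capital", "france"},
                {"france", "europe"}, {"soccer"}]
    features = [[0.0, 0.2], [0.0, 1.0], [1.0, 0.0], [0.0, 0.3]]
    query = [0.0, 1.0]
    graph = build_document_graph(concepts)
    print(graph)
    print(rerank(query, features, graph))
```

In this toy run, document 0 scores poorly on its own but is connected to the strongly matching document 1, so aggregation lifts it in the ranking, while the isolated document 3 keeps its original score. A trained GNN would learn how much weight to give such neighbours instead of averaging them uniformly.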
