KiRAG: Knowledge-Driven Iterative Retriever for Enhancing Retrieval-Augmented Generation
Jinyuan Fang, Zaiqiao Meng, Craig Macdonald
TL;DR
KiRAG addresses weaknesses in the retrieval stage of retrieval-augmented generation for multi-hop QA by grounding iterative retrieval in knowledge triples and integrating reasoning into the retrieval loop. It decomposes documents into <head; relation; tail> triples and builds a reasoning chain step by step: a Reasoning Chain Aligner selects triples that extend the chain, while a Reasoning Chain Constructor grounds each extension in factual knowledge. A document ranking step maps the retrieved triples back to their source documents, which a reader then uses to generate the answer. Empirically, KiRAG significantly outperforms state-of-the-art iRAG models across multiple multi-hop QA datasets, and because triples are extracted offline, the approach remains practically efficient. The approach offers a blueprint for knowledge-grounded iterative retrieval in RAG frameworks.
Abstract
Iterative retrieval-augmented generation (iRAG) models offer an effective approach for multi-hop question answering (QA). However, their retrieval process faces two key challenges: (1) it can be disrupted by irrelevant documents or factually inaccurate chains of thought; (2) their retrievers are not designed to adapt dynamically to the evolving information needs of multi-step reasoning, making it difficult to identify and retrieve the missing information required at each iterative step. We therefore propose KiRAG, which uses a knowledge-driven iterative retriever to enhance the retrieval process of iRAG. Specifically, KiRAG decomposes documents into knowledge triples and performs iterative retrieval over these triples to enable a factually reliable retrieval process. Moreover, KiRAG integrates reasoning into the retrieval process to dynamically identify and retrieve knowledge that bridges information gaps, thereby adapting to the evolving information needs. Empirical results show that KiRAG significantly outperforms existing iRAG models, with average improvements of 9.40% in R@3 and 5.14% in F1 on multi-hop QA.
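To make the retrieval loop described above concrete, the following is a minimal toy sketch, not the authors' implementation. The abstract does not specify how the aligner or constructor work internally, so token overlap stands in for the aligner's scoring and a "take the top candidate" rule stands in for the constructor. All names (Triple, align, construct, iterative_retrieve, rank_documents), the fixed step budget, and the example triples are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A <head; relation; tail> triple plus the document it was extracted from."""
    head: str
    relation: str
    tail: str
    doc_id: str

    def text(self) -> str:
        return f"{self.head} {self.relation} {self.tail}"


def tokens(s: str) -> set[str]:
    # Crude tokenizer (assumption): lowercase, drop "?", split on whitespace.
    return set(s.lower().replace("?", " ").split())


def align(question: str, chain: list[Triple], candidates: list[Triple],
          top_n: int = 3) -> list[Triple]:
    """Stand-in for the aligner: score unused triples by token overlap
    with the question and the chain built so far."""
    context = tokens(question)
    for t in chain:
        context |= tokens(t.text())
    scored = [(len(context & tokens(t.text())), t)
              for t in candidates if t not in chain]
    scored = [(s, t) for s, t in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:top_n]]


def construct(aligned: list[Triple]) -> Triple | None:
    """Stand-in for the constructor: extend the chain with the best
    candidate, or return None to stop."""
    return aligned[0] if aligned else None


def iterative_retrieve(question: str, triples: list[Triple],
                       max_steps: int = 2) -> list[Triple]:
    """Build the reasoning chain one triple per step."""
    chain: list[Triple] = []
    for _ in range(max_steps):
        nxt = construct(align(question, chain, triples))
        if nxt is None:
            break
        chain.append(nxt)
    return chain


def rank_documents(chain: list[Triple]) -> list[str]:
    """Map chain triples back to source documents, in chain order."""
    ranked: list[str] = []
    for t in chain:
        if t.doc_id not in ranked:
            ranked.append(t.doc_id)
    return ranked


# Hypothetical two-hop example.
triples = [
    Triple("Inception", "has director", "Christopher Nolan", "doc_inception"),
    Triple("Titanic", "has director", "James Cameron", "doc_titanic"),
    Triple("Christopher Nolan", "born in", "London", "doc_nolan"),
]
question = "Where was the director of Inception born?"
chain = iterative_retrieve(question, triples)
docs = rank_documents(chain)
```

In this toy run the second hop is only reachable because the first triple adds "Christopher Nolan" to the context, which illustrates how conditioning retrieval on the chain so far lets the retriever follow the evolving information need. The ranked documents would then be passed to a reader model, which is omitted here.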
