TRACE the Evidence: Constructing Knowledge-Grounded Reasoning Chains for Retrieval-Augmented Generation
Jinyuan Fang, Zaiqiao Meng, Craig Macdonald
TL;DR
TRACE tackles retrieval noise in retrieval-augmented generation for multi-hop QA by turning retrieved text into a knowledge graph and building knowledge-grounded reasoning chains. It introduces a KG Generator to extract triples per document and an Autoregressive Reasoning Chain Constructor to assemble coherent chains, which can serve directly as context (TRACE-Triple) or guide document selection (TRACE-Doc). Across HotPotQA, 2WikiMultiHopQA, and MuSiQue in zero-shot settings, TRACE yields up to 14.03% EM improvement over using all documents, with chains often providing sufficient context and reducing input length. The work demonstrates that structured, KG-grounded reasoning can enhance the robustness and efficiency of RAG systems, while acknowledging limitations in KG coverage and proposing avenues for quantitative evaluation of KG and chain quality.
Abstract
Retrieval-augmented generation (RAG) offers an effective approach to question answering (QA) tasks. However, imperfect retrievers in RAG models often return irrelevant information, which can introduce noise and degrade performance, especially on multi-hop questions that require multiple steps of reasoning. To enhance the multi-hop reasoning ability of RAG models, we propose TRACE. TRACE constructs knowledge-grounded reasoning chains, each a series of logically connected knowledge triples, to identify and integrate supporting evidence from the retrieved documents for answering questions. Specifically, TRACE employs a KG Generator to create a knowledge graph (KG) from the retrieved documents, and then uses an Autoregressive Reasoning Chain Constructor to build reasoning chains. Experimental results on three multi-hop QA datasets show that TRACE achieves an average performance improvement of up to 14.03% compared to using all the retrieved documents. Moreover, the results indicate that using reasoning chains as context, rather than entire documents, is often sufficient to answer questions correctly.
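
As a rough illustration of the idea of building a reasoning chain one triple at a time, the following minimal sketch selects triples greedily from a small hand-written KG. It is not the paper's implementation: the names `Triple`, `tokens`, `score`, and `build_chain` are hypothetical, the word-overlap score is a simple stand-in for whatever model TRACE actually uses to rank candidate triples, the fixed `max_len` stopping rule is an assumption, and the example triples are written by hand rather than extracted by a KG Generator.

```python
from typing import List, Set, Tuple

# A knowledge triple: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def tokens(text: str) -> Set[str]:
    """Lowercase, whitespace-split bag of words (toy tokenizer)."""
    return set(text.lower().replace("?", " ").split())


def score(question: str, chain: List[Triple], candidate: Triple) -> int:
    """Toy relevance score (an assumption, not the paper's model):
    number of candidate words shared with the question or the chain so far."""
    context = tokens(question)
    for triple in chain:
        context |= tokens(" ".join(triple))
    return len(tokens(" ".join(candidate)) & context)


def build_chain(question: str, kg: List[Triple], max_len: int = 2) -> List[Triple]:
    """Greedy, step-by-step chain construction: each step picks the triple
    that best matches the question and the triples already selected."""
    chain: List[Triple] = []
    remaining = list(kg)
    for _ in range(max_len):
        if not remaining:
            break
        best = max(remaining, key=lambda t: score(question, chain, t))
        if score(question, chain, best) == 0:
            break  # nothing relevant is left
        chain.append(best)
        remaining.remove(best)
    return chain


# Hand-written triples standing in for the output of a KG Generator.
kg = [
    ("Titanic", "director", "James Cameron"),
    ("Christopher Nolan", "born in", "London"),
    ("Inception", "director", "Christopher Nolan"),
    ("Paris", "located in", "France"),
]
question = "Where was the director of Inception born?"
chain = build_chain(question, kg)
for head, relation, tail in chain:
    print(f"<{head}; {relation}; {tail}>")
```

Each selection is conditioned on the triples chosen before it, which is what lets the second hop (the birthplace of Christopher Nolan) be found even though the question never names him. The resulting chain could then be passed to a reader as context directly, or used to pick the documents the triples came from.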
