KnowTrace: Bootstrapping Iterative Retrieval-Augmented Generation with Structured Knowledge Tracing
Rui Li, Quanyu Dai, Zeyu Zhang, Xu Chen, Zhenhua Dong, Ji-Rong Wen
TL;DR
The paper tackles context overload and non-contributive reasoning steps in iterative retrieval-augmented generation for multi-hop QA. It introduces KnowTrace, which uses structured knowledge tracing to progressively build a question-specific knowledge graph $\mathcal{G}_q$ through two prompt-guided, LLM-driven phases: knowledge exploration and knowledge completion. A reflective backtracing mechanism then identifies a supporting subgraph $\mathcal{S}_q$, filters out non-contributive generations, and distills the remaining ones into high-quality supervision data for self-taught finetuning, enabling bootstrapped improvements over iterative finetuning rounds. On HotpotQA, 2WikiMultihopQA, and MuSiQue, KnowTrace consistently surpasses baselines, with additional gains from the bootstrapped version, while maintaining competitive efficiency and remaining robust to different retrievers and prompting strategies.
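The two-phase loop summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_graph`, the callables `explore`, `retrieve`, and `complete`, the `max_iters` cap, and the toy lookup table are all hypothetical stand-ins for the prompted LLM calls and the retriever.

```python
from typing import Callable, List, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)
Query = Tuple[str, str]         # (entity, relation) to expand next


def build_graph(
    question: str,
    explore: Callable[[str, Set[Triplet]], List[Query]],
    retrieve: Callable[[str], List[str]],
    complete: Callable[[str, str, List[str]], List[Triplet]],
    max_iters: int = 4,  # assumed iteration cap, not taken from the paper
) -> Set[Triplet]:
    """Grow a question-specific graph by alternating exploration and completion."""
    graph: Set[Triplet] = set()
    for _ in range(max_iters):
        # Exploration: decide which (entity, relation) pairs are still missing.
        queries = explore(question, graph)
        if not queries:
            break  # the explorer judges the graph sufficient to answer
        for entity, relation in queries:
            passages = retrieve(f"{entity} {relation}")
            # Completion: turn retrieved text into triplets and add them.
            graph.update(complete(entity, relation, passages))
    return graph


# Toy stand-ins for the LLM and the retriever (purely illustrative).
FACTS = {
    ("Film X", "directed by"): "Person Y",
    ("Person Y", "born in"): "City Z",
}


def toy_explore(question: str, graph: Set[Triplet]) -> List[Query]:
    known = {(s, r) for s, r, _ in graph}
    for query in FACTS:  # dicts preserve insertion order
        if query not in known:
            return [query]
    return []


def toy_retrieve(query: str) -> List[str]:
    return [f"passage about {query}"]


def toy_complete(entity: str, relation: str, passages: List[str]) -> List[Triplet]:
    # A real system would extract triplets from the passages; here we look them up.
    if passages and (entity, relation) in FACTS:
        return [(entity, relation, FACTS[(entity, relation)])]
    return []


graph = build_graph(
    "Where was the director of Film X born?",
    toy_explore, toy_retrieve, toy_complete,
)
```

In this sketch the LLM context at each step is the compact triplet set rather than the accumulated passages, which is the property the TL;DR credits with reducing context overload.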
Abstract
Recent advances in retrieval-augmented generation (RAG) furnish large language models (LLMs) with iterative retrievals of relevant information to handle complex multi-hop questions. These methods typically alternate between LLM reasoning and retrieval to accumulate external information into the LLM's context. However, the ever-growing context inherently imposes an increasing burden on the LLM to perceive connections among critical information pieces, with futile reasoning steps further exacerbating this overload issue. In this paper, we present KnowTrace, an elegant RAG framework to (1) mitigate the context overload and (2) bootstrap higher-quality multi-step reasoning. Instead of simply piling up the retrieved contents, KnowTrace autonomously traces out desired knowledge triplets to organize a specific knowledge graph relevant to the input question. Such a structured workflow not only empowers the LLM with an intelligible context for inference, but also naturally inspires a reflective mechanism of knowledge backtracing to identify contributive LLM generations as process supervision data for self-bootstrapping. Extensive experiments show that KnowTrace consistently surpasses existing methods across three multi-hop question answering benchmarks, and the bootstrapped version further amplifies the gains.
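The knowledge backtracing idea can be sketched as a reverse traversal over the triplet graph. This is an illustrative assumption about one way to realize it, not the procedure specified in the paper: the function `backtrace`, the object-to-subject traversal rule, and the toy triplets are all hypothetical.

```python
from typing import Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)


def backtrace(graph: Iterable[Triplet], answer_entities: Iterable[str]) -> Set[Triplet]:
    """Collect the triplets that lie on a path leading to an answer entity.

    Assumption: a triplet is contributive if its object is an answer entity
    or the subject of another contributive triplet.
    """
    triplets = list(graph)
    support: Set[Triplet] = set()
    frontier = list(answer_entities)
    visited = set(frontier)
    while frontier:
        target = frontier.pop()
        for triplet in triplets:
            subj, _, obj = triplet
            if obj == target and triplet not in support:
                support.add(triplet)
                if subj not in visited:
                    visited.add(subj)
                    frontier.append(subj)  # keep tracing back from the subject
    return support


toy_graph = {
    ("Film X", "directed by", "Person Y"),
    ("Person Y", "born in", "City Z"),
    ("Film X", "released in", "1999"),  # never used to reach the answer
}
supporting = backtrace(toy_graph, ["City Z"])
```

Here the `released in` triplet is dropped because no path connects it to the answer. Under this reading, only the LLM generations that produced the surviving triplets would be kept as supervision data.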
