KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision

Rong Wu, Pinlong Cai, Jianbiao Mei, Licheng Wen, Tao Hu, Xuemeng Yang, Daocheng Fu, Botian Shi

TL;DR

KG-TRACES tackles explainability and trust in LLM-based multi-hop reasoning by introducing supervision over symbolic relation paths, full triple paths, and attribution-aware reasoning processes. It operates in KG-available and KG-unavailable conditions, grounding reasoning with retrieved or predicted paths and unifying a multi-task objective that aligns symbolic paths with natural-language explanations. Empirically, it delivers state-of-the-art results on WebQSP and CWQ, transfers to medical-domain QA, and reveals stable, goal-directed reasoning trajectories through visualization. This work advances robust, transparent reasoning in knowledge-intensive tasks and lays groundwork for more trustworthy AI systems that can justify their conclusions with explicit provenance.

Abstract

Large language models (LLMs) have made remarkable strides in various natural language processing tasks, but their performance on complex reasoning problems remains hindered by a lack of explainability and trustworthiness. This issue, often manifesting as hallucinations or unattributable reasoning processes, limits their applicability in complex reasoning scenarios. To address this, we propose Knowledge Graph-constrained Trajectory Reasoning Attribution and Chain Explanation Supervision (KG-TRACES), a novel framework that enhances the reasoning ability of LLMs through explicit supervision over reasoning paths and processes. KG-TRACES jointly supervises the model to: (1) predict symbolic relation paths, (2) predict full triple-level reasoning paths, and (3) generate attribution-aware reasoning processes grounded in the reasoning paths. At inference time, the model adapts to both KG-available and KG-unavailable scenarios, retrieving reasoning paths from a KG when possible and otherwise predicting plausible reasoning paths from its intrinsic knowledge alone. This design enables the model to reason in an explainable and source-attributable manner. Through extensive experiments on complex reasoning tasks, we demonstrate that KG-TRACES significantly outperforms existing state-of-the-art methods: it improves Hits@1 by 1.6% and F1 by 4.7% on WebQSP, and achieves improvements of 4.8% in Hits@1 and 2.1% in F1 on CWQ. Moreover, we show its transferability to specialized domains such as medicine. By visualizing the intermediate steps of reasoning processes, we further show that the explicit supervision introduced by KG-TRACES leads to more stable and goal-directed reasoning processes that align closely with correct answers. Code is available at https://github.com/Edaizi/KG-TRACES.
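The Hits@1 and F1 figures quoted in the abstract can be read against the following minimal sketch. It assumes the definitions commonly used for KGQA benchmarks (Hits@1 checks whether the top-ranked prediction is a gold answer; F1 is the harmonic mean of set-level precision and recall); the exact matching rules used in the paper's evaluation are not stated here, and the function names and the incorrect prediction in the example are hypothetical.

```python
from typing import List, Set


def hits_at_1(ranked_predictions: List[str], gold_answers: Set[str]) -> float:
    """Return 1.0 if the top-ranked prediction is a gold answer, else 0.0."""
    if not ranked_predictions:
        return 0.0
    return 1.0 if ranked_predictions[0] in gold_answers else 0.0


def answer_f1(predictions: Set[str], gold_answers: Set[str]) -> float:
    """Set-level F1 between predicted answers and gold answers (assumed definition)."""
    if not predictions or not gold_answers:
        return 0.0
    true_positives = len(predictions & gold_answers)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predictions)
    recall = true_positives / len(gold_answers)
    return 2 * precision * recall / (precision + recall)


# Gold answers taken from the Figure 4 example; the second prediction is a
# hypothetical wrong answer added only to illustrate the computation.
gold = {"2012 Stanley Cup Finals", "2014 Stanley Cup Finals"}
ranked = ["2012 Stanley Cup Finals", "1993 Stanley Cup Finals"]

print(hits_at_1(ranked, gold))       # top-1 prediction is a gold answer
print(answer_f1(set(ranked), gold))  # precision 1/2, recall 1/2
```

Per-question scores like these are typically averaged over the test set, so a gain of a few points reflects many questions moving from wrong or partial to fully correct.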

Paper Structure

This paper contains 51 sections, 8 equations, 13 figures, 15 tables.

Figures (13)

  • Figure 1: Comparison of representative reasoning methods in LLM-based frameworks: (a) Vanilla LLMs, where the model generates responses directly from the question; (b) LLMs + KG-RAG, which uses a KG to retrieve relevant subgraph paths to aid the reasoning; (c) LLMs as KG-Retriever, where the LLM acts as an active retriever, querying the KG for relevant information and determining whether sufficient knowledge has been retrieved; (d) KG-TRACES (Ours), which can generate faithful and attributable responses based on symbolic subgraph reasoning paths under different KG access conditions.
  • Figure 2: Overview of the KG-TRACES framework. The framework consists of three key components: (a) Data Construction: KG-TRACES integrates original datasets with QA data and symbolic reasoning paths from the KG to build the KG-based database and the reasoning process database for multi-task learning; (b) Multi-task Learning: the model is trained on two types of tasks: path prediction (including relation path prediction, used for KG retrieval, and full reasoning path prediction) and faithful reasoning process generation (the supervised model produces attributable, interpretable reasoning processes based on symbolic reasoning paths); (c) Multi-conditions Inference: the inference process varies depending on KG availability. With KG access, KG-TRACES predicts relation paths and retrieves whole reasoning paths from the KG, then generates a faithful reasoning process and answer. Without KG access, KG-TRACES predicts reasoning paths from its intrinsic knowledge to support a faithful reasoning process and answer.
  • Figure 3: Example of attributable and explainable reasoning of KG-TRACES.
  • Figure 4: Visualization of model reasoning thoughts for a representative case in WebQSP. Darker color denotes a higher density of reasoning-process thoughts in that region. As reasoning progresses, the thought distributions become sharper and align more closely with the answers. Example: (Question: what year did the LA kings win the cup? Answers: 2012 Stanley Cup Finals, 2014 Stanley Cup Finals.)
  • Figure 5: Visualization of the distribution of step-wise reasoning process metrics of KG-TRACES on WebQSP.
  • ...and 8 more figures
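
The two inference conditions described in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the toy KG, the relation names, and the functions `retrieve_paths` and `reasoning_paths` are hypothetical, and the model's predictions are passed in as plain data rather than generated by an LLM.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
# Toy KG: head entity -> list of (relation, tail entity). Contents are hypothetical.
ToyKG = Dict[str, List[Tuple[str, str]]]


def retrieve_paths(kg: ToyKG, start: str, relation_path: List[str]) -> List[List[Triple]]:
    """Follow relation_path hop by hop from start and return every matching triple path."""
    frontier: List[Tuple[str, List[Triple]]] = [(start, [])]
    for relation in relation_path:
        next_frontier: List[Tuple[str, List[Triple]]] = []
        for entity, path in frontier:
            for rel, tail in kg.get(entity, []):
                if rel == relation:
                    next_frontier.append((tail, path + [(entity, rel, tail)]))
        frontier = next_frontier
    return [path for _, path in frontier]


def reasoning_paths(
    start: str,
    relation_path: List[str],
    kg: Optional[ToyKG],
    predicted_paths: List[List[Triple]],
) -> List[List[Triple]]:
    """With a KG, ground the predicted relation path by retrieval;
    without one, fall back to the model-predicted triple paths."""
    if kg is not None:
        return retrieve_paths(kg, start, relation_path)
    return predicted_paths


toy_kg: ToyKG = {
    "Los Angeles Kings": [
        ("team.championships", "2012 Stanley Cup Finals"),
        ("team.championships", "2014 Stanley Cup Finals"),
        ("team.arena", "Staples Center"),
    ]
}

# KG-available: the relation path is grounded into full triple paths.
grounded = reasoning_paths("Los Angeles Kings", ["team.championships"], toy_kg, [])
# KG-unavailable: the paths predicted by the model are used directly.
predicted = [[("Los Angeles Kings", "team.championships", "2012 Stanley Cup Finals")]]
ungrounded = reasoning_paths("Los Angeles Kings", ["team.championships"], None, predicted)

print(grounded)
print(ungrounded)
```

In either condition, the resulting triple paths are what the generated reasoning process would cite as its sources, which is what makes the final answer attributable.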