Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains

Kun Li, Tianhua Zhang, Xixin Wu, Hongyin Luo, James Glass, Helen Meng

TL;DR

This paper proposes graph-aware constrained decoding, in which a constraint derived from the topology of the KG regulates the LLM's decoding process, and builds on it to present DoG (Decoding on Graphs), a framework that facilitates a deep synergy between LLMs and KGs.

Abstract

Knowledge Graphs (KGs) can serve as reliable knowledge sources for question answering (QA) due to their structured representation of knowledge. Existing research on using KGs with large language models (LLMs) predominantly relies on subgraph retrievers or iterative prompting, overlooking the potential synergy between LLMs' step-wise reasoning capabilities and KGs' structural nature. In this paper, we present DoG (Decoding on Graphs), a novel framework that facilitates a deep synergy between LLMs and KGs. We first define the concept of a well-formed chain, which consists of a sequence of interrelated fact triplets on the KG, starting from question entities and leading to answers. We argue that this concept can serve as a principle for faithful and sound reasoning in KGQA. To enable LLMs to generate well-formed chains, we propose graph-aware constrained decoding, in which a constraint derived from the topology of the KG regulates the decoding process of the LLM. This constrained decoding method ensures the generation of well-formed chains while making full use of the step-wise reasoning capabilities of LLMs. Building on this, DoG, a training-free approach, provides faithful and sound reasoning trajectories grounded in the KG. Experiments across various KGQA tasks with different background KGs demonstrate that DoG achieves superior and robust performance. DoG also shows general applicability across various open-source LLMs.
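The notion of a well-formed chain can be made concrete with a small check. The sketch below is only an illustration, not the authors' implementation. It assumes that "interrelated" means each triplet shares an entity with the question entities or with an earlier triplet in the chain, and it does not test whether the chain actually reaches a correct answer. The function name `is_well_formed` and the toy KG are introduced here for illustration; the second triplet is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Triplet = Tuple[str, str, str]  # (head, relation, tail)


def is_well_formed(
    chain: List[Triplet], kg: Set[Triplet], question_entities: Iterable[str]
) -> bool:
    """Check a chain under the assumed reading of 'well-formed'."""
    if not chain:
        return False
    visited = set(question_entities)
    for head, relation, tail in chain:
        # Faithfulness: every step must be a fact that exists in the KG.
        if (head, relation, tail) not in kg:
            return False
        # Connectivity (assumption): every step must touch a visited entity.
        if head not in visited and tail not in visited:
            return False
        visited.update((head, tail))
    return True


# Toy KG; T2 is a hypothetical fact used only for this example.
T1 = ("Blue Hawaii", "film.film.featured_film_locations", "Hawaii")
T2 = ("Hawaii", "location.location.containedby", "United States")
KG = {T1, T2}

print(is_well_formed([T1, T2], KG, ["Blue Hawaii"]))  # connected, all facts in KG
print(is_well_formed([T2], KG, ["Blue Hawaii"]))      # does not start at the question entity
```

A chain that contains a triplet absent from the KG, or that jumps to an entity never reached before, is rejected; these are the failure modes that constrained decoding is meant to rule out during generation.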

Paper Structure

This paper contains 20 sections, 3 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: An example workflow of DoG with a beam size of 1. The input consists of the instruction, three in-context learning examples, and the current question; the full prompt is detailed in Tab. \ref{tab:appendix-prompt}. The 2-hop input graph is illustrated in (b) and (e) and includes the entities and relations drawn with both solid and dotted lines. Starting from the query entity (white node), the query-centric subgraph $\mathcal{G}_q$ (grey area) is initialized by adding all triplets associated with the query entity, represented with solid lines in (b). The corresponding trie for constrained decoding is shown in (c); it maintains a set of valid tokens $w\in W_{val}$ for each position within Step-1. DoG chooses (Blue Hawaii -> film.film.featured_film_locations -> Hawaii) as the Step-1 triplet, the branch highlighted in bold in (c). After unconstrained generation, the Step-1 result is shown in (d), with the well-formed chain in red and the standard decoding output in blue. The process advances to Step-2 in (e), where all triplets outside $\mathcal{G}_q$ that involve the two visited entities (in boldface) are added. The final answer is provided in (f).
  • Figure 2: Performance of Direct Answering, CoT, and DoG (beam size = 1) on 2Wikimultihop for 2-hop and >2-hop instances.
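
The workflow in the Figure 1 caption (initialize $\mathcal{G}_q$, build a trie, decode a triplet under the constraint, then expand $\mathcal{G}_q$) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: tokens are whitespace-separated words rather than LLM subword tokens, a hypothetical `score` callback stands in for the LLM's next-token scores, decoding is greedy (beam size 1), and all helper names are invented here. Every triplet in the toy graph other than the Blue Hawaii location fact is hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Triplet = Tuple[str, str, str]
END = "<end>"  # marks the end of a linearized triplet


def linearize(t: Triplet) -> List[str]:
    # Toy tokenization: "head -> relation -> tail" split on whitespace.
    return f"{t[0]} -> {t[1]} -> {t[2]}".split() + [END]


def build_trie(triplets: Iterable[Triplet]) -> Dict:
    # Nested dicts: each key is a token, each value the sub-trie after it.
    root: Dict = {}
    for t in triplets:
        node = root
        for tok in linearize(t):
            node = node.setdefault(tok, {})
    return root


def constrained_decode(trie: Dict, score: Callable[[List[str], str], float]) -> List[str]:
    # At each position only the children of the current trie node (W_val)
    # are allowed; the scorer chooses among them.
    node, out = trie, []
    while node:
        valid = sorted(node)
        tok = max(valid, key=lambda w: score(out, w))
        out.append(tok)
        node = node[tok]
    return out[:-1]  # drop the END marker


def expand_subgraph(graph: Set[Triplet], g_q: Set[Triplet], visited: Set[str]) -> Set[Triplet]:
    # Add every triplet that involves an entity visited so far.
    return g_q | {t for t in graph if t[0] in visited or t[2] in visited}


GRAPH = {
    ("Blue Hawaii", "film.film.featured_film_locations", "Hawaii"),
    ("Blue Hawaii", "film.film.starring", "Elvis Presley"),            # hypothetical
    ("Hawaii", "location.location.containedby", "United States"),      # hypothetical
}

# Step-1: G_q holds all triplets associated with the query entity.
g_q = expand_subgraph(GRAPH, set(), {"Blue Hawaii"})
prefer_location = lambda prefix, w: 1.0 if "featured_film_locations" in w else 0.0
step1 = constrained_decode(build_trie(g_q), prefer_location)
print(" ".join(step1))

# Step-2: grow G_q with triplets touching the two visited entities.
g_q2 = expand_subgraph(GRAPH, g_q, {"Blue Hawaii", "Hawaii"})
```

Because the scorer only ever ranks tokens that the trie permits, the decoded sequence always spells out a triplet in $\mathcal{G}_q$, whatever the scorer prefers.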