Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering
Yao Xu, Shizhu He, Jiabei Chen, Zihao Wang, Yangqiu Song, Hanghang Tong, Guang Liu, Kang Liu, Jun Zhao
TL;DR
The paper introduces Incomplete Knowledge Graph QA (IKGQA) to reflect real-world KG limitations and proposes Generate-on-Graph (GoG), a training-free framework that uses a Thinking-Searching-Generating loop to let LLMs act both as agents navigating KGs and as generators of missing triples. By constructing two IKGQA datasets from WebQSP and CWQ with controlled incompleteness, the authors show that GoG outperforms prior semantic-parsing and retrieval-augmented methods, especially when crucial triples are omitted. GoG’s ability to generate and verify new factual triples while leveraging KG context enables stronger integration of internal LLM knowledge with external KG evidence, and its robustness is shown across multiple LLM backbones and incompleteness levels. The work highlights practical implications for more reliable LLM-KG QA in open-world settings and outlines avenues to mitigate hallucination and further improve performance.
Abstract
To address the issues of insufficient knowledge and hallucination in Large Language Models (LLMs), numerous studies have explored integrating LLMs with Knowledge Graphs (KGs). However, these methods are typically evaluated on conventional Knowledge Graph Question Answering (KGQA) with complete KGs, where all factual triples required for each question are entirely covered by the given KG. In such cases, LLMs primarily act as agents that find answer entities within the KG, rather than effectively integrating their internal knowledge with external knowledge sources such as KGs. In practice, KGs are often too incomplete to cover all the knowledge required to answer questions. To simulate these real-world scenarios and evaluate the ability of LLMs to integrate internal and external knowledge, we propose leveraging LLMs for QA under Incomplete Knowledge Graphs (IKGQA), where the provided KG lacks some of the factual triples for each question, and construct corresponding datasets. To handle IKGQA, we propose a training-free method called Generate-on-Graph (GoG), which can generate new factual triples while exploring KGs. Specifically, GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both Agent and KG in IKGQA. Experimental results on two datasets demonstrate that GoG outperforms all previous methods.
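The search-then-generate idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Thinking step is reduced to a fixed, hand-written relation plan, the KG is a plain dictionary, and the hypothetical `stub_generate` function stands in for an LLM call that would propose a missing triple. All names (`search_kg`, `answer_hops`, `stub_generate`, the relation labels) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Assumed toy KG representation: (head, relation) -> tail.
KG = Dict[Tuple[str, str], str]
Step = Tuple[str, str, str, str]  # (head, relation, tail, source)


def search_kg(kg: KG, head: str, relation: str) -> Optional[str]:
    """Searching step: return the tail for (head, relation), or None if missing."""
    return kg.get((head, relation))


def answer_hops(
    kg: KG,
    start: str,
    relations: List[str],
    generate: Callable[[str, str], str],
) -> Tuple[str, List[Step]]:
    """Follow a relation plan hop by hop.

    When the KG lacks the needed triple, fall back to `generate`,
    which plays the role of the LLM acting as a KG.
    """
    entity = start
    trace: List[Step] = []
    for relation in relations:
        tail = search_kg(kg, entity, relation)
        source = "kg"
        if tail is None:
            # Generating step: ask the (stubbed) LLM for the missing tail.
            tail = generate(entity, relation)
            source = "generated"
        trace.append((entity, relation, tail, source))
        entity = tail
    return entity, trace


def stub_generate(head: str, relation: str) -> str:
    """Hypothetical stand-in for an LLM call; a real system would prompt a model."""
    internal_knowledge = {("France", "official_language"): "French"}
    return internal_knowledge[(head, relation)]


# Incomplete KG: the triple (France, official_language, French) is missing.
incomplete_kg: KG = {("Paris", "capital_of"): "France"}

answer, trace = answer_hops(
    incomplete_kg, "Paris", ["capital_of", "official_language"], stub_generate
)
```

Here the first hop is answered from the KG and the second from the generator, and the `source` field in each trace entry records which one supplied the triple. Keeping that provenance is one simple way to separate KG evidence from generated facts for later verification.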
