Debate over Mixed-knowledge: A Robust Multi-Agent Reasoning Framework for Incomplete Knowledge Graph Question Answering

Jilong Liu, Pengyang Shao, Wei Qin, Fei Liu, Yonghui Yang, Richang Hong

TL;DR

This work addresses incomplete knowledge graphs in KGQA with DoM, a Multi-Agent Debate framework in which a dedicated KG agent and a RAG agent reason over structured KG evidence and external text, coordinated by a judge agent. DoM decomposes each question into sub-questions, iteratively retrieves and integrates evidence for them, and then generates a globally consistent final answer. A new dataset, IKGWQ, is built from real-world knowledge updates rather than random triple removal, giving a more realistic benchmark for incompleteness. Experiments show that DoM outperforms state-of-the-art baselines and remains robust across diverse backbones and levels of KG incompleteness, highlighting the value of adaptive, multi-source fusion for IKGQA.

Abstract

Knowledge Graph Question Answering (KGQA) aims to improve factual accuracy by leveraging structured knowledge. However, real-world Knowledge Graphs (KGs) are often incomplete, leading to the problem of Incomplete KGQA (IKGQA). A common solution is to incorporate external data to fill knowledge gaps, but existing methods lack the capacity to adaptively and contextually fuse multiple sources, failing to fully exploit their complementary strengths. To this end, we propose Debate over Mixed-knowledge (DoM), a novel framework that enables dynamic integration of structured and unstructured knowledge for IKGQA. Built upon the Multi-Agent Debate paradigm, DoM assigns specialized agents to perform inference over knowledge graphs and external texts separately, and coordinates their outputs through iterative interaction. It decomposes the input question into sub-questions, retrieves evidence via dual agents (KG and Retrieval-Augmented Generation, RAG), and employs a judge agent to evaluate and aggregate intermediate answers. This collaboration exploits knowledge complementarity and enhances robustness to KG incompleteness. In addition, existing IKGQA datasets simulate incompleteness by randomly removing triples, failing to capture the irregular and unpredictable nature of real-world knowledge incompleteness. To address this, we introduce a new dataset, Incomplete Knowledge Graph WebQuestions (IKGWQ), constructed by leveraging real-world knowledge updates. These updates reflect knowledge beyond the static scope of KGs, yielding a more realistic and challenging benchmark. Through extensive experiments, we show that DoM consistently outperforms state-of-the-art baselines.
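
The control flow described in the abstract (decompose, query two agents per sub-question, let a judge aggregate) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `make_lookup_agent`, `judge`, and `run_debate`, the `{prev}` placeholder used to chain sub-questions, the dictionary-backed agents, and the fictional entities. The rule-based judge is only a stand-in and is not the paper's judging procedure.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An agent maps a sub-question to an answer, or None if it finds no evidence.
Agent = Callable[[str], Optional[str]]


def make_lookup_agent(store: Dict[str, str]) -> Agent:
    """Hypothetical agent backed by a dictionary instead of a KG or a retriever."""
    return lambda sub_question: store.get(sub_question)


def judge(kg_answer: Optional[str], rag_answer: Optional[str]) -> Tuple[Optional[str], str]:
    """Toy judge: report agreement, otherwise prefer the KG, then fall back to text."""
    if kg_answer is not None and kg_answer == rag_answer:
        return kg_answer, "both"
    if kg_answer is not None:
        return kg_answer, "kg"
    if rag_answer is not None:
        return rag_answer, "rag"
    return None, "none"


def run_debate(
    sub_questions: List[str], kg_agent: Agent, rag_agent: Agent
) -> Tuple[Optional[str], List[Tuple[str, Optional[str], str]]]:
    """Answer sub-questions in order, feeding each accepted answer into the next one."""
    trace: List[Tuple[str, Optional[str], str]] = []
    previous: Optional[str] = None
    for template in sub_questions:
        # "{prev}" is replaced by the answer accepted for the preceding sub-question.
        sub_question = template.format(prev=previous)
        answer, source = judge(kg_agent(sub_question), rag_agent(sub_question))
        trace.append((sub_question, answer, source))
        if answer is None:
            # Neither source has evidence, so the chain cannot continue.
            return None, trace
        previous = answer
    return previous, trace


# Fictional data: the KG lacks the second fact, the text corpus supplies it.
kg = make_lookup_agent({"country of Alice": "Freedonia"})
text = make_lookup_agent({"capital of Freedonia": "Fredville"})

final_answer, steps = run_debate(["country of Alice", "capital of {prev}"], kg, text)
print(final_answer)
for step in steps:
    print(step)
```

In this toy run the first hop is answered from the KG and the second from text, which is the kind of complementarity the abstract refers to. A real system would replace the lookup agents with KG traversal and retrieval-augmented generation, and the judge with a model that weighs the two answers.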

Paper Structure

This paper contains 37 sections, 8 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Illustration of two strategies for constructing incomplete KG scenarios. (a) Simulated incompleteness by removing triples from the KG; red dashed arrows denote the deleted facts. (b) Realistic incompleteness by incorporating external knowledge; gray dashed arrows indicate newly introduced information. CVT (Compound Value Type) nodes are used to represent multi-entity relations in Freebase.
  • Figure 2: Statistics of knowledge incompleteness and semantic hop distribution in the IKGWQ dataset. The left subfigure shows the number of samples with missing entities and missing relations. The right subfigure presents the distribution of semantic inference hops.
  • Figure 3: Overview of the DoM framework. DoM first decomposes the input question into sub-questions. For each sub-question, the KG Agent and RAG Agent independently infer over structured and unstructured knowledge, and the Judge Agent integrates their outputs through iterative debate. This interaction continues until sufficient evidence is gathered for final answer generation.
  • Figure 4: Performance on CWQ and WebQSP under varying KG incompleteness. CKG denotes a complete KG.
  • Figure 5: Performance of DoM with different retrieval agents on IKGWQ and CWQ. KG-retriever and RAG-retriever involve only the respective retrieval agent, with the Judge Agent reduced to a simple planner. Mixed-retriever activates all agents for full collaboration.
  • ...and 2 more figures
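
Simulated incompleteness, as in Figure 1(a) and the varying-incompleteness setting of Figure 4, amounts to deleting a random fraction of KG triples. The following Python sketch shows one way to do that; the function name `drop_triples`, its `drop_ratio` and `seed` parameters, and the toy triples are assumptions for illustration and are not taken from the paper.

```python
import random
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def drop_triples(triples: List[Triple], drop_ratio: float, seed: int = 0) -> List[Triple]:
    """Return a copy of the KG with roughly drop_ratio of its triples removed at random."""
    if not 0.0 <= drop_ratio <= 1.0:
        raise ValueError("drop_ratio must lie in [0, 1]")
    rng = random.Random(seed)  # local generator keeps the result reproducible
    n_keep = len(triples) - round(len(triples) * drop_ratio)
    # Sample indices to keep, then sort them so the original order is preserved.
    kept = sorted(rng.sample(range(len(triples)), n_keep))
    return [triples[i] for i in kept]


# Toy KG with made-up entities and a single relation.
toy_kg = [(f"e{i}", "related_to", f"e{i + 1}") for i in range(10)]
incomplete_kg = drop_triples(toy_kg, drop_ratio=0.3, seed=42)
print(len(toy_kg), len(incomplete_kg))
```

Because removal is uniform, every fact is equally likely to disappear, which is the property the abstract contrasts with the irregular gaps that arise from real-world knowledge updates.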