CLEAR-KGQA: Clarification-Enhanced Ambiguity Resolution for Knowledge Graph Question Answering
Liqiang Wen, Guanming Xiong, Tong Mo, Bing Li, Weiping Li, Wen Zhao
TL;DR
CLEAR-KGQA addresses real-world ambiguity in knowledge graph question answering by introducing a Bayesian clarification framework that quantifies entity and intent ambiguity and triggers interactive user clarifications. The approach uses a two-agent, tool-based architecture with four tools (SearchNodes, SearchGraphPattern, ExecuteSPARQL, and AskForClarification), guided by a Clarification Plugin and entropy-based ambiguity scores. Key contributions include a formal problem formulation, a practical interactive process between a Question Answering Agent and a Dummy User, and a dataset of disambiguated queries constructed from interaction histories. Experimental results on WebQSP and CWQ demonstrate significant performance gains over strong baselines. The work enables more robust KGQA systems that operate under realistic, ambiguous query conditions and provides resources to support future exploration of clarification strategies.
Abstract
This study addresses the challenge of ambiguity in knowledge graph question answering (KGQA). While recent KGQA systems have made significant progress, particularly with the integration of large language models (LLMs), they typically assume that user queries are unambiguous, an assumption that rarely holds in real-world applications. To address this limitation, we propose a novel framework that dynamically handles both entity ambiguity (e.g., distinguishing between entities with similar names) and intent ambiguity (e.g., clarifying different interpretations of user queries) through interactive clarification. Our approach employs a Bayesian inference mechanism to quantify query ambiguity and guide LLMs in determining when and how to request clarification from users within a multi-turn dialogue framework. We further develop a two-agent interaction framework in which an LLM-based user simulator enables iterative refinement of logical forms through simulated user feedback. Experimental results on the WebQSP and CWQ datasets demonstrate that our method significantly improves performance by effectively resolving semantic ambiguities. Additionally, we contribute a refined dataset of disambiguated queries, derived from interaction histories, to facilitate future research in this direction.
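To illustrate how a Bayesian, entropy-based ambiguity score could decide when to ask for clarification, here is a minimal Python sketch. It is not the authors' formulation, which this section does not specify. The function names, the normalization by the maximum entropy, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def posterior(priors, likelihoods):
    """Bayes' rule over candidate interpretations (entities or intents).

    Both arguments map a candidate label to a non-negative number.
    Hypothetical helper: the paper's actual inference step may differ.
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("all candidates have zero probability")
    return {k: v / total for k, v in joint.items()}


def normalized_entropy(dist):
    """Shannon entropy of `dist`, scaled to [0, 1] by its maximum log(n)."""
    n = len(dist)
    if n <= 1:
        return 0.0  # a single candidate is never ambiguous
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(n)


def should_ask_for_clarification(dist, threshold=0.5):
    """Ask the user when the ambiguity score exceeds the threshold.

    The threshold value is an illustrative assumption, not a reported setting.
    """
    return normalized_entropy(dist) > threshold


# Two entities share the surface form "Paris"; the evidence favors one only weakly.
candidates = posterior(
    priors={"Paris (France)": 0.5, "Paris (Texas)": 0.5},
    likelihoods={"Paris (France)": 0.6, "Paris (Texas)": 0.4},
)
ask = should_ask_for_clarification(candidates)
```

In this sketch, a nearly uniform posterior yields a score close to 1 and triggers a clarification request, while a sharply peaked posterior yields a score close to 0 and lets the agent proceed without asking.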
