Chatty-KG: A Multi-Agent AI System for On-Demand Conversational Question Answering over Knowledge Graphs

Reham Omar, Abdelghny Orogat, Ibrahim Abdelaziz, Omij Mangukiya, Panos Kalnis, Essam Mansour

TL;DR

Chatty-KG tackles the challenge of conversational KGQA by unifying retrieval-augmented techniques with structured graph grounding in a modular, training-free multi-agent system. It decomposes the problem into dialogue understanding, entity linking, and SPARQL query generation, with specialized LLM agents coordinating via a LangGraph-based controller and a shared state. Across five real KGs and both single-turn and multi-turn settings, Chatty-KG consistently outperforms state-of-the-art KGQA and conversational baselines, while maintaining low latency and broad LLM compatibility. The approach preserves KG structure, supports evolving graphs without preprocessing, and enables multilingual interaction through a translation extension, highlighting practical impact for enterprise knowledge services.

Abstract

Conversational Question Answering over Knowledge Graphs (KGs) combines the factual grounding of KG-based QA with the interactive nature of dialogue systems. KGs are widely used in enterprise and domain applications to provide structured, evolving, and reliable knowledge. Large language models (LLMs) enable natural and context-aware conversations, but lack direct access to private and dynamic KGs. Retrieval-augmented generation (RAG) systems can retrieve graph content but often serialize structure, struggle with multi-turn context, and require heavy indexing. Traditional KGQA systems preserve structure but typically support only single-turn QA, incur high latency, and struggle with coreference and context tracking. To address these limitations, we propose Chatty-KG, a modular multi-agent system for conversational QA over KGs. Chatty-KG combines RAG-style retrieval with structured execution by generating SPARQL queries through task-specialized LLM agents. These agents collaborate for contextual interpretation, dialogue tracking, entity and relation linking, and efficient query planning, enabling accurate and low-latency translation of natural questions into executable queries. Experiments on large and diverse KGs show that Chatty-KG significantly outperforms state-of-the-art baselines in both single-turn and multi-turn settings, achieving higher F1 and P@1 scores. Its modular design preserves dialogue coherence and supports evolving KGs without fine-tuning or pre-processing. Evaluations with commercial (e.g., GPT-4o, Gemini-2.0) and open-weight (e.g., Phi-4, Gemma 3) LLMs confirm broad compatibility and stable performance. Overall, Chatty-KG unifies conversational flexibility with structured KG grounding, offering a scalable and extensible approach for reliable multi-turn KGQA.
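The agent flow described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: it assumes three stub agents that read and write one shared state object, with simple string rules standing in for the LLM calls and a plain loop standing in for the LangGraph-based controller. All names here (`SharedState`, `TOY_INDEX`, `run_turn`, the `example.org` IRI) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical label-to-IRI lookup standing in for entity linking against a real KG.
TOY_INDEX = {"Albert Einstein": "http://example.org/entity/Albert_Einstein"}


@dataclass
class SharedState:
    """State shared by all agents during one dialogue turn (illustrative fields)."""
    question: str
    last_entity: str = ""            # entity carried over from earlier turns
    standalone_question: str = ""
    entity_links: Dict[str, str] = field(default_factory=dict)
    sparql: str = ""


def contextual_understanding(state: SharedState) -> SharedState:
    """Stub for an LLM agent: rewrite a follow-up into a standalone question."""
    question = state.question
    if state.last_entity:
        # Toy coreference resolution: swap a pronoun for the last entity mentioned.
        words = [state.last_entity if w.lower() in {"he", "she", "it"} else w
                 for w in question.rstrip("?").split()]
        question = " ".join(words) + "?"
    state.standalone_question = question
    return state


def entity_linking(state: SharedState) -> SharedState:
    """Stub for an LLM agent: map entity mentions to KG identifiers."""
    state.entity_links = {label: iri for label, iri in TOY_INDEX.items()
                          if label in state.standalone_question}
    return state


def query_generation(state: SharedState) -> SharedState:
    """Stub for an LLM agent: emit a SPARQL query for the first linked entity."""
    iri = next(iter(state.entity_links.values()), None)
    state.sparql = f"SELECT ?answer WHERE {{ <{iri}> ?p ?answer . }}" if iri else ""
    return state


# A plain list plays the role of the controller that orders the agents.
PIPELINE = [contextual_understanding, entity_linking, query_generation]


def run_turn(state: SharedState) -> SharedState:
    for agent in PIPELINE:
        state = agent(state)
    return state


if __name__ == "__main__":
    turn = run_turn(SharedState(question="Where was he born?",
                                last_entity="Albert Einstein"))
    print(turn.standalone_question)
    print(turn.sparql)
```

Because each agent only touches the shared state, one agent can be swapped (for example, a different LLM behind the query generation step) without changing the others, which is the kind of modularity the abstract emphasizes.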

Paper Structure

This paper contains 33 sections, 5 equations, 5 figures, 8 tables, 3 algorithms.

Figures (5)

  • Figure 1: Limitations of current KGQA and chatbot-based systems for real-time conversational access to arbitrary KGs. KGQA systems face issues with contextual understanding, query fragmentation, and latency. Chatbots offer better conversational QA over KGs but require expensive training and preprocessing, limiting adaptability and scalability.
  • Figure 2: Chatty-KG's hierarchical multi-agent architecture. A top-level Chat Agent coordinates two supervised modules: Contextual Understanding and Query Generation & Answer Retrieval, each composed of specialized LLM-powered agents. This design enables modular, low-latency KGQA without training or pre-processing, and improves adaptability and real-time performance.
  • Figure 3: Number of failed questions (i.e., Recall = 0) in each benchmark. Each bar is divided to show failures caused by Question Understanding (QU) and other factors. Shading variations indicate the source of failure. Chatty-KG consistently has the fewest failures across all benchmarks.
  • Figure 4: Average response time per question (in seconds) for KGQAn (K) and Chatty-KG (C). Each bar is segmented bottom-up into three stages: Question Understanding (QU), Linking, and Execution & Filtration (E&F). Shading variations within each bar indicate the contribution of each stage.
  • Figure 5: Average number of queries per question for KGQAn and Chatty-KG. Lower values indicate better query planning; Chatty-KG also achieves higher F1 (see Table \ref{tab:differentllms}).
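The metrics named in the abstract and captions (F1, P@1, and the Recall = 0 failure criterion of Figure 3) can be illustrated with a short sketch. It assumes the standard set-based definitions of precision, recall, and F1 per question and treats P@1 as whether the top-ranked answer is correct; the paper's exact averaging scheme is not given here, so the function names and the sample data are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def precision_recall_f1(predicted: Iterable[str],
                        gold: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for one question."""
    predicted_set, gold_set = set(predicted), set(gold)
    correct = len(predicted_set & gold_set)
    precision = correct / len(predicted_set) if predicted_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def precision_at_1(ranked: Sequence[str], gold: Iterable[str]) -> float:
    """1.0 if the top-ranked answer is a gold answer, else 0.0."""
    return 1.0 if ranked and ranked[0] in set(gold) else 0.0


def count_failed(questions: List[Tuple[List[str], List[str]]]) -> int:
    """Count questions with Recall = 0, the failure criterion used in Figure 3."""
    return sum(1 for predicted, gold in questions
               if precision_recall_f1(predicted, gold)[1] == 0.0)


if __name__ == "__main__":
    # Hypothetical (predicted, gold) answer lists for three questions.
    sample = [(["a", "b"], ["a", "c", "d"]), (["x"], ["y"]), ([], ["z"])]
    for predicted, gold in sample:
        print(precision_recall_f1(predicted, gold), precision_at_1(predicted, gold))
    print("failed questions:", count_failed(sample))
```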

Theorems & Definitions (2)

  • Definition 1: Dialogue
  • Definition 2: Question Intermediate Representation (QIR)