ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots

Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour

TL;DR

To inform the design of knowledge-graph chatbots, the paper compares conversational language models (ChatGPT, Galactica) with traditional KG question-answering systems (KGQAn, EDGQA) on four real KGs and 450 questions. It proposes a unified framework with seven evaluation criteria (correctness, robustness, determinism, explainability, question understanding, recent information, and cross-domain generality) and a fairness-focused manual assessment. Results show that the language models excel on general-domain KGs but lag on academic graphs, while KGQAn maintains strong performance across domains and can answer from up-to-date KG data; hybrid designs emerge as a promising path. The work also identifies open challenges (dialogue management, explainability, and integration of up-to-date information) and provides benchmarks and guidance for developing next-generation KG chatbots.

Abstract

Conversational AI and Question-Answering systems (QASs) for knowledge graphs (KGs) are both emerging research areas: they empower users with natural language interfaces for extracting information easily and effectively. Conversational AI simulates conversations with humans; however, it is limited by the data captured in the training datasets. In contrast, QASs retrieve the most recent information from a KG by understanding and translating the natural language question into a formal query supported by the database engine. In this paper, we present a comprehensive study of the characteristics of the existing alternatives towards combining both worlds into novel KG chatbots. Our framework compares two representative conversational models, ChatGPT and Galactica, against KGQAn, the current state-of-the-art QAS. We conduct a thorough evaluation using four real KGs across various application domains to identify the current limitations of each category of systems. Based on our findings, we propose open research opportunities to empower QASs with chatbot capabilities for KGs. All benchmarks and all raw results are available for further analysis.

Paper Structure (17 sections, 4 equations, 5 tables)