QirK: Question Answering via Intermediate Representation on Knowledge Graphs

Jan Luca Scheerer, Anton Lykov, Moe Kayali, Ilias Fountalis, Dan Olteanu, Nikolaos Vasiloglou, Dan Suciu

TL;DR

QirK answers structurally complex questions on Knowledge Graphs (KGs) by combining LLM-based translation into an intermediate representation (IR), FAISS-based semantic search, and a database-backed query engine. An LLM translates the natural language question into the IR; semantic search resolves the keywords in the IR to identifiers of matching KG entities and properties, producing an executable query graph; the graph is then translated into SPARQL or SQL and evaluated against a KG stored in PostgreSQL, yielding results with confidence-based ranking. This approach combines the interpretability of explicit query graphs with the broad coverage of LLMs, which supports transparent querying of large, intricate KG schemas such as Wikidata. The result is a KG question-answering pipeline whose intermediate steps can be inspected and that supports bidirectional translation between natural language and formal queries.

Abstract

We demonstrate QirK, a system for answering natural language questions on Knowledge Graphs (KGs). QirK can answer structurally complex questions that are still beyond the reach of emerging Large Language Models (LLMs). It does so using a unique combination of database technology, LLMs, and semantic search over vector embeddings. The glue for these components is an intermediate representation (IR). The input question is mapped to the IR using LLMs, and the IR is then repaired into a valid relational database query with the aid of semantic search over vector embeddings. This allows a practical synthesis of LLM capabilities and KG reliability. A short video demonstrating QirK is available at https://youtu.be/6c81BLmOZ0U.
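
The repair step described above, in which free-text keywords in the IR are matched to KG identifiers by semantic search, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `embed` function is a toy character-trigram embedding standing in for a learned embedding model, the brute-force cosine scan stands in for a vector index, and the `PROPERTIES` catalog is a hypothetical three-entry excerpt with Wikidata-style identifiers. It is not QirK's implementation.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding (assumption): bag of character trigrams.
    A real system would use a learned embedding model instead."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse trigram-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical catalog of KG properties: identifier -> label.
PROPERTIES = {
    "P50": "author",
    "P19": "place of birth",
    "P27": "country of citizenship",
}

def resolve(keyword: str, catalog: dict, k: int = 1) -> list:
    """Return the k catalog identifiers whose labels are most similar
    to the keyword, as (identifier, similarity) pairs, best first.
    This brute-force scan stands in for a nearest-neighbor index."""
    query = embed(keyword)
    scored = sorted(
        ((cosine(query, embed(label)), ident) for ident, label in catalog.items()),
        reverse=True,
    )
    return [(ident, score) for score, ident in scored[:k]]
```

For example, `resolve("birth place", PROPERTIES)` ranks `P19` first, because the keyword shares most of its trigrams with the label "place of birth". The similarity score returned alongside each identifier is the kind of signal that could feed a confidence-based ranking of answers.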

Paper Structure (9 sections, 2 figures)

Figures (2)

  • Figure 1: QirK Architecture. The natural language question is translated into an intermediate representation (IR) with the help of an LLM. Keywords in the IR are resolved to identifiers of semantically matching KG entities and properties to obtain an executable query graph, which is translated into SPARQL or SQL and evaluated using a database system hosting the KG.
  • Figure 2: Snapshot of QirK's user interface. (left) Users can write questions in natural language or in QirK's intermediate representation. QirK's answer is presented in tabular form. (middle) The query graph inferred by QirK from the natural language question is shown for inspection. (right) QirK displays the SPARQL and SQL queries generated from the executable query graph. The queries include identifiers of entities/properties from the underlying KG obtained using the FAISS index.
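
The last stage in Figure 1, translating an executable query graph into SPARQL or SQL, can be sketched as follows. This is a minimal illustration under stated assumptions, not QirK's actual code: the graph is represented as a list of `(subject, property, object)` edges in which terms starting with `?` are variables, the SPARQL output assumes the usual Wikidata `wd:` and `wdt:` prefixes, and the SQL output assumes a hypothetical single table `triples(s, p, o)`. The identifiers in the usage example are Wikidata-style and purely illustrative.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (subject, property id, object)

def is_var(term: str) -> bool:
    """Terms starting with '?' are query variables; others are KG identifiers."""
    return term.startswith("?")

def to_sparql(edges: List[Edge], select: List[str]) -> str:
    """Emit one triple pattern per edge of the query graph."""
    def fmt(term: str) -> str:
        return term if is_var(term) else f"wd:{term}"
    patterns = " .\n  ".join(f"{fmt(s)} wdt:{p} {fmt(o)}" for s, p, o in edges)
    return f"SELECT {' '.join(select)} WHERE {{\n  {patterns} .\n}}"

def to_sql(edges: List[Edge], select: List[str]) -> str:
    """Emit a self-join over a hypothetical triples(s, p, o) table:
    one alias per edge, constants become equality filters, and a
    variable shared by several edges becomes a join condition.
    Values are inlined only for readability; real code should use
    parameterized queries."""
    where = []
    first_seen = {}  # variable -> column reference where it was first bound
    for i, (s, p, o) in enumerate(edges):
        where.append(f"t{i}.p = '{p}'")
        for col, term in (("s", s), ("o", o)):
            ref = f"t{i}.{col}"
            if not is_var(term):
                where.append(f"{ref} = '{term}'")
            elif term in first_seen:
                where.append(f"{ref} = {first_seen[term]}")
            else:
                first_seen[term] = ref
    cols = ", ".join(f"{first_seen[v]} AS {v[1:]}" for v in select)
    tables = ", ".join(f"triples t{i}" for i in range(len(edges)))
    return f"SELECT {cols} FROM {tables} WHERE {' AND '.join(where)}"

# Illustrative query graph: works whose author was born in a given place.
edges = [("?book", "P50", "?person"), ("?person", "P19", "Q84")]
sparql = to_sparql(edges, ["?book"])
sql = to_sql(edges, ["?book"])
```

Because both queries are generated from the same edge list, the shared variable `?person` appears once as a repeated SPARQL variable and once as the SQL join condition `t1.s = t0.o`, which makes the two outputs easy to compare side by side, as in the right-hand panel of Figure 2.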