QirK: Question Answering via Intermediate Representation on Knowledge Graphs
Jan Luca Scheerer, Anton Lykov, Moe Kayali, Ilias Fountalis, Dan Olteanu, Nikolaos Vasiloglou, Dan Suciu
TL;DR
QirK answers structurally complex questions on Knowledge Graphs by combining LLM-based translation to an intermediate representation with FAISS-based semantic search and a database-backed query engine. The system translates a natural language question into a query graph, repairs that graph into a valid SPARQL or SQL query, and evaluates the query against a KG stored in PostgreSQL, returning results ranked by confidence. This approach pairs the interpretability of explicit query graphs with the broad coverage of LLMs, enabling reliable, transparent, and scalable querying of large, intricate KG schemas such as Wikidata. The practical result is a KG QA pipeline that handles structural complexity, lets users inspect intermediate steps, and supports bidirectional translation between natural language and formal queries.
Abstract
We demonstrate QirK, a system for answering natural language questions on Knowledge Graphs (KG). QirK can answer structurally complex questions that are still beyond the reach of emerging Large Language Models (LLMs). It does so using a unique combination of database technology, LLMs, and semantic search over vector embeddings. The glue for these components is an intermediate representation (IR). An LLM maps the input question to the IR, and the IR is then repaired into a valid relational database query with the aid of semantic search over vector embeddings. This allows a practical synthesis of LLM capabilities and KG reliability. A short video demonstrating QirK is available at https://youtu.be/6c81BLmOZ0U.
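The flow described above (question to IR, IR repaired against the schema, repaired IR compiled to a relational query) can be illustrated with a minimal sketch. It is not QirK's implementation: the IR shape (a list of `(predicate, subject, object)` atoms with `?`-prefixed variables), the `triples(subject, predicate, object)` table, the three-entry `PROPERTIES` schema, and the helper names are all assumptions made for illustration. A toy bag-of-words similarity stands in for semantic search over vector embeddings, and the IR is written by hand where an LLM would produce it.

```python
import math
from collections import Counter

# Hypothetical miniature schema: property id -> label (Wikidata-style ids, for illustration).
PROPERTIES = {
    "P57": "director",
    "P161": "cast member",
    "P19": "place of birth",
}

def embed(text):
    """Toy bag-of-words vector; a real system would use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def repair_predicate(name):
    """Map a free-form predicate name from the IR to the closest schema property id."""
    query = embed(name.replace("_", " "))
    return max(PROPERTIES, key=lambda pid: cosine(query, embed(PROPERTIES[pid])))

def ir_to_sql(atoms, answer_var):
    """Compile IR atoms into a self-join over an assumed triples(subject, predicate, object) table."""
    froms, wheres, seen = [], [], {}
    for i, (pred, subj, obj) in enumerate(atoms):
        alias = f"t{i}"
        froms.append(f"triples AS {alias}")
        wheres.append(f"{alias}.predicate = '{repair_predicate(pred)}'")
        for column, term in (("subject", subj), ("object", obj)):
            ref = f"{alias}.{column}"
            if term.startswith("?"):
                if term in seen:
                    # A repeated variable becomes a join condition.
                    wheres.append(f"{ref} = {seen[term]}")
                else:
                    seen[term] = ref
            else:
                # Constants are inlined for readability only; real code should use bound parameters.
                wheres.append(f"{ref} = '{term}'")
    return (f"SELECT DISTINCT {seen[answer_var]} FROM {', '.join(froms)} "
            f"WHERE {' AND '.join(wheres)}")

# Hand-written stand-in for LLM output for a question like
# "Which film directors were born in Paris?" (Q90 is used as the entity id for Paris).
ir = [("film_director", "?film", "?person"), ("birth_place", "?person", "Q90")]
sql = ir_to_sql(ir, "?person")
print(sql)
```

The point of the sketch is the division of labor: the LLM only needs to produce a structurally plausible IR with approximate predicate names, while the repair step grounds those names in the actual schema before any query runs.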
