Reducing Hallucinations in Language Model-based SPARQL Query Generation Using Post-Generation Memory Retrieval
Aditya Sharma, Luis Lara, Christopher J. Pal, Amal Zouaq
TL;DR
This paper tackles hallucinations in LLM-based SPARQL query generation by introducing PGMR, a modular framework that grounds knowledge-graph URIs through a non-parametric memory after an LLM produces an intermediate query with labeled URIs. SPARQL queries are transformed into intermediate forms in which URI labels are marked with starturi/enduri tokens, and a memory-backed retriever then grounds each label to a KG URI. This decouples syntax generation from exact URI selection, which mitigates incorrect URIs and improves robustness to out-of-distribution data. Across LC-QuAD 2.0 and QALD-10, PGMR achieves substantial gains in Query EM and URI EM while sharply reducing URI Hallucination, with near-complete elimination in several settings, including unknown URI splits and complex queries. The approach offers a practical path to more reliable KGQA in real-world, untagged data scenarios, using external memory and embedding-based retrieval to constrain LLM outputs to grounded KG elements.
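The grounding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the exact token layout ("starturi <label> enduri"), the tiny `MEMORY` dictionary, and the helper names `retrieve` and `ground` are all assumptions made for illustration, and plain string similarity from Python's standard library stands in for the embedding-based retrieval the summary mentions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical memory: KG labels mapped to Wikidata-style URIs (illustrative entries only).
MEMORY = {
    "Barack Obama": "wd:Q76",
    "spouse": "wdt:P26",
    "place of birth": "wdt:P19",
}

# Assumed intermediate format: each URI appears as "starturi <label> enduri".
URI_SPAN = re.compile(r"starturi\s+(.*?)\s+enduri")


def retrieve(label: str) -> str:
    """Return the URI whose stored label is most similar to `label`.

    String similarity is a stand-in here; an embedding-based retriever
    would replace this function.
    """
    best = max(
        MEMORY,
        key=lambda known: SequenceMatcher(None, label.lower(), known.lower()).ratio(),
    )
    return MEMORY[best]


def ground(intermediate_query: str) -> str:
    """Replace every labeled span with a URI retrieved from memory."""
    return URI_SPAN.sub(lambda m: retrieve(m.group(1)), intermediate_query)


intermediate = "SELECT ?x WHERE { starturi Barack Obama enduri starturi spouse enduri ?x }"
grounded = ground(intermediate)
print(grounded)
```

Because every URI in the final query is copied out of the memory rather than generated token by token, a slightly wrong label (for example "Barak Obama") can still resolve to a valid entry, and the model cannot emit a URI that does not exist in the memory. Whether the nearest entry is the correct one still depends on the quality of the retriever.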
Abstract
The ability to generate SPARQL queries from natural language questions is crucial for ensuring efficient and accurate retrieval of structured data from knowledge graphs (KGs). While large language models (LLMs) have been widely adopted for SPARQL query generation, they are often susceptible to hallucinations and out-of-distribution errors when producing KG elements such as Uniform Resource Identifiers (URIs) based on internal parametric knowledge. The resulting content often appears plausible but is factually incorrect, posing significant challenges for the use of LLMs in real-world information retrieval (IR) applications, and has prompted increased research aimed at detecting and mitigating such errors. In this paper, we introduce PGMR (Post-Generation Memory Retrieval), a modular framework that incorporates a non-parametric memory module to retrieve KG elements and enhance LLM-based SPARQL query generation. Our experimental results indicate that PGMR consistently delivers strong performance across diverse datasets, data distributions, and LLMs. Notably, PGMR significantly mitigates URI hallucinations, nearly eliminating the problem in several scenarios.
