InteracSPARQL: An Interactive System for SPARQL Query Refinement Using Natural Language Explanations

Xiangru Jian, Zhengyuan Dong, M. Tamer Özsu

TL;DR

InteracSPARQL tackles the difficulty of writing SPARQL queries for non-experts by introducing a two-stage natural language explanation pipeline: a rule-based, AST-derived NLE that is then refined by a language model into a structured JSON explanation. The system supports interactive refinement through direct user feedback or LLM-driven self-refinement, aided by tool-based entity and property lookups to resolve IRIs in real time. Experimental results on QALD benchmarks show substantial improvements in query accuracy and explanation clarity, with human evaluators preferring the structured explanations for completeness and usefulness. Overall, InteracSPARQL demonstrates that combining deterministic, interpretable NLEs with targeted LLM refinements yields a more accessible and robust SPARQL interface for diverse users.

Abstract

In recent years, querying semantic web data using SPARQL has remained challenging, especially for non-expert users, due to the language's complex syntax and the prerequisite of understanding intricate data structures. To address these challenges, we propose InteracSPARQL, an interactive SPARQL query generation and refinement system that leverages natural language explanations (NLEs) to enhance user comprehension and facilitate iterative query refinement. InteracSPARQL integrates LLMs with a rule-based approach to first produce structured explanations directly from SPARQL abstract syntax trees (ASTs), followed by LLM-based linguistic refinements. Users can interactively refine queries through direct feedback or LLM-driven self-refinement, enabling the correction of ambiguous or incorrect query components in real time. We evaluate InteracSPARQL on standard benchmarks, demonstrating significant improvements in query accuracy, explanation clarity, and overall user satisfaction compared to baseline approaches. Our experiments further highlight the effectiveness of combining rule-based methods with LLM-driven refinements to create more accessible and robust SPARQL interfaces.
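The rule-based first stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy AST classes (`Triple`, `Filter`, `SelectQuery`), the sentence templates, the predicate names `starring` and `numberOfSeasons`, and the JSON field names are all hypothetical, chosen only to show how a deterministic walk over a query tree might yield a structured explanation. The `refine_with_llm` function is a stand-in that returns its input unchanged where a real system would presumably call a language model.

```python
import json
from dataclasses import dataclass, field


# Hypothetical toy AST; a real system would derive this from a SPARQL parser.
@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str


@dataclass
class Filter:
    variable: str
    operator: str
    value: str


@dataclass
class SelectQuery:
    variables: list
    triples: list = field(default_factory=list)
    filters: list = field(default_factory=list)


# Assumed wording for comparison operators; unknown operators pass through as-is.
OPERATOR_WORDS = {"=": "equals", ">": "is greater than", "<": "is less than"}


def explain(query: SelectQuery) -> dict:
    """Stage 1 (rule-based): map each AST node to one template sentence."""
    patterns = [f"{t.subject} has {t.predicate} {t.obj}" for t in query.triples]
    conditions = [
        f"{f.variable} {OPERATOR_WORDS.get(f.operator, f.operator)} {f.value}"
        for f in query.filters
    ]
    return {
        "returns": ", ".join(query.variables),
        "patterns": patterns,
        "conditions": conditions,
    }


def refine_with_llm(explanation: dict) -> dict:
    """Stage 2 placeholder: returns the input unchanged instead of calling a model."""
    return explanation


# Illustrative query with made-up predicate names (not taken from the paper).
query = SelectQuery(
    variables=["?show"],
    triples=[
        Triple("?show", "starring", "Rowan Atkinson"),
        Triple("?show", "numberOfSeasons", "?n"),
    ],
    filters=[Filter("?n", "=", "4")],
)
result = refine_with_llm(explain(query))
print(json.dumps(result, indent=2))
```

Because the first stage is a pure function of the tree, the same query always produces the same explanation, which is the property that makes a rule-based layer a useful anchor before any model-driven rewording.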

Paper Structure

This paper contains 40 sections, 3 figures, 5 tables, and 1 algorithm.

Figures (3)

  • Figure 1: The overview of InteracSPARQL.
  • Figure 2: The proposed pipeline for InteracSPARQL. The input is GPT-4o's raw, incorrect output for the natural language question "What is the TV-show that starred Rowan Atkinson, had 4 seasons and started in 1983?". The output query produced by InteracSPARQL (see the paper's query example listing) is identical to the ground truth.
  • Figure 3: Human evaluation results for the three conditions. Each subfigure shows mean dimension scores (left) and head-to-head percentages (right) for OD, BQB, NFS and BQFS.

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3