The Case for Intent-Based Query Rewriting

Gianna Lisa Nicolai, Patrick Hansert, Sebastian Michel

TL;DR

INQURE introduces intent-based query rewriting to enable queries over data where original tables are inaccessible or costly, preserving user insights rather than exact results. The system uses a two-phase workflow: Phase 1 filters candidate tables and generates multiple intent-preserving rewrites via an LLM, and Phase 2 prunes non-joinable rewrites, ranks by intent and structure (including MMR with $\lambda=0.7$), and corrects executable rewrites. Evaluation on the Spider benchmark (over 900 schemas) and a user study shows that meaningful rewrites are often possible, with NL-based rewrites performing well and a manageable cost and latency profile under the proposed filtering and ranking regime. The work demonstrates practical potential for querying inaccessible or expensive data sources, highlights trade-offs between LLM cost, latency, and rewrite quality, and points to future directions such as local LLM deployments and deeper integration with analytical frameworks. The findings suggest that intent-based rewriting can enable diverse, insight-preserving queries, while acknowledging challenges in table-filter recall, LLM hallucinations, and the need for scalable, low-latency implementations.
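The ranking step mentions MMR (maximal marginal relevance) with $\lambda=0.7$. As a rough illustration of how such a trade-off between intent relevance and diversity could work, here is a minimal sketch of the standard MMR selection loop. Everything beyond the value of $\lambda$ is an assumption made for illustration: the function names `cosine` and `mmr_rank`, the use of cosine similarity over embedding vectors, and the toy relevance scores are hypothetical and not taken from INQURE.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_rank(relevance, vectors, lam=0.7, k=None):
    """Return candidate indices in MMR order.

    relevance: assumed intent-relevance score per candidate rewrite.
    vectors:   assumed embedding per candidate, used to measure redundancy.
    lam:       weight on relevance; (1 - lam) weights the redundancy penalty.
    """
    remaining = list(range(len(relevance)))
    selected = []
    k = len(remaining) if k is None else min(k, len(remaining))

    while len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data (hypothetical): candidates 0 and 1 are near-duplicates, 2 is different.
relevance = [0.9, 0.85, 0.6]
vectors = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]]
ranking = mmr_rank(relevance, vectors, lam=0.7)
```

In this toy setup, the less relevant but dissimilar candidate 2 is ranked ahead of candidate 1, because candidate 1 is nearly identical to the already selected candidate 0. With `lam=1.0` the ordering falls back to pure relevance.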

Abstract

In this work, we describe the concept of intent-based query rewriting and present a first viable solution. The aim is to allow rewrites to alter the structure and syntactic outcome of an original query while keeping the obtainable insights intact. This differs drastically from traditional query rewriting, which typically aims to decrease query evaluation time by applying strict equivalence rules and optimization heuristics to the query plan. Rewriting a query into one that provides a similar insight, but may otherwise be entirely different, can help when the original data tables are inaccessible due to access control or privacy, or when data access is expensive in terms of monetary cost or remote access. In this paper, we put forward INQURE, a system designed for INtent-based QUery REwriting. It uses a large language model (LLM) for query understanding and human-like derivation of alternate queries. Around the LLM, INQURE employs upfront table filtering and subsequent candidate rewrite pruning and ranking. We report the results of an evaluation using a benchmark set of over 900 database table schemas and, based on a user study, discuss the pros and cons of alternate approaches regarding runtime and rewrite quality.

Paper Structure

This paper contains 26 sections, 11 figures.

Figures (11)

  • Figure 1: Four tables with different information about Berlin districts. Intuitively, if not all are accessible by a user, main insights can be drawn from the others. A case for intent-based query rewriting.
  • Figure 2: Overall workflow consisting of two consecutive phases: (1) Table Filtering and Rewriting and (2) Candidate Cleanup and Execution
  • Figure 3: Precision per Rewriting Approach
  • Figure 4: Precision per Input Query
  • Figure 5: Precision of the Top-Ranked Results by Approach
  • ...and 6 more figures