SQL-Exchange: Transforming SQL Queries Across Domains

Mohammadreza Daviran, Brian Lin, Davood Rafiei

TL;DR

This work introduces SQL-Exchange, a framework for mapping SQL queries across different database schemas by preserving the source query structure while adapting domain-specific elements to align with the target schema, and investigates the conditions under which such mappings are feasible and beneficial.

Abstract

We introduce SQL-Exchange, a framework for mapping SQL queries across different database schemas by preserving the source query structure while adapting domain-specific elements to align with the target schema. We investigate the conditions under which such mappings are feasible and beneficial, and examine their impact on enhancing the in-context learning performance of text-to-SQL systems as a downstream task. Our comprehensive evaluation across multiple model families and benchmark datasets -- assessing structural alignment with source queries, execution validity on target databases, and semantic correctness -- demonstrates that SQL-Exchange is effective across a wide range of schemas and query types. Our results further show that both in-context prompting with mapped queries and fine-tuning on mapped data consistently yield higher text-to-SQL performance than using examples drawn directly from the source schema.
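The notion of "preserving the source query structure" can be made concrete with a rough sketch. The snippet below is an illustration only, not the authors' implementation: it assumes a skeleton is obtained by masking identifiers and literals while keeping SQL keywords and operators. The function name `sql_skeleton`, the keyword list, and the example tables (`singer`, `movie`, `orders`, `customers`) are hypothetical and chosen for this example.

```python
import re

# Small, non-exhaustive keyword set (an assumption made for this sketch).
SQL_KEYWORDS = {
    "SELECT", "FROM", "WHERE", "JOIN", "ON", "GROUP", "BY", "ORDER",
    "HAVING", "LIMIT", "AS", "AND", "OR", "NOT", "IN", "DISTINCT",
    "COUNT", "SUM", "AVG", "MIN", "MAX", "DESC", "ASC", "INNER", "LEFT",
}

# Tokens: string literals, numbers, (dotted) identifiers, operators/punctuation.
TOKEN_RE = re.compile(
    r"'[^']*'"
    r"|\d+(?:\.\d+)?"
    r"|[A-Za-z_][A-Za-z_0-9]*(?:\.[A-Za-z_][A-Za-z_0-9]*)*"
    r"|<=|>=|<>|!=|[=<>(),*]"
)


def sql_skeleton(query: str) -> str:
    """Mask schema-specific tokens with '_' and keep keywords and operators."""
    out = []
    for tok in TOKEN_RE.findall(query):
        if tok.upper() in SQL_KEYWORDS:
            out.append(tok.upper())          # structural keyword: keep
        elif tok[0] == "'" or tok[0].isdigit():
            out.append("_")                  # literal: mask
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("_")                  # table/column name: mask
        else:
            out.append(tok)                  # operator or punctuation: keep
    return " ".join(out)


# A source query and a hypothetical mapping onto a different schema.
source = "SELECT name FROM singer WHERE age > 30"
mapped = "SELECT title FROM movie WHERE rating > 7.5"
same_structure = sql_skeleton(source) == sql_skeleton(mapped)

# A mapping that drops a JOIN no longer shares the source skeleton.
with_join = "SELECT COUNT(*) FROM orders JOIN customers ON orders.cid = customers.id"
without_join = "SELECT COUNT(*) FROM orders"
drifted = sql_skeleton(with_join) != sql_skeleton(without_join)
```

Under these assumptions, the first pair shares a skeleton while the second pair does not, which is the kind of structural alignment check the abstract alludes to. A real system would more likely compare parsed syntax trees than regex tokens.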

Paper Structure

This paper contains 40 sections, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Illustration of three natural language queries, their corresponding SQL translations, and their shared SQL skeleton, demonstrating structural similarity across different database schemas.
  • Figure 2: Examples of structural drift and schema leakage in zero-shot mapping using Gemini. In the first case, schema elements (in red)—such as column names and literals—are inappropriately copied from the source into the target query (e.g., no price column in target). In the second case, the target query omits a necessary JOIN clause from the source, leading to a loss of structural fidelity.
  • Figure 3: Frequency of structural changes, broken down by SQL construct, for the two benchmarks (BIRD, SPIDER) and two LLMs (Gemini-1.5-flash, GPT-4o-mini). Each bar is stacked by deletions, insertions, and mutations. Since a single query may involve multiple types of edits across different clauses, bucket totals may exceed the number of modified queries. This visualization reveals which parts of the SQL skeleton are most frequently altered during schema adaptation. Note that the charts are not on the same scale, and the dataset used for the SPIDER–Gemini setting is larger than that for SPIDER–GPT.
  • Figure 4: Representative examples of structural edits during query mapping by SQL-Exchange. (A) Introduction of the STRFTIME function to enable temporal filtering on string-based timestamps; (B) Removal of a join when the target schema supports a direct column-based formulation of the counting logic. Examples are drawn from the SPIDER and BIRD training sets and mapped to target schemas from their respective development sets.
  • Figure 5: Comparison of SQL-Exchange and Zero-Shot prompting across semantic and structural metrics.
  • ...and 2 more figures