CHESS: Contextual Harnessing for Efficient SQL Synthesis

Shayan Talaei; Mohammadreza Pourreza; Yu-Chen Chang; Azalia Mirhoseini; Amin Saberi

TL;DR

CHESS addresses the challenging text-to-SQL problem in real-world databases by decomposing the task into entity/context retrieval, schema selection, and query generation within an end-to-end LLM-driven pipeline. It introduces a four-agent framework for retrieval, pruning, generation, and validation, coupled with preprocessing (LSH and vector databases) to handle large catalogs with minimal necessary context. The method demonstrates state-of-the-art performance among disclosed methods on the BIRD benchmark and strong results on Spider, with an open-source variant achieving competitive accuracy and preserving data privacy. Key contributions include a scalable hierarchical retrieval strategy, a three-stage schema pruning protocol, a fine-tuned candidate generator with noise-injected data, and a revision loop guided by execution feedback and error analysis. CHESS shows substantial practical potential for industrial deployment by reducing token usage, preserving privacy, and delivering high-quality SQL synthesis across diverse domains.

Abstract

Translating natural language questions into SQL queries, known as text-to-SQL, is a long-standing research problem. Effective text-to-SQL synthesis can become very challenging due to (i) the extensive size of database catalogs (descriptions of tables and their columns) and database values, (ii) reasoning over large database schemas, (iii) ensuring the functional validity of the generated queries, and (iv) navigating the ambiguities of natural language questions. We introduce CHESS, a Large Language Model (LLM) based multi-agent framework for efficient and scalable SQL synthesis, comprising four specialized agents, each targeting one of the aforementioned challenges: the Information Retriever (IR) extracts relevant data, the Schema Selector (SS) prunes large schemas, the Candidate Generator (CG) generates high-quality candidates and refines queries iteratively, and the Unit Tester (UT) validates queries through LLM-based natural language unit tests. Our framework offers configurable features that adapt to various deployment constraints, including 1) Supporting industrial-scale databases: leveraging the Schema Selector agent, CHESS efficiently narrows down very large database schemas into manageable sub-schemas, boosting system accuracy by approximately $2\%$ and reducing the number of LLM tokens by a factor of $5$. 2) State-of-the-art privacy-preserving performance: among the methods using open-source models, CHESS achieves state-of-the-art performance, resulting in a high-performing, privacy-preserving system suitable for industrial deployment. 3) Scalability with additional compute budget: in settings with high computational budgets, CHESS achieves $71.10\%$ accuracy on the BIRD test set, within $2\%$ of the leading proprietary method, while requiring approximately $83\%$ fewer LLM calls.
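To make the retrieval step more concrete, the following is a minimal sketch of how a keyword from a user question could be matched against stored database values with MinHash-based locality-sensitive hashing (LSH), the kind of preprocessing the TL;DR mentions. It is an illustration under stated assumptions, not the authors' implementation: the function names (`shingles`, `minhash_signature`, `build_index`, `retrieve`), the character 3-gram shingling, the band layout, and the sample values are all hypothetical choices made for this example.

```python
import hashlib
from collections import defaultdict


def shingles(text, n=3):
    """Return the set of character n-grams of a lowercased string."""
    s = text.lower()
    if len(s) < n:
        return {s}
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity between the shingle sets of two strings."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def minhash_signature(text, num_perm=64):
    """For each seeded hash function, keep the minimum hash over all shingles."""
    grams = shingles(text)
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return tuple(signature)


def build_index(values, num_perm=64, bands=16):
    """Preprocessing: bucket each database value by every band of its signature."""
    rows = num_perm // bands
    index = defaultdict(set)
    for value in values:
        sig = minhash_signature(value, num_perm)
        for b in range(bands):
            index[(b, sig[b * rows:(b + 1) * rows])].add(value)
    return index


def retrieve(keyword, index, num_perm=64, bands=16, top_k=3):
    """Collect values sharing at least one bucket, then rank by exact Jaccard."""
    rows = num_perm // bands
    sig = minhash_signature(keyword, num_perm)
    candidates = set()
    for b in range(bands):
        candidates |= index.get((b, sig[b * rows:(b + 1) * rows]), set())
    ranked = sorted(candidates, key=lambda v: jaccard(keyword, v), reverse=True)
    return ranked[:top_k]


# Hypothetical database values, invented for illustration only.
db_values = [
    "Fresno County Office of Education",
    "Alameda County Office of Education",
    "Los Angeles Unified",
]
index = build_index(db_values)
print(retrieve("fresno county office of education", index))
```

The index is built once per database, so each question only pays for hashing its keywords and ranking a small candidate set instead of scanning every stored value. A production system would likely tune the number of bands and rows to trade recall against candidate-set size, and might combine this with an edit-distance or embedding-based re-ranking step.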

Paper Structure (62 sections, 15 figures, 7 tables)

Introduction
Related Work
Methodology
Entity and Context Retrieval
Keyword Extraction.
Entity Retrieval.
Context Retrieval.
Schema Selection
Individual Column Filtering.
Table Selection.
Final Column Selection.
Query Generation
Candidate Generation.
Revision.
Preprocessing
...and 47 more sections

Figures (15)

Figure 1: Examples of challenges in text-to-SQL translation. 1) User questions might not contain the exact database value. 2) Column names might not represent what they store well, so using database catalogs is an essential part of text-to-SQL translation. 3) For a given question, there are multiple ways of writing a correct SQL answer.
Figure 2: Our pipeline with modules for entity and context retrieval, schema selection, and query generation.
Figure 3: An example of the revise tool to fix missing columns in a candidate query.
Figure 4: Template for the extract_keyword tool.
Figure 5: Template for the select_tables tool.
...and 10 more figures
