Blar-SQL: Faster, Stronger, Smaller NL2SQL

José Manuel Domínguez, Benjamín Errázuriz, Patricio Daher

TL;DR

This work tackles NL2SQL for large databases by decomposing the problem into two tasks: inferring schema links (Schema-Link) and constructing the final SQL (SQL model). It introduces a schema chunking framework to manage limited context and tests two hypotheses about how the models interact (Non-trusting vs Trusting), using two open-source models: Llama-2 for schema linking and Code Llama for SQL generation. Across Vanilla, zero-shot fine-tune, and decomposed configurations, the approach shows that task division and context management can yield substantial gains, with non-destructive chunking reaching roughly 46–49% accuracy on the Bird Dev set. This approaches GPT-4 performance with a model that is 135x smaller, 90x faster, and over 100x cheaper. The results suggest a practical, scalable path for open-source NL2SQL systems and highlight the importance of careful prompt design and multi-model collaboration when working with context-limited architectures.

Abstract

Large Language Models (LLMs) have gained considerable prominence in natural language to SQL (NL2SQL) tasks. In this study, we show how task decomposition can greatly benefit LLMs in database understanding and query generation when answering human questions with an SQL query. We fine-tuned open-source models, specifically Llama-2 and Code Llama, and combined two different models, each dedicated to one of two tasks, in order to leverage each model's core competency and further increase the accuracy of the final SQL query. We propose a new framework that divides the schema into chunks in order to fit more information into a limited context. Our results are comparable with those obtained by GPT-4, while our approach is 135 times smaller, 90 times faster, and more than 100 times cheaper than GPT-4.
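
The chunking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the greedy packing strategy, the whitespace token estimate, and the names `chunk_schema`, `link_schema`, and `linker` are all assumptions made for illustration. The sketch keeps each table definition whole (one possible reading of "non-destructive"), runs a stand-in schema-link function on each chunk, and merges the links that a separate SQL model would then consume.

```python
from typing import Callable, List


def approx_tokens(text: str) -> int:
    """Crude token estimate (whitespace split); a real tokenizer would differ."""
    return len(text.split())


def chunk_schema(
    tables: List[str],
    budget: int,
    count: Callable[[str], int] = approx_tokens,
) -> List[List[str]]:
    """Greedily pack whole table definitions into chunks of at most `budget` tokens.

    Assumption: a table is never split. A table larger than the budget
    ends up alone in its own (over-budget) chunk.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for ddl in tables:
        cost = count(ddl)
        # Start a new chunk when adding this table would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(ddl)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def link_schema(
    question: str,
    tables: List[str],
    budget: int,
    linker: Callable[[str, str], List[str]],
) -> List[str]:
    """Call a hypothetical schema-link model once per chunk and merge the links.

    `linker(question, schema_text)` stands in for the fine-tuned
    schema-link model; order of first appearance is preserved.
    """
    links: List[str] = []
    for chunk in chunk_schema(tables, budget):
        for link in linker(question, "\n".join(chunk)):
            if link not in links:
                links.append(link)
    return links
```

In a pipeline of this shape, the merged links, rather than the full schema, would be placed in the prompt of the SQL-generation model, which is how the limited context gets stretched over a large database.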

Paper Structure

This paper contains 25 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Example of a Schema Link Error
  • Figure 2: Example of a database content error
  • Figure 3: Prediction solution architecture
  • Figure 4: Schema Link Generation Process
  • Figure 5: Schema chunking process