Lucy: Think and Reason to Solve Text-to-SQL
Nina Narodytska, Shay Vargaftik
TL;DR
Lucy introduces a three-phase framework that decouples natural-language understanding from automated relational reasoning to solve text-to-SQL on large, complex enterprise databases. By encoding the database schema as a dbModel and using MatchTables to identify relevant objects, GenerateView to synthesize a constraint-respecting view, and QueryView to generate final SQL from that view, Lucy achieves strong zero-shot performance on challenging benchmarks. Experimental results across ACME Insurance, BIRD, and Cloud Resources demonstrate Lucy's superior coverage and competitive execution accuracy compared to zero-shot baselines, while revealing practical failure modes and debugging opportunities. The work advances practical, explainable text-to-SQL for enterprise schemas by combining LLM capabilities with formal reasoning, avoiding fine-tuning while enabling scalable handling of complex database designs.
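The three phases named above can be pictured with a toy sketch. Everything in it is an illustrative assumption and not Lucy's implementation: the `DB_MODEL` layout, the insurance-style tables, and the function bodies are all hypothetical. Keyword matching stands in for the LLM in MatchTables, a breadth-first search over foreign keys stands in for the automated reasoning in GenerateView, and a fixed template stands in for the LLM in QueryView.

```python
from collections import deque

# Hypothetical toy dbModel: table -> columns, plus foreign-key edges
# (child table, child column, parent table, parent column).
DB_MODEL = {
    "tables": {
        "customer": ["customer_id", "name"],
        "policy": ["policy_id", "customer_id", "premium"],
        "claim": ["claim_id", "policy_id", "amount"],
    },
    "foreign_keys": [
        ("policy", "customer_id", "customer", "customer_id"),
        ("claim", "policy_id", "policy", "policy_id"),
    ],
}


def match_tables(question, db_model):
    """Phase 1 stand-in: keyword matching in place of an LLM call."""
    words = set(question.lower().replace("?", "").split())
    return sorted(t for t in db_model["tables"] if t in words or t + "s" in words)


def generate_view(tables, db_model, view_name="question_view"):
    """Phase 2 stand-in: connect the matched tables along foreign keys."""
    if not tables:
        raise ValueError("no tables matched the question")
    adjacency = {t: [] for t in db_model["tables"]}
    for child, child_col, parent, parent_col in db_model["foreign_keys"]:
        cond = f"{child}.{child_col} = {parent}.{parent_col}"
        adjacency[child].append((parent, cond))
        adjacency[parent].append((child, cond))

    # Breadth-first search from the first matched table records, for every
    # reachable table, the neighbor and join condition used to reach it.
    start = tables[0]
    reached_from = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor, cond in adjacency[node]:
            if neighbor not in reached_from:
                reached_from[neighbor] = (node, cond)
                queue.append(neighbor)

    order, joins = [start], []
    for target in tables[1:]:
        if target not in reached_from:
            raise ValueError(f"no foreign-key path from {start} to {target}")
        path, node = [], target
        while node not in order:  # walk back until an already-joined table
            prev, cond = reached_from[node]
            path.append((node, cond))
            node = prev
        for table, cond in reversed(path):  # join from the known side outward
            joins.append(f"JOIN {table} ON {cond}")
            order.append(table)

    # Alias every column as <table>_<column> so view column names are unique.
    columns = ", ".join(
        f"{t}.{c} AS {t}_{c}" for t in order for c in db_model["tables"][t]
    )
    from_clause = " ".join([start] + joins)
    return f"CREATE VIEW {view_name} AS SELECT {columns} FROM {from_clause}"


def query_view(view_name, group_by, aggregate):
    """Phase 3 stand-in: a fixed template in place of LLM-generated SQL."""
    return f"SELECT {group_by}, {aggregate} FROM {view_name} GROUP BY {group_by}"


question = "Total claim amount per customer?"
tables = match_tables(question, DB_MODEL)
view_sql = generate_view(tables, DB_MODEL)
final_sql = query_view("question_view", "customer_name", "SUM(claim_amount)")
```

In this toy run the question mentions only `claim` and `customer`, yet the view also joins `policy`, because that is the only foreign-key path between them. Recovering such intermediate tables is the kind of relational reasoning the framework delegates to an automated procedure, which leaves the final query a simple selection over a single view.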
Abstract
Large Language Models (LLMs) have made significant progress in assisting users to query databases in natural language. While LLM-based techniques provide state-of-the-art results on many standard benchmarks, their performance drops significantly when applied to large enterprise databases. The reason is that these databases have a large number of tables with complex relationships that are challenging for LLMs to reason about. We analyze challenges that LLMs face in these settings and propose a new solution that combines the power of LLMs in understanding questions with automated reasoning techniques to handle complex database constraints. Based on these ideas, we have developed a new framework that outperforms state-of-the-art techniques in zero-shot text-to-SQL on complex benchmarks.
