MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, LinZheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li

TL;DR

MAC-SQL is a multi-agent framework for Text-to-SQL that combines a Selector, a Decomposer, and a Refiner to handle large databases and questions that require multi-step reasoning. With GPT-4 as the backbone, it achieves state-of-the-art execution accuracy on BIRD (59.59 on the holdout test set) and generalizes well to Spider. The authors also fine-tune SQL-Llama, built on Code Llama 7B and trained on an Agent-Instruct dataset, which reaches an execution accuracy of 43.94, close to the 46.35 of vanilla GPT-4 despite being a lighter model. The work demonstrates the value of tool-assisted reasoning and modular agent collaboration, and it provides open-source resources for Text-to-SQL. The combination of external tool usage, few-shot reasoning, and targeted schema reduction offers a scalable path for querying real-world databases in natural language.

Abstract

Recent LLM-based Text-to-SQL methods usually suffer from significant performance degradation on "huge" databases and complex user questions that require multi-step reasoning. Moreover, most existing methods neglect the crucial significance of LLMs utilizing external tools and model collaboration. To address these challenges, we introduce MAC-SQL, a novel LLM-based multi-agent collaborative framework. Our framework comprises a core decomposer agent for Text-to-SQL generation with few-shot chain-of-thought reasoning, accompanied by two auxiliary agents that utilize external tools or models to acquire smaller sub-databases and refine erroneous SQL queries. The decomposer agent collaborates with auxiliary agents, which are activated as needed and can be expanded to accommodate new features or tools for effective Text-to-SQL parsing. In our framework, we initially leverage GPT-4 as the strong backbone LLM for all agent tasks to determine the upper bound of our framework. We then fine-tune an open-source instruction-following model, SQL-Llama, built on Code Llama 7B, to accomplish all tasks as GPT-4 does. Experiments show that SQL-Llama achieves a comparable execution accuracy of 43.94, compared to the baseline accuracy of 46.35 for vanilla GPT-4. At the time of writing, MAC-SQL+GPT-4 achieves an execution accuracy of 59.59 when evaluated on the BIRD benchmark, establishing a new state-of-the-art (SOTA) on its holdout test set (https://github.com/wbbeyourself/MAC-SQL).
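
The three-agent flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the prompt tags, the retry limit, and the `fake_llm` stand-in are all assumptions made for illustration, and SQLite is used here only as one possible "external tool" for executing queries. The sketch also assumes the Refiner reacts only to execution errors.

```python
import sqlite3
from typing import Callable, Optional, Set

# Hypothetical LLM interface: takes a prompt string, returns the model's text.
LLM = Callable[[str], str]


def describe_schema(conn: sqlite3.Connection, tables: Optional[Set[str]] = None) -> str:
    """Return CREATE TABLE statements, optionally limited to `tables`."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return "\n".join(sql for name, sql in rows if tables is None or name in tables)


def selector(conn: sqlite3.Connection, question: str, llm: LLM) -> str:
    """Selector: reduce the full schema to the tables the model deems relevant."""
    full = describe_schema(conn)
    reply = llm(f"[selector]\nSchema:\n{full}\nQuestion: {question}\n"
                "List the relevant tables, comma-separated.")
    wanted = {t.strip() for t in reply.split(",") if t.strip()}
    # Fall back to the full schema if no named table exists.
    return describe_schema(conn, wanted) or full


def decomposer(schema: str, question: str, llm: LLM) -> str:
    """Decomposer: ask for step-by-step sub-questions ending in one SQL query."""
    return llm(f"[decomposer]\nSchema:\n{schema}\nQuestion: {question}\n"
               "Break the question into sub-questions, then give the final SQL.").strip()


def refiner(conn: sqlite3.Connection, schema: str, question: str,
            sql: str, llm: LLM, max_rounds: int = 3) -> str:
    """Refiner: execute the SQL and feed any error back to the model."""
    for _ in range(max_rounds):
        try:
            conn.execute(sql).fetchall()
            return sql  # executed without error, accept it
        except sqlite3.Error as exc:
            sql = llm(f"[refiner]\nSchema:\n{schema}\nQuestion: {question}\n"
                      f"SQL: {sql}\nError: {exc}\nReturn a corrected SQL query.").strip()
    return sql  # give up after max_rounds and return the last attempt


def mac_sql(conn: sqlite3.Connection, question: str, llm: LLM) -> str:
    """Chain the three agents: Selector, then Decomposer, then Refiner."""
    schema = selector(conn, question, llm)
    sql = decomposer(schema, question, llm)
    return refiner(conn, schema, question, sql, llm)


def fake_llm(prompt: str) -> str:
    """Scripted stand-in for a real model, used only to exercise the flow."""
    if prompt.startswith("[selector]"):
        return "schools"
    if prompt.startswith("[decomposer]"):
        return "SELECT nme FROM schools"  # deliberately faulty column name
    return "SELECT name FROM schools ORDER BY name"


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schools (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE audit_log (id INTEGER PRIMARY KEY, note TEXT);
INSERT INTO schools (name) VALUES ('Alpha'), ('Beta');
""")
final_sql = mac_sql(conn, "List school names", fake_llm)
print(final_sql)
```

In this toy run the Selector drops the unrelated `audit_log` table, the Decomposer's first query fails on a misspelled column, and the Refiner passes the SQLite error back to the model to obtain a query that executes. A real setup would replace `fake_llm` with calls to GPT-4 or a fine-tuned model.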

Paper Structure (37 sections, 5 equations, 6 figures, 4 tables, 1 algorithm)

Figures (6)

  • Figure 1: A complex example of Text-to-SQL. In the Gold SQL, we use SAT_Excellence_Rate to represent "CAST(NumGE1500 AS REAL)/NumTstTakr" for the sake of brevity.
  • Figure 2: The overview of our MAC-SQL framework, which comprises three agents: (i) the Selector, which decomposes a large database into a smaller sub-database to mitigate the interference of irrelevant information; (ii) the Decomposer, which breaks down a complex question into simpler sub-questions and resolves them progressively by chain-of-thought reasoning; and (iii) the Refiner, which uses an external tool to execute SQL and obtain feedback, then refines faulty SQL queries.
  • Figure 3: The Decomposer Agent Illustration.
  • Figure 4: The Refiner Agent Illustration.
  • Figure 5: Error Distributions of MAC-SQL on dev set of BIRD and Spider.
  • ...and 1 more figures