MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL
Arian Askari, Christian Poelitz, Xinye Tang
TL;DR
This work tackles self-correction in text-to-SQL by automatically generating tailored guidelines with MAGIC, a three-agent framework that iteratively analyzes incorrect initial SQL queries and synthesizes the resulting corrective feedback into a usable guideline. On the Spider and BIRD datasets, the approach improves Execution Accuracy ($EX$) over human-crafted guidelines and existing baselines, and its effectiveness scales with the amount of aggregated feedback up to a practical batch size. The manager agent further reduces the number of iterations and improves the quality of the corrected SQL, and the framework applies to multiple initial text-to-SQL methods without requiring model fine-tuning. The study provides extensive experiments, ablations, and open-source resources to support reproducibility and further exploration of automated self-correction in in-context learning for text-to-SQL.
Abstract
Self-correction in text-to-SQL is the process of prompting a large language model (LLM) to revise SQL that it previously generated incorrectly. It commonly relies on self-correction guidelines crafted manually by human experts, which are not only labor-intensive to produce but also limited by the human ability to identify all potential error patterns in LLM responses. We introduce MAGIC, a novel multi-agent method that automates the creation of the self-correction guideline. MAGIC uses three specialized agents: a manager agent, a correction agent, and a feedback agent. These agents collaborate on the failures of an LLM-based method on the training set to iteratively generate and refine a self-correction guideline tailored to the LLM's mistakes, mirroring the human process but without human involvement. Our extensive experiments show that MAGIC's guideline outperforms guidelines created by human experts. We also find empirically that the guideline produced by MAGIC enhances the interpretability of the corrections made, providing insight into the reasons behind the failures and successes of LLMs in self-correction. All agent interactions are publicly available at https://huggingface.co/datasets/microsoft/MAGIC.
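The collaboration among the three agents can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names (`build_guideline`, `same_result`), the agent signatures, the `max_iters` and `batch_size` parameters, and the choice to give the feedback agent access to the gold SQL are all hypothetical, and the agents are simple stubs standing in for LLM calls. Correctness is checked by comparing execution results on a toy SQLite database.

```python
import sqlite3
from typing import Callable, List, Tuple

# Hypothetical agent signatures; in practice each would wrap an LLM call.
FeedbackAgent = Callable[[str, str, str], str]    # (question, wrong_sql, gold_sql) -> feedback
CorrectionAgent = Callable[[str, str, str], str]  # (question, wrong_sql, feedback) -> revised SQL
ManagerAgent = Callable[[str, List[str]], str]    # (guideline, feedback batch) -> new guideline


def same_result(conn: sqlite3.Connection, sql_a: str, sql_b: str) -> bool:
    """Return True when both queries execute and yield the same rows (order ignored)."""
    try:
        rows_a = conn.execute(sql_a).fetchall()
        rows_b = conn.execute(sql_b).fetchall()
    except sqlite3.Error:
        return False
    return sorted(rows_a, key=repr) == sorted(rows_b, key=repr)


def build_guideline(
    conn: sqlite3.Connection,
    failures: List[Tuple[str, str, str]],  # (question, incorrect SQL, gold SQL)
    feedback_agent: FeedbackAgent,
    correction_agent: CorrectionAgent,
    manager_agent: ManagerAgent,
    max_iters: int = 3,   # assumed cap on feedback/correction rounds
    batch_size: int = 2,  # assumed number of feedbacks aggregated per update
) -> str:
    guideline = ""
    batch: List[str] = []
    for question, wrong_sql, gold_sql in failures:
        sql = wrong_sql
        for _ in range(max_iters):
            feedback = feedback_agent(question, sql, gold_sql)
            sql = correction_agent(question, sql, feedback)
            if same_result(conn, sql, gold_sql):
                batch.append(feedback)  # keep only feedback that led to a fix
                break
        if len(batch) >= batch_size:
            guideline = manager_agent(guideline, batch)
            batch = []
    if batch:  # fold in any feedback left over after the last full batch
        guideline = manager_agent(guideline, batch)
    return guideline


def demo() -> str:
    """Run the loop on one toy failure with stub agents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ana", 35), ("Bo", 45)])
    failures = [(
        "How many singers are older than 40?",
        "SELECT COUNT(*) FROM singer WHERE age > 30",
        "SELECT COUNT(*) FROM singer WHERE age > 40",
    )]
    feedback_agent = lambda q, wrong, gold: (
        "Check the threshold in the WHERE clause against the question."
    )
    correction_agent = lambda q, wrong, fb: wrong.replace("> 30", "> 40")
    manager_agent = lambda g, fbs: (g + "\n" if g else "") + "\n".join(f"- {fb}" for fb in fbs)
    return build_guideline(conn, failures, feedback_agent, correction_agent, manager_agent)
```

Calling `demo()` returns a one-item guideline built from the single successful feedback. With real LLM-backed agents, the manager would be expected to rewrite and merge feedback rather than simply append it.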
