Abacus-SQL: A Text-to-SQL System Empowering Cross-Domain and Open-Domain Database Retrieval

Keyan Xu, Dingzirui Wang, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che

TL;DR

Abacus-SQL tackles the limitations of prior text-to-SQL systems by enabling retrieval across open-domain databases and improving cross-domain transferability. It introduces a three-phase pipeline: Preprocess (open-domain retrieval via Murre, demonstration augmentation via Fused, and schema extraction), Multi-Turn Text-to-SQL (prompting, Pre-SQL, Self-Debug), and Presentation (inference visualization and real-time results). The approach is validated on Chase-C, SParC, and CoSQL, with ablations showing that Pre-SQL and Self-Debug improve accuracy, particularly for Chinese queries. The system combines a Streamlit frontend and FastAPI backend, supports remote LLMs, and delivers real-time, transparent query generation, offering practical benefits for cross-domain database querying.

Abstract

Existing text-to-SQL systems have made significant progress in SQL query generation, but they still face several challenges. They often lack retrieval capabilities for open-domain databases, requiring users to manually filter relevant databases. Their cross-domain transferability is also limited, making it difficult to accommodate diverse query requirements. To address these issues, we propose Abacus-SQL. Abacus-SQL uses database retrieval to accurately locate the required databases in an open-domain database environment. It also improves the system's cross-domain transferability through data augmentation. Moreover, Abacus-SQL employs Pre-SQL and Self-debug methods to improve the accuracy of its SQL queries. Experimental results show that Abacus-SQL performs well on multi-turn text-to-SQL tasks, validating the approach. Abacus-SQL is publicly accessible at https://huozi.8wss.com/abacus-sql/.
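
The abstract names Self-debug without describing its mechanics. The sketch below illustrates one common reading of the idea, assuming it means executing a candidate query and re-prompting with the database error when execution fails. The function names, the generator signature, and the toy generator that stands in for an LLM call are all hypothetical and are not taken from the Abacus-SQL implementation.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (question, schema, previous_sql, error_message) -> SQL string
SqlGenerator = Callable[[str, str, Optional[str], Optional[str]], str]


def self_debug(question: str, schema: str, conn: sqlite3.Connection,
               generate_sql: SqlGenerator, max_rounds: int = 3) -> Tuple[str, List[tuple]]:
    """Generate SQL, execute it, and retry with the error message on failure."""
    sql: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_rounds):
        sql = generate_sql(question, schema, sql, error)
        try:
            rows = conn.execute(sql).fetchall()
            return sql, rows
        except sqlite3.Error as exc:
            # Keep the database error so the next round can react to it.
            error = str(exc)
    raise RuntimeError(f"no executable SQL after {max_rounds} rounds: {error}")


def toy_generator(question: str, schema: str,
                  previous_sql: Optional[str], error: Optional[str]) -> str:
    """Stand-in for an LLM call: the first draft misspells a column name."""
    if error is None:
        return "SELECT nme FROM singer WHERE age > 30"
    return "SELECT name FROM singer WHERE age > 30"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ana", 34), ("Bo", 25)])

schema = "singer(name TEXT, age INTEGER)"
final_sql, rows = self_debug("Which singers are older than 30?", schema, conn, toy_generator)
print(final_sql, rows)
```

In this toy run the first draft fails on the misspelled column and the second round succeeds. A real system would pass the error text to the model instead of branching on it by hand.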

Paper Structure

This paper contains 42 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: The illustration of Abacus-SQL, which consists of three steps: 1. Preprocessing: Retrieves open-domain databases and enhances cross-domain transferability with data augmentation. 2. Multi-turn Text-to-SQL: Improves the accuracy of multi-turn SQL queries using Pre-SQL and Self-debug methods. 3. Presentation: Shows the inference process, SQL queries, and real-time execution results to users.
  • Figure 2: The interface of Abacus-SQL. The sidebar provides functions such as uploading and viewing user databases and switching between sessions. The main area handles interaction with Abacus-SQL, allowing users to generate SQL queries and execute them.
  • Figure 3: The prompt used in the Multi-turn Text-to-SQL step.
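
The content of the prompt in Figure 3 is not reproduced here, so the following is only a minimal sketch of how a multi-turn text-to-SQL prompt might be assembled from the ingredients the summary mentions: an extracted schema, retrieved demonstrations, and earlier turns of the conversation. The section headings, the `build_prompt` function, and the sample data are assumptions for illustration, not the paper's actual template.

```python
from typing import List, Tuple


def build_prompt(schema: str,
                 demonstrations: List[Tuple[str, str]],
                 history: List[Tuple[str, str]],
                 question: str) -> str:
    """Assemble one prompt from schema, few-shot demonstrations, and dialogue history."""
    parts = ["### Database schema", schema, "", "### Examples"]
    for demo_question, demo_sql in demonstrations:
        parts += [f"Q: {demo_question}", f"SQL: {demo_sql}"]
    parts += ["", "### Conversation so far"]
    for past_question, past_sql in history:
        parts += [f"Q: {past_question}", f"SQL: {past_sql}"]
    # The prompt ends with an open "SQL:" slot for the model to complete.
    parts += ["", "### Current question", f"Q: {question}", "SQL:"]
    return "\n".join(parts)


prompt = build_prompt(
    schema="singer(name TEXT, age INTEGER)",
    demonstrations=[("How many singers are there?", "SELECT COUNT(*) FROM singer")],
    history=[("List all singers.", "SELECT name FROM singer")],
    question="Only those older than 30.",
)
print(prompt)
```

Including the earlier question and SQL pairs lets the model resolve follow-ups such as "Only those older than 30", which are underspecified on their own.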