RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
DongHyun Choi, Myeong Cheol Shin, EungGyun Kim, Dong Ryeol Shin
TL;DR
RYANSQL advances Text-to-SQL in cross-domain settings by recursively constructing nested SQL via a detailed sketch for SELECT statements and a Statement Position Code (SPC). Its input encoder combines token, column, and table representations with SPC-aware embeddings, while the sketch-based decoder fills structured slots across FROM/SELECT/WHERE/HAVING clauses and supports recursive nesting. The two proposed input-manipulation techniques further boost performance, yielding state-of-the-art results on the Spider benchmark, especially when augmented with a BERT-based encoder. This work highlights the effectiveness of SPC-guided, slot-filled generation for complex, cross-domain SQL synthesis and points to improvements in slot-value consistency for future gains.
Abstract
Text-to-SQL is the problem of converting a user question into an SQL query, when the question and database are given. In this paper, we present a neural network approach called RYANSQL (Recursively Yielding Annotation Network for SQL) to solve complex Text-to-SQL tasks for cross-domain databases. Statement Position Code (SPC) is defined to transform a nested SQL query into a set of non-nested SELECT statements; a sketch-based slot filling approach is proposed to synthesize each SELECT statement for its corresponding SPC. Additionally, two input manipulation methods are presented to improve generation performance further. RYANSQL achieved 58.2% accuracy on the challenging Spider benchmark, which is a 3.2%p improvement over previous state-of-the-art approaches. At the time of writing, RYANSQL achieves the first position on the Spider leaderboard.
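To make the idea of transforming a nested query into a set of non-nested SELECT statements concrete, the following is a minimal Python sketch. It is not the RYANSQL implementation: the `Select` class, the `flatten` function, the `[SUBQUERY]` placeholder, and the encoding of a position code as a tuple of clause labels are all assumptions made for illustration. The abstract states only that each non-nested SELECT statement is associated with an SPC.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Select:
    """Toy SELECT statement (hypothetical structure, not from the paper).

    `text` holds the statement with nested queries replaced by a placeholder;
    `children` maps the position of a nested query (e.g. "WHERE") to it.
    """
    text: str
    children: Dict[str, "Select"] = field(default_factory=dict)


def flatten(stmt: Select, path: Tuple[str, ...] = ()) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (position code, non-nested statement) pairs, depth first.

    Assumption: the position code is the path of clause labels from the
    outermost statement, and the outermost statement gets ("NONE",).
    """
    code = path if path else ("NONE",)
    pairs = [(code, stmt.text)]
    for position, child in stmt.children.items():
        pairs.extend(flatten(child, path + (position,)))
    return pairs


# A query with one nested SELECT in its WHERE clause.
query = Select(
    "SELECT name FROM singer WHERE age > [SUBQUERY]",
    {"WHERE": Select("SELECT avg(age) FROM singer")},
)

for code, text in flatten(query):
    print(code, text)
```

Under these assumptions, each pair can be generated independently by a slot-filling model conditioned on its position code, and the nested query is recovered by substituting each child statement back into its parent's placeholder.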
