LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations
Ruisheng Cao, Lu Chen, Zhi Chen, Yanbin Zhao, Su Zhu, Kai Yu
TL;DR
This paper tackles the heterogeneous graph encoding challenge in text-to-SQL by introducing LGESQL, which uses a line graph to model edge topology explicitly and treats local and non-local relations differently during graph iteration. It employs a dual RGAT framework that operates on both the original node-centric graph and its line graph: local edge features are taken dynamically from the line graph, while non-local edge features come from a static embedding matrix. Two schemes, Mixed Static and Dynamic Embeddings (MSDE) and Multi-head Multi-view Concatenation (MMC), fuse the two views. An auxiliary graph pruning task, trained jointly in a multitask objective, teaches the encoder to identify the schema items relevant to the question and thereby improves its discriminative capability. On the Spider benchmark, LGESQL achieved state-of-the-art results at the time of writing (62.8% with GloVe, 72.0% with Electra), illustrating the practical impact of edge-centric relational modeling for cross-domain text-to-SQL.
Abstract
This work aims to tackle the challenging heterogeneous graph encoding problem in the text-to-SQL task. Previous methods are typically node-centric and merely utilize different weight matrices to parameterize edge types, which 1) ignore the rich semantics embedded in the topological structure of edges, and 2) fail to distinguish local and non-local relations for each node. To this end, we propose a Line Graph Enhanced Text-to-SQL (LGESQL) model to mine the underlying relational features without constructing meta-paths. By virtue of the line graph, messages propagate more efficiently through not only connections between nodes, but also the topology of directed edges. Furthermore, local and non-local relations are integrated in distinct ways during the graph iteration. We also design an auxiliary task called graph pruning to improve the discriminative capability of the encoder. Our framework achieves state-of-the-art results (62.8% with GloVe, 72.0% with Electra) on the cross-domain text-to-SQL benchmark Spider at the time of writing.
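To make the line graph idea concrete, the following is a minimal sketch of the textbook construction for a directed graph: every edge of the original graph becomes a node, and edge i is linked to edge j when the target of i is the source of j. It is an illustration under stated assumptions, not the authors' implementation. The function name `build_line_graph`, the node labels, and the relation names are all hypothetical, and any extra filtering the model may apply to the line graph is not shown.

```python
from collections import defaultdict


def build_line_graph(edges):
    """Build the line graph of a directed graph (textbook definition).

    edges: list of (source, target, relation) triples from the original graph.
    Returns a list of (i, j) pairs of edge indices, meaning edge i feeds into
    edge j because the target node of edge i is the source node of edge j.
    """
    # Index the original edges by their source node.
    out_by_source = defaultdict(list)
    for idx, (src, _tgt, _rel) in enumerate(edges):
        out_by_source[src].append(idx)

    line_edges = []
    for i, (_src, tgt, _rel) in enumerate(edges):
        for j in out_by_source[tgt]:
            if i != j:  # skip the trivial link a self-loop would make to itself
                line_edges.append((i, j))
    return line_edges


# Hypothetical toy graph: one question word and two schema items plus a table.
edges = [
    ("q:name", "col:singer.name", "question-column-match"),
    ("col:singer.name", "tab:singer", "column-belongs-to-table"),
    ("tab:singer", "col:singer.age", "table-has-column"),
]

line_edges = build_line_graph(edges)
for i, j in line_edges:
    print(edges[i][2], "->", edges[j][2])
```

In this toy graph the three relations form a chain, so the line graph links each relation to the next one. Running message passing over these links lets edge features depend on neighbouring edges, which is the sense in which the topology of directed edges carries information.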
