Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay

Ruiheng Liu, Jinyu Zhang, Yanqi Song, Yu Zhang, Bailong Yang

TL;DR

This work tackles continual semantic parsing without data replay by introducing LECSP, which first analyzes SQL-syntax gaps across tasks to guide LLMs in reconstructing memory using skeleton-based features. Memory-calibrated generation of pseudo-samples, coupled with a task-aware dual-teacher distillation framework (LLMs as Teacher 1 and the previous student as Teacher 2), enables efficient knowledge accumulation in smaller models. Extensive experiments on Spider-stream-semi and Combined-stream show LECSP achieving state-of-the-art results, outperforming replay-based and ideal-setup baselines and even exceeding forward transfer upper bounds in some cases. The approach is robust to cold-start conditions, varying task orders, and SQL-syntax variance, highlighting practical relevance for real-world CSP systems with privacy and resource constraints.

Abstract

Continual Semantic Parsing (CSP) aims to train parsers to convert natural language questions into SQL across tasks with limited annotated examples, adapting to the real-world scenario of dynamically updated databases. Previous studies mitigate this challenge by replaying historical data or employing parameter-efficient tuning (PET), but they often violate data privacy or rely on ideal continual learning settings. To address these problems, we propose a new Large Language Model (LLM)-Enhanced Continuous Semantic Parsing method, named LECSP, which alleviates forgetting while encouraging generalization, without requiring real data replay or ideal settings. Specifically, it first analyzes the commonalities and differences between tasks from the SQL syntax perspective to guide LLMs in reconstructing key memories and improving memory accuracy through a calibration strategy. Then, it uses a task-aware dual-teacher distillation framework to promote the accumulation and transfer of knowledge during sequential training. Experimental results on two CSP benchmarks show that our method significantly outperforms existing methods, even those utilizing data replay or ideal settings. Additionally, we achieve generalization performance beyond the upper limits, better adapting to unseen tasks.

Paper Structure

This paper contains 47 sections, 7 equations, 14 figures, 7 tables, 2 algorithms.

Figures (14)

  • Figure 1: (a) Description of CSP. (b) The average number of single SQL keywords at cluster centers of different tasks (Section: Inter-Task Memory Completion) in the Spider-stream-semi dataset (doi:10.1609/aaai.v37i11.26492). (c) Comparison between our method and others, where "additional data required" refers to extra historical data or unsupervised data.
  • Figure 2: Memory reconstruction on task $t$ via LLMs. First, domain information in the question-SQL pairs is removed by uniformly replacing entity-linking results with symbols such as $\texttt{[COL]}$ and $\texttt{[VAL]}$. Next, $K$-means clustering is used to obtain the set of SQL skeletons $\mathcal{A}^{(t)}$ for task $t$, and the component bias $\Delta\mathcal{A}^{(t)}$ is derived by taking the difference with the saved sets from previous tasks $\mathcal{A}^{(1)},...,\mathcal{A}^{(t-1)}$. Note that the set $\mathcal{A}$ is only related to SQL syntax and does not involve any historical data. Finally, $\mathcal{A}^{(t)}$ and $\Delta\mathcal{A}^{(t)}$ are used to guide intra-task and inter-task memory reconstruction, respectively reinforcing commonalities and filling differences between tasks.
  • Figure 3: The task-aware dual-teacher distillation learning framework of LECSP.
  • Figure 4: Results (EX) till the seen tasks based on Spider-stream-semi dataset (T5-large).
  • Figure 5: (a) Accuracy of synthetic data execution across different stages. (b) Impact of pseudo sample quantity on different metrics (T5-base).
  • ...and 9 more figures
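
The masking and set-difference steps described in the Figure 2 caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`sql_skeleton`, `skeleton_set`, `component_bias`), the small keyword list, the regex tokenizer, and the extra `[TAB]` placeholder for table names are all assumptions made for illustration. The $K$-means clustering step is omitted, so the sketch treats the set of distinct skeletons in a task as a stand-in for $\mathcal{A}^{(t)}$.

```python
from __future__ import annotations

import re

# Illustrative keyword list (an assumption); a real system would use a SQL grammar.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "join", "on", "as", "and", "or", "not", "in", "like", "between",
    "distinct", "count", "sum", "avg", "min", "max", "union", "intersect",
    "except", "asc", "desc",
}

# Quoted strings, identifiers, numbers, two-character operators, then any other symbol.
TOKEN_RE = re.compile(
    r"'[^']*'|\"[^\"]*\"|[A-Za-z_][A-Za-z_0-9.]*|\d+(?:\.\d+)?|<=|>=|!=|<>|[^\sA-Za-z_0-9]"
)


def sql_skeleton(sql: str) -> str:
    """Mask domain-specific tokens so that only the SQL syntax remains."""
    out: list[str] = []
    for tok in TOKEN_RE.findall(sql):
        if tok[0] in "'\"" or tok[0].isdigit():
            out.append("[VAL]")          # literal value
        elif tok[0].isalpha() or tok[0] == "_":
            if tok.lower() in SQL_KEYWORDS:
                out.append(tok.upper())  # keep syntax keywords
            elif out and out[-1] in ("FROM", "JOIN"):
                out.append("[TAB]")      # hypothetical table placeholder
            else:
                out.append("[COL]")      # any other identifier
        else:
            out.append(tok)              # operators and punctuation
    return " ".join(out)


def skeleton_set(queries: list[str]) -> set[str]:
    """Stand-in for A^(t): the distinct skeletons of one task (no clustering)."""
    return {sql_skeleton(q) for q in queries}


def component_bias(current: set[str], previous: list[set[str]]) -> set[str]:
    """Stand-in for Delta A^(t): skeletons of the current task unseen in earlier tasks."""
    seen: set[str] = set().union(*previous)
    return current - seen


# Toy queries (invented for illustration, not taken from any benchmark).
task1 = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT title FROM song WHERE year > 2000",
]
task2 = [
    "SELECT city FROM airport WHERE elevation > 100",
    "SELECT country FROM airport GROUP BY country",
]

a1 = skeleton_set(task1)
a2 = skeleton_set(task2)
delta2 = component_bias(a2, [a1])
print(sorted(delta2))
```

Only the skeleton sets are carried between tasks in this sketch, which mirrors the caption's point that $\mathcal{A}$ depends on SQL syntax alone and stores no historical questions or database values.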