Intent-driven In-context Learning for Few-shot Dialogue State Tracking
Zihao Yi, Zhe Xu, Ying Shen
TL;DR
The paper tackles few-shot dialogue state tracking (DST), where user intents are often implicit and dialogue data is noisy. It introduces IDIC-DST, which has two modules. Intent-driven Dialogue Information Augmentation extracts the current user intent with a fine-tuned T5 and uses it to augment the dialogue context. Intent-driven Examples Retrieval masks noise, rewrites the user input, and retrieves the top-$k$ in-context examples with an SBERT-based retriever. A pre-trained LLM then updates the dialogue state through a SQL-based paradigm, $SQL_t = \mathrm{PLM}(P_t)$ and $S_t = \mathrm{sql}(SQL_t,B_{t-1})$: the model generates a SQL statement from the prompt $P_t$, and applying that statement to the previous state $B_{t-1}$ yields the new state $S_t$, so the state can change dynamically from turn to turn. Experiments on MultiWOZ 2.1 and 2.4 show state-of-the-art performance in the 1% few-shot setting, and ablations confirm that both intent-driven augmentation and example retrieval contribute. These results suggest IDIC-DST can substantially reduce data requirements while maintaining high DST accuracy for real-world task-oriented dialogue (TOD) systems.
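To make the update rule $S_t = \mathrm{sql}(SQL_t, B_{t-1})$ concrete, here is a minimal sketch that applies a generated statement to the previous state. It is an illustration under assumptions, not the paper's implementation: the one-table schema `state(domain, slot, value)`, the SQLite dialect, the function name `apply_sql_update`, and the example statements are all hypothetical, and the LLM call that would produce `sql_t` is replaced by a hard-coded string.

```python
import sqlite3


def apply_sql_update(prev_state, sql_t):
    """Return S_t by running the statements in sql_t against a copy of B_{t-1}.

    prev_state maps (domain, slot) -> value. The table layout below is an
    assumption made for this sketch, not the schema used in the paper.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE state (domain TEXT, slot TEXT, value TEXT, "
            "PRIMARY KEY (domain, slot))"
        )
        # Load the previous dialogue state B_{t-1} into the table.
        conn.executemany(
            "INSERT INTO state VALUES (?, ?, ?)",
            [(d, s, v) for (d, s), v in prev_state.items()],
        )
        # Apply the generated update; it may insert, overwrite, or delete slots.
        conn.executescript(sql_t)
        rows = conn.execute("SELECT domain, slot, value FROM state").fetchall()
    finally:
        conn.close()
    return {(d, s): v for d, s, v in rows}


# Hypothetical previous state and a hand-written stand-in for the LLM output.
prev = {("hotel", "area"): "north", ("hotel", "stars"): "4"}
sql_t = """
INSERT OR REPLACE INTO state VALUES ('hotel', 'area', 'centre');
DELETE FROM state WHERE domain = 'hotel' AND slot = 'stars';
INSERT OR REPLACE INTO state VALUES ('restaurant', 'food', 'italian');
"""
new_state = apply_sql_update(prev, sql_t)
print(new_state)
```

Expressing the update as a statement rather than as a full state lets a single turn add, overwrite, or delete slots without regenerating the entire state. A real system would also need to validate generated SQL before executing it.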
Abstract
Dialogue state tracking (DST) plays an essential role in task-oriented dialogue systems. However, user input may contain implicit information, which poses significant challenges for DST. Additionally, DST data is complex: it contains a large amount of noise unrelated to the current turn, and it is expensive to construct. To address these challenges, we introduce Intent-driven In-context Learning for Few-shot DST (IDIC-DST). We propose an Intent-driven Dialogue Information Augmentation module that extracts the user's intent and uses it to augment the dialogue information, so that dialogue states can be tracked more effectively. Moreover, in the Intent-driven Examples Retrieval module, we mask noisy information in the DST data, rewrite the user's input, and retrieve similar examples. We then use a pre-trained large language model to update the dialogue state from the augmented dialogue information and the retrieved examples. Experimental results demonstrate that IDIC-DST achieves state-of-the-art performance in few-shot settings on the MultiWOZ 2.1 and MultiWOZ 2.4 datasets.
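The mask-then-retrieve step in the Intent-driven Examples Retrieval module can be sketched as follows. This is a toy illustration under stated assumptions. The masking rule (replacing known slot values with a `[value]` placeholder) is hypothetical. The bag-of-words `embed` function is a stand-in for a sentence encoder such as SBERT. The names `mask`, `embed`, `cosine`, and `top_k` are invented for this example, and input rewriting is omitted.

```python
import math
import re
from collections import Counter


def mask(text, values):
    """Replace known slot values with a placeholder (hypothetical masking rule)."""
    for v in values:
        text = re.sub(re.escape(v), "[value]", text, flags=re.IGNORECASE)
    return text


def embed(text):
    """Toy bag-of-words vector; a real retriever would use a sentence encoder."""
    return Counter(re.findall(r"[a-z\[\]]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, pool, k):
    """Return the k pool examples most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda ex: cosine(q, embed(ex)), reverse=True)[:k]


# Hypothetical pool of already-masked labeled examples.
pool = [
    "i need a hotel in the [value]",
    "book a taxi to the [value]",
    "find a restaurant serving [value] food",
]
query = mask("I need a hotel in the north", ["north"])
results = top_k(query, pool, k=2)
print(results)
```

Masking before embedding means similarity depends on the shape of the request rather than on specific slot values, which is one plausible reading of why noise is removed prior to retrieval.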
