SynthDST: Synthetic Data is All You Need for Few-Shot Dialog State Tracking

Atharva Kulkarni, Bo-Hsiang Tseng, Joel Ruben Antony Moniz, Dhivya Piraviperumal, Hong Yu, Shruti Bhargava

TL;DR

Dialog State Tracking (DST) often relies on costly labeled data. SynthDST generates synthetic DST dialogues directly from a dialogue schema, using an abstract dialogue model, domain-agnostic templates, and LLM-based paraphrasing to produce annotated conversations. When the synthetic data is used for few-shot prompting, it yields gains of four to five percentage points in Joint Goal Accuracy over zero-shot prompting on MultiWOZ 2.1 and 2.4, and recovers about 98% and 95%, respectively, of the performance obtained with human-annotated training data. The approach enables effective retrieval-based few-shot in-context learning (ICL) at substantially lower annotation cost (under $40 for the full dataset) and supports rapid domain adaptation without extensive human labeling.
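To make the template step concrete, here is a toy sketch of how annotated turns could be rendered from a schema. It is not the authors' implementation: the schema contents, the template strings, and the names `SCHEMA`, `TEMPLATES`, `render_turn`, and `generate_dialog` are all hypothetical, and the abstract dialogue model and the LLM paraphrasing step are left out entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical schema (assumption): domain -> slot -> allowed values.
SCHEMA: Dict[str, Dict[str, List[str]]] = {
    "hotel": {"area": ["north", "centre"], "stars": ["3", "4"]},
}

# Hypothetical domain-agnostic templates (assumption), keyed by dialog act.
TEMPLATES: Dict[str, str] = {
    "request": "What {slot} would you like for the {domain}?",
    "inform": "I am looking for a {domain} with {slot} {value}.",
}


def render_turn(
    domain: str, slot: str, value: str, state: Dict[str, str]
) -> Tuple[str, str, Dict[str, str]]:
    """Render one system/user exchange and return the updated dialog state."""
    system = TEMPLATES["request"].format(slot=slot, domain=domain)
    user = TEMPLATES["inform"].format(domain=domain, slot=slot, value=value)
    # Copy the state so each turn keeps its own accumulated annotation.
    new_state = dict(state)
    new_state[f"{domain}-{slot}"] = value
    return system, user, new_state


def generate_dialog(domain: str, choices: List[Tuple[str, str]]) -> List[dict]:
    """Build a templated dialog with a DST annotation attached to every turn."""
    state: Dict[str, str] = {}
    turns = []
    for slot, value in choices:
        if value not in SCHEMA[domain][slot]:
            raise ValueError(f"{value!r} is not a valid value for {domain}-{slot}")
        system, user, state = render_turn(domain, slot, value, state)
        turns.append({"system": system, "user": user, "state": state})
    return turns


dialog = generate_dialog("hotel", [("area", "north"), ("stars", "4")])
```

Because the dialog state is built alongside the text, every turn is annotated by construction. In a pipeline like the one described above, a paraphrasing step would then rewrite the stiff templated utterances while leaving the annotations untouched.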

Abstract

In-context learning with Large Language Models (LLMs) has emerged as a promising avenue of research in Dialog State Tracking (DST). However, the best-performing in-context learning methods involve retrieving and adding similar examples to the prompt, requiring access to labeled training data. Procuring such training data for a wide range of domains and applications is time-consuming, expensive, and, at times, infeasible. While zero-shot learning requires no training data, it significantly lags behind the few-shot setup. Thus, we ask: can we efficiently generate synthetic data for any dialogue schema to enable few-shot prompting? Addressing this question, we propose SynthDST, a data generation framework tailored for DST, utilizing LLMs. Our approach only requires the dialogue schema and a few hand-crafted dialogue templates to synthesize natural, coherent, and free-flowing dialogues with DST annotations. Few-shot learning using data from SynthDST results in a 4-5% improvement in Joint Goal Accuracy over the zero-shot baseline on MultiWOZ 2.1 and 2.4. Remarkably, our few-shot learning approach recovers nearly 98% of the performance compared to the few-shot setup using human-annotated training data. Our synthetic data and code can be accessed at https://github.com/apple/ml-synthdst
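For readers unfamiliar with the metric, Joint Goal Accuracy is commonly computed as the fraction of turns whose predicted dialog state matches the gold state exactly, across all slots. The sketch below follows that standard definition. The function name and the dict-based state format are assumptions for illustration and are not taken from the paper's code.

```python
from typing import Dict, List

DialogState = Dict[str, str]  # assumed format: "domain-slot" -> value


def joint_goal_accuracy(
    predictions: List[DialogState], references: List[DialogState]
) -> float:
    """Fraction of turns where the predicted state equals the gold state exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # A turn counts as correct only if every slot-value pair matches.
    correct = sum(1 for pred, gold in zip(predictions, references) if pred == gold)
    return correct / len(references)


gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
pred = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},  # one wrong slot fails the turn
]
score = joint_goal_accuracy(pred, gold)
```

Here `score` is 0.5: the second turn gets one of its two slots right, but the metric gives no partial credit within a turn.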

Paper Structure

This paper contains 37 sections, 4 figures, and 8 tables.

Figures (4)

  • Figure 1: One of these is a dialog generated by SynthDST. Each dialog contains the conversation history (as accumulated dialog states), the system turn, the user turn, and the current turn's dialog states. Can you guess which dialog is synthetically generated by SynthDST?
  • Figure 2: Overall pipeline of SynthDST for synthetic dialog generation.
  • Figure 3: Box plot of human evaluation scores.
  • Figure 4: Domain distribution for MultiWOZ test data.