Diverse and Effective Synthetic Data Generation for Adaptable Zero-Shot Dialogue State Tracking

James D. Finch, Jinho D. Choi

TL;DR

The paper tackles the domain-diversity bottleneck in zero-shot dialogue state tracking (DST) with a fully automatic data generation pipeline that uses instruction-tuned large language models to create the D0T dataset, which covers more than 1,000 domains with silver-standard dialogue state annotations. The pipeline has four stages (Scenario Derivation, Dialogue Generation, State Annotation, and Slot Description Generation) and produces a large, diverse corpus of synthetic dialogues with slot descriptions, intended to support cross-domain transfer. Empirical results on MultiWOZ show that pretraining on D0T yields substantial Joint Goal Accuracy gains (e.g., +8.6 for T5-11B and +6.7 for Llama2-13B) and that these gains stack with in-context demonstrations, approaching state-of-the-art performance with far fewer parameters. The work shows that domain diversity, obtained through automatic synthetic data generation, can significantly improve zero-shot DST and suggests a scalable path toward robust task-oriented dialogue systems. The authors also release their models, code, and data publicly.

Abstract

We demonstrate substantial performance gains in zero-shot dialogue state tracking (DST) by enhancing training data diversity through synthetic data generation. Existing DST datasets are severely limited in the number of application domains and slot types they cover due to the high costs of data collection, restricting their adaptability to new domains. This work addresses this challenge with a novel, fully automatic data generation approach that creates synthetic zero-shot DST datasets. Distinguished from previous methods, our approach can generate dialogues across a massive range of application domains, complete with silver-standard dialogue state annotations and slot descriptions. This technique is used to create the D0T dataset for training zero-shot DST models, encompassing an unprecedented 1,000+ domains. Experiments on the MultiWOZ benchmark show that training models on diverse synthetic data improves Joint Goal Accuracy by 6.7%, achieving results competitive with models 13.5 times larger than ours.
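
For readers unfamiliar with the metric, Joint Goal Accuracy is commonly computed as the fraction of dialogue turns whose predicted state matches the gold state on every slot. The following is a minimal sketch of that definition, not the authors' evaluation code; the function names, the lowercase/strip normalization, and the toy slot names are assumptions made for illustration.

```python
from typing import Dict, List

State = Dict[str, str]  # slot name -> slot value for one dialogue turn


def normalize(state: State) -> State:
    """Lowercase and strip names and values; drop slots with empty values.

    This normalization is an assumption for the sketch; real evaluation
    scripts often apply benchmark-specific value canonicalization.
    """
    return {
        slot.strip().lower(): value.strip().lower()
        for slot, value in state.items()
        if value.strip()
    }


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns whose full predicted state equals the gold state."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of turns")
    if not gold:
        return 0.0
    correct = sum(normalize(p) == normalize(g) for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy three-turn dialogue (hypothetical slot names and values).
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
pred = [
    {"hotel-area": "North"},                      # matches after normalization
    {"hotel-area": "north", "hotel-stars": "3"},  # one wrong slot fails the turn
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
jga = joint_goal_accuracy(pred, gold)  # 2 of 3 turns fully correct
```

Because a single wrong slot fails the whole turn, the metric is strict, which is why gains of several points are notable.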

Paper Structure (40 sections, 1 equation, 11 figures, 8 tables, 1 algorithm)

Figures (11)

  • Figure 1: The four-stage DST data generation pipeline.
  • Figure 2: Example turn outputs from the automatic state annotation component of the DST data generation pipeline.
  • Figure 3: An example of an input token sequence from the D0T dataset used for training. Color key: [YELLOW] dialogue context $D_{1..t}$; [PEACH] slot $s^t_i$; [GREEN] slot description $d^t_i$; [RED] value examples $e^t_i$; [BLUE] in-context demonstrations (+ICL only).
  • Figure 4: GPT-3.5 prompt for generating dialogue scenarios/domains.
  • Figure 5: GPT-3.5 prompt for generating a list of information types for each dialogue domain.
  • ...and 6 more figures
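
The Figure 3 caption lists the components of a training input: dialogue context, a slot, its description, value examples, and optional in-context demonstrations. The sketch below shows one way such a sequence could be assembled. The function name `build_dst_input`, the field order, and every separator string are hypothetical; the paper's actual template is not given in this summary.

```python
from typing import List, Optional, Sequence, Tuple


def build_dst_input(
    context_turns: Sequence[str],
    slot: str,
    description: str,
    value_examples: Sequence[str],
    demonstrations: Optional[Sequence[Tuple[str, str]]] = None,
) -> str:
    """Assemble one slot-level DST input string (hypothetical layout)."""
    sections: List[str] = []

    # Optional in-context demonstrations: (example turn, example value) pairs.
    # Placing them first is an assumption of this sketch.
    if demonstrations:
        demo_lines = [f"{turn} => {slot}: {value}" for turn, value in demonstrations]
        sections.append("\n".join(demo_lines))

    # Dialogue context D_1..t, one turn per line.
    sections.append("\n".join(context_turns))

    # Slot name, its description, and a few example values.
    query = f"{slot}: {description}"
    if value_examples:
        query += " (e.g. " + ", ".join(value_examples) + ")"
    sections.append(query)

    return "\n\n".join(sections)


example_input = build_dst_input(
    context_turns=[
        "A: When would you like to leave?",
        "B: Around 9 in the morning.",
    ],
    slot="departure time",
    description="the time the traveler wants to depart",
    value_examples=["9:00 am", "noon"],
)

example_with_icl = build_dst_input(
    context_turns=["A: When would you like to leave?", "B: Around 9 in the morning."],
    slot="departure time",
    description="the time the traveler wants to depart",
    value_examples=["9:00 am", "noon"],
    demonstrations=[("B: I need to head out by 5 pm.", "5:00 pm")],
)
```

Building one input per slot lets a model condition on the slot description directly, which is what would allow it to handle slots from domains it has not seen.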