Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

Chris Samarinas, Pracha Promthaw, Atharva Nijasure, Hansi Zeng, Julian Killingback, Hamed Zamani

TL;DR

Task-oriented dialogue (TOD) systems require large, diverse training data, and synthetic data generated from a single prompt often under-covers task variation. SynTOD addresses this by using a state transition graph to guide multi-prompt LLM generation, combined with retrieval augmentation, to produce end-to-end TOD data. The approach yields substantial improvements in intent classification, slot filling, and response relevance across cooking and e-commerce domains. The paper also analyzes the effects of model size and data efficiency, and how well LLM-based evaluation aligns with human judgments. The work provides synthetic datasets, models, and code to support rapid development of domain-specific TOD systems in low-resource settings.

Abstract

This paper explores SynTOD, a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) Systems capable of handling complex tasks such as intent classification, slot filling, conversational question-answering, and retrieval-augmented response generation, without relying on crowdsourcing or real-world data. SynTOD utilizes a state transition graph to define the desired behavior of a TOD system and generates diverse, structured conversations through random walks and response simulation using large language models (LLMs). In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations. We also investigate the end-to-end TOD effectiveness of different base and instruction-tuned LLMs, with and without the constructed synthetic conversations. Finally, we explore how various LLMs can evaluate responses in a TOD system and how well they are correlated with human judgments. Our findings pave the path towards quick development and evaluation of domain-specific TOD systems. We release our datasets, models, and code for research purposes.
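
To make the random-walk idea concrete, the sketch below samples one conversation outline from a small state transition graph. It is a minimal illustration under stated assumptions, not the authors' implementation: the state names, transition weights, and the `random_walk` helper are all hypothetical, and the paper's actual graph for the recipe domain is the one shown in its figures.

```python
import random

# Hypothetical toy graph for a recipe assistant. State names and
# transition weights are illustrative assumptions, not taken from the paper.
GRAPH = {
    "start": [("search_recipe", 1.0)],
    "search_recipe": [("select_recipe", 0.7), ("search_recipe", 0.3)],
    "select_recipe": [("show_ingredients", 0.6), ("begin_steps", 0.4)],
    "show_ingredients": [("begin_steps", 0.8), ("ask_question", 0.2)],
    "begin_steps": [("next_step", 0.7), ("ask_question", 0.3)],
    "next_step": [("next_step", 0.5), ("ask_question", 0.2), ("stop", 0.3)],
    "ask_question": [("next_step", 0.7), ("stop", 0.3)],
    "stop": [],
}


def random_walk(graph, start="start", end="stop", max_turns=20, rng=None):
    """Sample a path of states by following weighted outgoing edges."""
    rng = rng or random.Random()
    path = [start]
    state = start
    # Stop at the end state or once the path reaches max_turns states.
    while state != end and len(path) < max_turns:
        successors = graph[state]
        if not successors:  # dead end: no outgoing transitions
            break
        targets, weights = zip(*successors)
        state = rng.choices(targets, weights=weights, k=1)[0]
        path.append(state)
    return path


if __name__ == "__main__":
    # Each sampled path is an outline of one simulated conversation.
    print(" -> ".join(random_walk(GRAPH, rng=random.Random(0))))
```

In a pipeline of the kind the abstract describes, each state on a sampled path would then be passed to an LLM with a state-specific prompt to simulate the corresponding utterance. That step is omitted here because the prompts are not given in this section.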

Paper Structure (9 sections, 7 figures, 11 tables)

Figures (7)

  • Figure 1: Overview of an end-to-end retrieval-augmented TOD system. An LLM and a retriever are the main components. The conversation history is given as input, and the response, intent, slots, and documents comprise the output system state.
  • Figure 2: The state transition graph we defined for the recipe assistant domain. On the right we see transitions to nodes that are possible from any other state.
  • Figure 2: Diversity of data generated with and without a graph.
  • Figure 3: Overview of the SynTOD conversation simulation framework.
  • Figure 4: Frequency distribution of user intents for the recipe domain with and without state transition graph.
  • ...and 2 more figures