
TOAD: Task-Oriented Automatic Dialogs with Diverse Response Styles

Yinhong Liu, Yimai Fang, David Vandyke, Nigel Collier

TL;DR

TOAD addresses data scarcity in Task-Oriented Dialog (TOD) by introducing an automatic, LLM-driven data generation pipeline that simulates realistic app-context interactions and offers diverse system response styles. The pipeline has three stages: persona-grounded context generation, plot generation, and dialog utterance realization. Combined with a pseudocode Meaning Representation and style-controlled utterance realization, it enables scalable generation of multi-style TOD data. The study analyzes two response dimensions, verbosity and mirroring, and presents benchmarks showing that verbose or non-mirroring styles are more challenging to model, while access to dialog history improves performance. Overall, TOAD provides a scalable dataset and generation framework that supports adaptive, natural TOD systems and offers actionable insights into how response styles affect model performance and user experience.

Abstract

In light of recent advances in large language models (LLMs), the expectations for the next generation of virtual assistants include enhanced naturalness and adaptability across diverse usage scenarios. However, the creation of high-quality annotated data for Task-Oriented Dialog (TOD) is recognized to be slow and costly. To address these challenges, we introduce Task-Oriented Automatic Dialogs (TOAD), a novel and scalable TOD dataset along with its automatic generation pipeline. The TOAD dataset simulates realistic app context interactions and provides a variety of system response style options. Two aspects of system response style are considered: verbosity level and mirroring of the user's expressions. We benchmark TOAD on two response generation tasks, and the results show that modeling more verbose responses or responses without user expression mirroring is more challenging.
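
To make the three-stage pipeline from the TL;DR concrete, the sketch below wires the stages together in Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, prompt wording, action strings such as `ask(slot)`, the style keys, and the toy slot-by-slot plot builder are all hypothetical, and the `llm` argument stands in for any text-generation callable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: an "LLM" is any function from prompt text to generated text,
# and a style is a (verbosity, mirroring) pair. Neither comes from the paper's code.
LLM = Callable[[str], str]
Style = Tuple[str, bool]


@dataclass
class Turn:
    speaker: str                 # "user" or "system"
    action: str                  # pseudocode-like meaning representation (assumed format)
    responses: Dict[str, str] = field(default_factory=dict)  # style key -> utterance


def generate_context(llm: LLM, persona: str) -> str:
    """Stage 1: persona-grounded context generation."""
    return llm(f"Persona: {persona}\nDescribe this user's device and app context.")


def generate_plot(intent: str, slots: List[str]) -> List[Tuple[str, str]]:
    """Stage 2: a toy single-intent plot that fills one slot per exchange.

    This is an assumed simplification of slot-filling, not the paper's algorithm.
    """
    plot = [("user", f"request({intent})")]
    for slot in slots:
        plot.append(("system", f"ask({slot})"))
        plot.append(("user", f"inform({slot})"))
    plot.append(("system", f"confirm({intent})"))
    return plot


def realize(llm: LLM, context: str, plot: List[Tuple[str, str]],
            styles: List[Style]) -> List[Turn]:
    """Stage 3: turn each action into text; system turns get one variant per style."""
    turns = []
    for speaker, action in plot:
        turn = Turn(speaker, action)
        if speaker == "user":
            turn.responses["default"] = llm(f"{context}\nUser utterance for {action}:")
        else:
            for verbosity, mirroring in styles:
                key = f"{verbosity}|{'mirror' if mirroring else 'no-mirror'}"
                turn.responses[key] = llm(
                    f"{context}\nSystem reply for {action}, "
                    f"verbosity={verbosity}, mirroring={mirroring}:"
                )
        turns.append(turn)
    return turns


def run_pipeline(llm: LLM, persona: str, intent: str, slots: List[str],
                 styles: List[Style]) -> List[Turn]:
    context = generate_context(llm, persona)
    plot = generate_plot(intent, slots)
    return realize(llm, context, plot, styles)


def stub_llm(prompt: str) -> str:
    # Placeholder generator so the sketch runs without any model access.
    return f"<generated from {len(prompt)} chars>"


dialog = run_pipeline(
    stub_llm,
    persona="commuter who books rides",
    intent="book_ride",
    slots=["destination", "time"],
    styles=[("low", False), ("high", True)],
)
```

Keeping the plot separate from realization is what lets a single action sequence yield several system responses that differ only in style, which is the property the dataset relies on for its style options.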

Paper Structure

This paper contains 39 sections, 4 figures, 8 tables.

Figures (4)

  • Figure 1: A dialog example from TOAD, with all system response styles. LV, MV, HV and M stand for Low, Mid, High Verbosity and Mirroring. The underscored responses are selected as the default styles.
  • Figure 2: Overview of the TOAD Automatic Generation Pipeline in 3 Steps: (i) Persona-grounded user device context generation, (ii) Action plot generation, and (iii) Dialog utterance realization.
  • Figure 3: Plot construction for single intent dialog based on slot-filling strategy.
  • Figure 4: (i) Distribution of the dialog lengths. (ii) Distribution of the word count in dialog utterance. (iii) Service Coverage Distribution in Dialogs. (iv) Distribution of plot actions (Actions with $>3\%$ inclusion). Note: All distributions are flattened, e.g. each service or action in multi-intent dialogs individually