Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset

Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan

TL;DR

The paper addresses the challenge of building scalable, multi-domain virtual assistants by introducing the Schema-Guided Dialogue (SGD) dataset and a schema-guided paradigm in which service schemas are inputs to a single, unified dialogue model. The dataset is large (over 16k dialogues across 16 domains and 26 services), is generated with a dialogue simulator and a paraphrasing pipeline, and includes a zero-shot evaluation setting that tests generalization to unseen APIs. A BERT-based schema embedding approach enables predictions over dynamic sets of intents and slots, supporting zero-shot dialogue state tracking and robust handling of API changes. The work reports competitive performance on SGD and existing datasets, and argues that schema-aware, data-efficient, and easily extensible frameworks are crucial for real-world, scalable virtual assistants.

Abstract

Virtual assistants such as Google Assistant, Alexa and Siri provide a conversational interface to a large number of services and APIs spanning multiple domains. Such systems need to support an ever-increasing number of services with possibly overlapping functionality. Furthermore, some of these services have little to no training data available. Existing public datasets for task-oriented dialogue do not sufficiently capture these challenges since they cover few domains and assume a single static ontology per domain. In this work, we introduce the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains. Our dataset exceeds the existing task-oriented dialogue corpora in scale, while also highlighting the challenges associated with building large-scale virtual assistants. It provides a challenging testbed for a number of tasks including language understanding, slot filling, dialogue state tracking and response generation. Along the same lines, we present a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots, provided as input, using their natural language descriptions. This allows a single dialogue system to easily support a large number of services and facilitates simple integration of new services without requiring additional training data. Building upon the proposed paradigm, we release a model for dialogue state tracking capable of zero-shot generalization to new APIs, while remaining competitive in the regular setting.
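
To make the schema-guided idea concrete, the following is a minimal sketch, not the released model. It assumes a schema is a list of intents and slots, each carrying a natural language description, and it ranks intents by word overlap between the user utterance and each intent description. The word-overlap scorer is a deliberately simple stand-in for a learned (for example BERT-based) encoder, and all class names, service names, and descriptions below are hypothetical illustrations rather than entries from the dataset.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    name: str
    description: str  # natural language description, part of the model input


@dataclass
class Intent:
    name: str
    description: str
    required_slots: List[str] = field(default_factory=list)


@dataclass
class Schema:
    service_name: str
    intents: List[Intent]
    slots: List[Slot]


def tokens(text: str) -> set:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rank_intents(utterance: str, schema: Schema) -> List[str]:
    """Rank the schema's intents by word overlap with the utterance.

    The candidate set comes from the schema passed in, not from a fixed
    label list, so the same function works for any service.
    """
    utt = tokens(utterance)
    scored = [(len(utt & tokens(i.description)), i.name) for i in schema.intents]
    return [name for _, name in sorted(scored, key=lambda pair: -pair[0])]


# Hypothetical digital wallet schema (illustrative only).
wallet = Schema(
    service_name="Payment",
    intents=[
        Intent("CheckBalance", "Check the balance of your account"),
        Intent("TransferMoney", "Send money from your balance to another user",
               required_slots=["amount", "recipient"]),
    ],
    slots=[
        Slot("amount", "Amount of money to send"),
        Slot("recipient", "Name of the person receiving the money"),
    ],
)

# A second, previously unseen hypothetical service: no code changes needed.
weather = Schema(
    service_name="Weather",
    intents=[
        Intent("GetWeather", "Get the weather forecast for a city"),
        Intent("GetAlerts", "List severe storm alerts in a region"),
    ],
    slots=[Slot("city", "City to look up")],
)

top_wallet = rank_intents("Please send some money to my friend", wallet)[0]
top_weather = rank_intents("What is the weather forecast for tomorrow", weather)[0]
print(top_wallet, top_weather)
```

The design point is that supporting the second service only required supplying a new schema; the prediction function itself is unchanged. A real system would replace the overlap score with a trained model, which is where zero-shot generalization to new APIs would actually come from.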

Paper Structure

This paper contains 24 sections, 5 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Example schema for a digital wallet service.
  • Figure 2: The overall architecture of the dialogue simulation framework for generating dialogue outlines.
  • Figure 3: Steps for obtaining paraphrased conversations. To increase the presence of relative dates such as "tomorrow" and "next Monday", the current date is assumed to be March 1, 2019.
  • Figure 4: Detailed statistics of the SGD dataset.
  • Figure 5: The predicted dialogue state (shown with dashed edges) for the first two user turns for an example dialogue, showing the active intent and slot assignments, with two related annotation schemas. Note that the dialogue state representation is conditioned on the schema under consideration, which is provided as input, as are the user and system utterances.
  • ...and 7 more figures