Building a Conversational Agent Overnight with Dialogue Self-Play

Pararth Shah, Dilek Hakkani-Tür, Gokhan Tür, Abhinav Rastogi, Ankur Bapna, Neha Nayak, Larry Heck

TL;DR

Problem: rapidly bootstrapping end-to-end goal-oriented dialogue agents for new tasks, which requires high-quality training data. Approach: M2M combines automated dialogue self-play with crowdsourced paraphrasing to generate diverse, well-annotated datasets from a task schema and an API client; it supports dataset expansion and both supervised and RL-based model training. Key contributions: a 3,000-dialogue corpus across two domains; a scalable, domain-agnostic pipeline; and empirical evidence that M2M yields greater diversity and competitive quality compared with popular datasets. Significance: dialogue systems can be bootstrapped quickly and cost-effectively, deployed earlier, and improved continuously from real user interaction.

Abstract

We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz approach for data collection, M2M achieves greater diversity and coverage of salient dialogue flows while maintaining the naturalness of individual utterances. In the first phase, a simulated user bot and a domain-agnostic system bot converse to exhaustively generate dialogue "outlines", i.e. sequences of template utterances and their semantic parses. In the second phase, crowd workers provide contextual rewrites of the dialogues to make the utterances more natural while preserving their meaning. The entire process can finish within a few hours. We propose a new corpus of 3,000 dialogues spanning 2 domains collected with M2M, and present comparisons with popular dialogue datasets on the quality and diversity of the surface forms and dialogue flows.
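The first phase described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the schema, the dialogue-act names (`inform`, `request`, `notify_success`), the `Turn` record, and the fixed request-then-answer policy are all assumptions chosen for illustration. It only shows the general idea of two scripted bots producing an outline, i.e. a sequence of template utterances paired with their semantic parses.

```python
from dataclasses import dataclass

# Hypothetical task schema: the slots a restaurant-booking API would need.
SCHEMA = ["restaurant", "date", "time", "num_people"]


@dataclass
class Turn:
    speaker: str   # "user" or "system"
    act: str       # dialogue act (names are illustrative assumptions)
    slots: dict    # semantic parse: slot -> value, None when only requested
    template: str  # template utterance rendered from the act and slots


def render(act, slots):
    """Render a dialogue act as a stilted template utterance."""
    args = ", ".join(k if v is None else f"{k}={v}" for k, v in slots.items())
    return f"{act.upper()}({args})"


def make_turn(speaker, act, slots):
    return Turn(speaker, act, dict(slots), render(act, slots))


def self_play(goal, first_slots):
    """Generate one outline for a user goal.

    The simulated user opens by informing `first_slots`; the system bot then
    requests every slot that is still missing, and the user answers from the
    goal. Real simulators sample far more varied behaviour than this.
    """
    known = {slot: goal[slot] for slot in first_slots}
    outline = [make_turn("user", "inform", known)]
    for slot in SCHEMA:
        if slot in known:
            continue
        outline.append(make_turn("system", "request", {slot: None}))
        outline.append(make_turn("user", "inform", {slot: goal[slot]}))
        known[slot] = goal[slot]
    outline.append(make_turn("system", "notify_success", known))
    return outline


goal = {"restaurant": "Cascal", "date": "Friday", "time": "6 pm", "num_people": 4}
for turn in self_play(goal, first_slots=["restaurant"]):
    print(f"{turn.speaker:>6}: {turn.template}")
```

Varying `first_slots` and the goal values yields different outlines from the same schema. Under these assumptions, the crowd workers in the second phase would then rewrite each template utterance in natural language.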

Paper Structure

This paper contains 15 sections, 3 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Our proposed M2M framework: (1) the dialogue developer provides a task schema and an API client, (2) automated bots generate dialogue outlines, (3) crowd workers rewrite the utterances and validate slot spans, (4) a dialogue model is trained with supervised learning on the dataset. The whole process can complete in under 8 hours.
  • Figure 2: Example of generating an outline and its paraphrase. See text for details.
  • Figure 3: Contextual rewrite task interface for paraphrasing a dialogue outline with natural language.
  • Figure 4: Dialogue quality evaluation task interface for rating the user and system turns of completed dialogues.
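Step (3) in Figure 1 has crowd workers validate slot spans in the rewritten utterances. As a rough illustration only, the sketch below shows one way a rewrite could be checked automatically before or alongside human validation: it assumes each slot value from the outline must appear verbatim (case-insensitively) in the paraphrase. The function name `find_slot_spans` and the exact-match rule are assumptions, not part of the paper.

```python
def find_slot_spans(utterance, slots):
    """Return {slot: (start, end)} character spans of slot values in a rewrite.

    Assumes values are copied verbatim into the paraphrase (case-insensitive).
    Raises ValueError when a value is missing, which would flag the rewrite
    for another look. Only the first occurrence of each value is reported.
    """
    lowered = utterance.lower()
    spans = {}
    for slot, value in slots.items():
        text = str(value)
        start = lowered.find(text.lower())
        if start < 0:
            raise ValueError(f"slot {slot!r} value {text!r} not found in rewrite")
        spans[slot] = (start, start + len(text))
    return spans


rewrite = "Book a table at Cascal for 6 pm"
spans = find_slot_spans(rewrite, {"restaurant": "Cascal", "time": "6 pm"})
for slot, (start, end) in spans.items():
    print(slot, repr(rewrite[start:end]))
```

An exact-match check like this would reject legitimate paraphrases of a value (for example "six in the evening" for "6 pm"), which is one reason human validation of spans remains part of the pipeline.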