LUCID: LLM-Generated Utterances for Complex and Interesting Dialogues

Joe Stacey; Jianpeng Cheng; John Torr; Tristan Guigue; Joris Driesen; Alexandru Coca; Mark Gaynor; Anders Johannsen

TL;DR

LUCID is a scalable, automated pipeline that generates high-quality task-oriented dialogue data through a modular sequence of LLM calls. It separates intent generation, conversation planning, turn-by-turn generation, and validation, and the resulting data covers 100 intents across 13 domains, with 501 slots and a wide range of conversational phenomena. A validation protocol and a mock back-end support labeling reliability, yielding a seed dataset of 4,277 dialogues with low labeling error rates that can be used for both in-distribution and out-of-distribution evaluation. Baseline models perform competitively on seen intents and show meaningful generalization to unseen intents, and the open-source tooling enables large-scale, automated data generation for new domains and targets in dialogue systems.
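As a rough illustration of what a modular sequence of LLM calls can look like, the sketch below chains the four stages named above. It is not LUCID's code: every function name, prompt string, and the `stub_llm` stand-in is hypothetical, and the rule that a dialogue failing validation is dropped is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Any text-in, text-out function; a real LLM client would be wrapped to fit this.
LLM = Callable[[str], str]


@dataclass
class Dialogue:
    intent: str
    plan: List[str]
    turns: List[str] = field(default_factory=list)


def generate_intent(llm: LLM, domain: str) -> str:
    # Stage 1 (hypothetical prompt): ask for one intent in the given domain.
    return llm(f"INTENT: propose one intent for the domain '{domain}'").strip()


def plan_conversation(llm: LLM, intent: str) -> List[str]:
    # Stage 2 (hypothetical prompt): ask for a ';'-separated list of steps.
    raw = llm(f"PLAN: list the steps, separated by ';', for intent '{intent}'")
    return [step.strip() for step in raw.split(";") if step.strip()]


def generate_turns(llm: LLM, intent: str, plan: List[str]) -> List[str]:
    # Stage 3: one LLM call per planned step, i.e. turn-by-turn generation.
    return [llm(f"TURN: write the turn for step '{step}' of intent '{intent}'")
            for step in plan]


def validate(llm: LLM, dialogue: Dialogue) -> bool:
    # Stage 4 (hypothetical prompt): a separate call judges the result.
    verdict = llm(f"VALIDATE: reply 'ok' or 'reject' for {dialogue.turns}")
    return verdict.strip().lower() == "ok"


def run_pipeline(llm: LLM, domain: str) -> Optional[Dialogue]:
    intent = generate_intent(llm, domain)
    plan = plan_conversation(llm, intent)
    dialogue = Dialogue(intent, plan, generate_turns(llm, intent, plan))
    # Assumption: dialogues that fail validation are dropped, not repaired.
    return dialogue if validate(llm, dialogue) else None


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for an LLM so the sketch runs offline."""
    stage = prompt.split(":", 1)[0]
    canned = {
        "INTENT": "book_flight",
        "PLAN": "ask origin; ask destination; confirm",
        "VALIDATE": "ok",
    }
    return canned.get(stage, f"[utterance] {prompt}")
```

Calling `run_pipeline(stub_llm, "travel")` returns a `Dialogue` with three turns. Keeping each stage behind its own function means a single stage can be changed or re-prompted without touching the others.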

Abstract

Spurred by recent advances in Large Language Models (LLMs), virtual assistants are poised to take a leap forward in terms of their dialogue capabilities. Yet a major bottleneck to achieving genuinely transformative task-oriented dialogue capabilities remains the scarcity of high quality data. Existing datasets, while impressive in scale, have limited domain coverage and contain few genuinely challenging conversational phenomena; those which are present are typically unlabelled, making it difficult to assess the strengths and weaknesses of models without time-consuming and costly human evaluation. Moreover, creating high quality dialogue data has until now required considerable human input, limiting both the scale of these datasets and the ability to rapidly bootstrap data for a new target domain. We aim to overcome these issues with LUCID, a modularised and highly automated LLM-driven data generation system that produces realistic, diverse and challenging dialogues. We use LUCID to generate a seed dataset of 4,277 conversations across 100 intents to demonstrate its capabilities, with a human review finding consistently high quality labels in the generated data.

Paper Structure

This paper contains 29 sections, 7 figures, and 10 tables.

Introduction
Related Work
Task Oriented Dialogue Datasets
Automated Data Collection Methods
Method
Intent Generation
Conversation Planner
Generating Slot Values
Generating Conversations
LLM-based Validation
Introducing Additional Conversational Phenomena
Annotation Scheme and our Mock Back-end
Analysis
Diversity of Slots and Intents
Conversational Phenomena
...and 14 more sections

Figures (7)

Figure 1: An extract of a LUCID conversation containing a challenging phenomenon. In this case, the second user response is most likely to be from an overheard conversation rather than providing the desired slot value.
Figure 2: The stages of the LUCID data generation process, which generates intents (stages 1-2), plans conversations (stages 3-8), generates the conversations (stages 9-12), and validates the system predictions (stages 13-14).
Figure 3: A (simplified) example labelled conversation. Each dialogue contains user, system, signal and response turns.
Figure 4: Examples for eight of the nine challenging conversational phenomena included in the LUCID dataset. We also included 'cancellation' examples which are similar to 'delay confirmation', resulting in the system not confirming a given intent.
Figure 5: An example conversation from LUCID (Example #1). As described in the examples section, we show the first three LUCID conversations to provide an unbiased sample of our generated data.
...and 2 more figures
