Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation

Cheng Niu; Xingguang Wang; Xuxin Cheng; Juntong Song; Tong Zhang

TL;DR

The paper tackles the high cost of annotated dialogue state tracking (DST) data by introducing LUAS, a framework that uses GPT-4 to simulate user–agent conversations and generate thousands of dialogues annotated with DST labels. LLaMA 2 is then fine-tuned in two stages on the synthetic and real data, and the resulting model outperforms a baseline trained only on real data on MultiWOZ 2.2 and 2.4. The gains from synthetic data are larger when real data is scarce. The framework also adapts quickly to new domains: when the dialogue segments of any one domain are replaced with generated ones, the model performs comparably to one trained on real data. The approach offers a practical, scalable way to extend task-oriented dialogue systems across domains and could extend to related dialogue tasks.

Abstract

Dialogue State Tracking (DST) monitors the evolving dialogue state over the course of a conversation and plays a pivotal role in task-oriented dialogue systems. However, obtaining annotated data for the DST task is usually costly. In this paper, we focus on employing LLMs to generate dialogue data in order to reduce dialogue collection and annotation costs. Specifically, GPT-4 is used to simulate the interaction between user and agent, generating thousands of dialogues annotated with DST labels. LLaMA 2 is then fine-tuned in two stages on the generated data and the real data for DST prediction. Experimental results on two public DST benchmarks show that, with the generated dialogue data, our model performs better than the baseline trained solely on real data. In addition, our approach can adapt to the dynamic demands of real-world scenarios by swiftly generating dialogues in new domains. After replacing the dialogue segments in any domain with the corresponding generated ones, the model achieves performance comparable to that of the model trained on real data.
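To make the notion of an evolving dialogue state concrete, here is a minimal sketch that treats the state as a set of slot-value pairs accumulated turn by turn. This is an illustration only, not the paper's LLaMA 2 model or its simulation pipeline: the function names `update_state` and `track`, the slot names, and the example values are all assumptions, and the per-turn slot values are given by hand rather than predicted.

```python
from typing import Dict, List

# A dialogue state maps slot names to values (illustrative representation).
DialogueState = Dict[str, str]


def update_state(state: DialogueState, turn_slots: DialogueState) -> DialogueState:
    """Return a new state with this turn's slot values merged over the previous state."""
    new_state = dict(state)        # copy so earlier states are not mutated
    new_state.update(turn_slots)   # later mentions overwrite earlier values
    return new_state


def track(turns: List[DialogueState]) -> List[DialogueState]:
    """Accumulate the dialogue state after each turn."""
    states: List[DialogueState] = []
    state: DialogueState = {}
    for turn_slots in turns:
        state = update_state(state, turn_slots)
        states.append(state)
    return states


# Hypothetical slot values mentioned in three consecutive user turns.
turns = [
    {"hotel-area": "centre"},
    {"hotel-stars": "4"},
    {"hotel-area": "north"},  # the user changes their mind
]
states = track(turns)
```

In an actual DST system the per-turn slot values would come from a model's prediction over the dialogue history; the sketch only shows how the tracked state carries forward and is overwritten across turns.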

Paper Structure (26 sections, 1 equation, 2 figures, 19 tables)

Introduction
Related Work
Dialogue State Tracking
Data Augmentation by LLMs
Method
Problem Definition
Using LLaMA 2 to Predict Dialogue State
User-Agent Dialogue Simulation backed by GPT-4
Simulation Process Overview
User/Agent Intentions
Simulation Details
Slot Extraction
Generation Diversity
Two-stage Fine-tuning Strategy
Experiments
...and 11 more sections

Figures (2)

Figure 1: The simulation process of our approach. The blue boxes are intentions for the user and the agent; '[RECOM]', '[EOF]', and '[EOD]' are control identifiers.
Figure 2: The error distribution between $\text{LUAS}_\text{R}$ and $\text{LUAS}_\text{R+G}$ with different sizes of real data on MultiWOZ 2.2.
