Transforming Slot Schema Induction with Generative Dialogue State Inference

James D. Finch, Boxin Zhao, Jinho D. Choi

TL;DR

GenDSI tackles slot schema induction (SSI) for task-oriented dialogue by using a generative dialogue state model to produce slot-value candidates, each with a predicted slot name, from unlabeled dialogues. The pipeline encodes the candidates with SBERT and clusters them with HDBSCAN to form a unified slot schema, which also allows slots to be named automatically. On MultiWOZ and SGD, GenDSI outperforms prior state-of-the-art SSI methods: it reduces redundancy and improves recall and value purity while remaining robust to unseen domains. The approach supports scalable, domain-agnostic dialogue state representations with minimal manual schema engineering.

Abstract

The challenge of defining a slot schema to represent the state of a task-oriented dialogue system is addressed by Slot Schema Induction (SSI), which aims to automatically induce slots from unlabeled dialogue data. Whereas previous approaches induce slots by clustering value spans extracted directly from the dialogue text, we demonstrate the power of discovering slots using a generative approach. By training a model to generate slot names and values that summarize key dialogue information with no prior task knowledge, our SSI method discovers high-quality candidate information for representing dialogue state. These discovered slot-value candidates can be easily clustered into unified slot schemas that align well with human-authored schemas. Experimental comparisons on the MultiWOZ and SGD datasets demonstrate that Generative Dialogue State Inference (GenDSI) outperforms the previous state-of-the-art on multiple aspects of the SSI task.
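
To make the clustering step concrete, the following is a minimal, dependency-free sketch of how generated slot-value candidates might be grouped into a schema. It is not the authors' implementation: a bag-of-words vector stands in for the SBERT encoder, and single-link grouping over a cosine threshold stands in for HDBSCAN (which, unlike this toy, can also mark points as noise). The function names, the `threshold` parameter, the example candidates, and the majority-vote naming of each cluster are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy bag-of-words vector (an assumed stand-in for an SBERT embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def induce_schema(candidates, threshold=0.5):
    """Group (slot name, value) candidates and name each group.

    Single-link grouping with a similarity threshold is a simplification
    used here in place of HDBSCAN; the threshold value is hypothetical.
    """
    vectors = [embed(f"{name} {value}") for name, value in candidates]
    parent = list(range(len(candidates)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of candidates whose vectors are similar enough.
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for i, candidate in enumerate(candidates):
        clusters[find(i)].append(candidate)

    # Name each cluster by its most frequent generated slot name (assumption).
    schema = []
    for members in clusters.values():
        name = Counter(n for n, _ in members).most_common(1)[0][0]
        schema.append((name, [v for _, v in members]))
    return schema


# Hypothetical candidates, as a generative state model might produce them.
candidates = [
    ("hotel area", "north"),
    ("hotel area", "city centre"),
    ("area of hotel", "east"),
    ("departure time", "9:15"),
    ("departure time", "17:00"),
    ("time of departure", "noon"),
]
schema = induce_schema(candidates)
```

Here the differently worded names "hotel area" and "area of hotel" end up in one slot, which illustrates how clustering can reduce redundancy among generated slot names. A real pipeline would replace `embed` with a sentence encoder and the pairwise merge with a density-based clusterer.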

Paper Structure (30 sections, 10 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overview of the GenDSI approach.
  • Figure 2: Annotation interface with instructions for human evaluation of Completeness of predicted state updates.
  • Figure 3: Annotation interface with instructions for human evaluation of Correctness of predicted slot-value pairs.