Transforming Slot Schema Induction with Generative Dialogue State Inference
James D. Finch, Boxin Zhao, Jinho D. Choi
TL;DR
GenDSI tackles slot schema induction (SSI) for task-oriented dialogue by using a generative dialogue state model to produce slot-value candidates, each with a predicted slot name, from unlabeled dialogues. The pipeline encodes these candidates with SBERT and clusters them with HDBSCAN to form a unified slot schema; because every candidate carries a generated name, the induced slots can be named automatically. On MultiWOZ and SGD, GenDSI outperforms prior state-of-the-art SSI methods, reducing redundancy and improving recall and value purity while remaining robust to unseen domains. This approach enables scalable, domain-agnostic dialogue state representations with minimal manual schema engineering.
Abstract
The challenge of defining a slot schema to represent the state of a task-oriented dialogue system is addressed by Slot Schema Induction (SSI), which aims to automatically induce slots from unlabeled dialogue data. Whereas previous approaches induce slots by clustering value spans extracted directly from the dialogue text, we demonstrate the power of discovering slots using a generative approach. By training a model to generate slot names and values that summarize key dialogue information with no prior task knowledge, our SSI method discovers high-quality candidate information for representing dialogue state. These discovered slot-value candidates can be easily clustered into unified slot schemas that align well with human-authored schemas. Experimental comparisons on the MultiWOZ and SGD datasets demonstrate that Generative Dialogue State Inference (GenDSI) outperforms the previous state-of-the-art on multiple aspects of the SSI task.
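
To make the clustering step concrete, here is a minimal sketch of how generated slot-value candidates might be grouped into a schema. It is not the authors' implementation. So that it runs with the standard library alone, a bag-of-words vector stands in for the SBERT encoder and a greedy cosine-threshold grouping stands in for HDBSCAN. The function names, the 0.4 threshold, the example candidates, and the rule of naming each cluster after its most frequent generated slot name are all assumptions made for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an SBERT encoder: bag-of-words token counts."""
    return Counter(text.lower().replace(":", " ").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(candidates, threshold=0.4):
    """Toy stand-in for HDBSCAN: greedy grouping by a cosine threshold."""
    vectors = [embed(f"{name}: {value}") for name, value in candidates]
    clusters = []  # each cluster is a list of candidate indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            # Join the first cluster containing a sufficiently similar member.
            if any(cosine(vec, vectors[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def induce_schema(candidates, threshold=0.4):
    """Map each cluster to a slot named after its most common generated name."""
    schema = {}
    for members in cluster(candidates, threshold):
        names = Counter(candidates[i][0] for i in members)
        slot = names.most_common(1)[0][0]  # assumed naming heuristic
        schema.setdefault(slot, []).extend(candidates[i][1] for i in members)
    return schema

# Hypothetical (slot name, value) candidates, as a generator might emit them.
candidates = [
    ("hotel area", "north"),
    ("hotel area", "centre"),
    ("area of hotel", "east"),
    ("food type", "italian"),
    ("food type", "chinese"),
]
schema = induce_schema(candidates)
print(schema)
```

In this toy run the differently worded name "area of hotel" falls into the same cluster as "hotel area", which illustrates how clustering can merge redundant generated names into a single slot. With real sentence embeddings and a density-based clusterer, the grouping would rest on semantic similarity rather than token overlap.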
