STAR: A Schema-Guided Dialog Dataset for Transfer Learning
Johannes E. M. Mosig, Shikib Mehri, Thomas Kober
TL;DR
STAR is a schema-guided dialog dataset built for transfer learning, intended to support zero-shot generalization across tasks and domains in task-oriented dialog. It pairs explicit, graph-based task schemas with a scalable crowd-sourcing pipeline and schema-conditioned models for next-action prediction and response generation. The authors show that schema guidance can improve multi-task transfer and generation quality, but they also report difficulties with action prediction on seen tasks and a remaining gap in the zero-shot setting. STAR thus serves as both a benchmark and a methodological framework for evaluating schema-based transfer learning in conversational AI.
Abstract
We present STAR, a schema-guided task-oriented dialog dataset consisting of 127,833 utterances and knowledge base queries across 5,820 dialogs in 13 domains, designed specifically to facilitate task and domain transfer learning. We also propose a scalable crowd-sourcing paradigm for collecting arbitrarily large datasets of the same quality as STAR. In addition, we introduce novel schema-guided dialog models that use an explicit description of the task(s) to generalize from known to unknown tasks. We demonstrate the effectiveness of these models, particularly for zero-shot generalization across tasks and domains.
