Exploring Description-Augmented Dataless Intent Classification
Ruoyu Hu, Foaad Khosmood, Abbas Edalat
TL;DR
This work advances dataless intent classification by introducing intent label descriptions, utterance paraphrasing, and masking-based entity handling to build robust class prototypes from embedding models. Combining declarative, description-based intent representations with inference-time paraphrasing and entity-aware masking yields significant improvements over tokenized-label baselines and strong zero-shot baselines across four task-oriented dialogue system (TODS) datasets, while reducing variance across models. The methodology is evaluated on diverse state-of-the-art (SOTA) embedding model families, illustrating both the benefits and the limitations of description-augmented prototypes, particularly on single-domain or highly imbalanced datasets such as ATIS. The results highlight the potential for scalable, data-efficient intent classification in dynamic task-oriented systems, and the accompanying qualitative analyses and ablations are intended to guide future research on description quality and entity-disambiguation strategies.
Abstract
In this work, we introduce several schemes that leverage description-augmented embedding similarity for dataless intent classification using current state-of-the-art (SOTA) text embedding models. We report results of our methods on four commonly used intent classification datasets and compare against comparable prior work. Our results are promising for dataless classification that scales to a large number of unseen intents. We show competitive results and significant improvements (+6.12% on average) over strong zero-shot baselines, all without training on labelled or task-specific data. Furthermore, we provide a qualitative error analysis of the shortcomings of this methodology to help guide future research in this area.
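
As a rough illustration of the core idea (classifying an utterance by its embedding similarity to intent descriptions, with no labelled training data), the following minimal sketch may help. It is not the authors' implementation. The `embed` function is a toy bag-of-words stand-in for a real text embedding model, and the intent labels, descriptions, and function names are hypothetical, chosen only for this example. Paraphrasing and entity masking are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical intent descriptions (assumption: not taken from the paper).
INTENT_DESCRIPTIONS = {
    "book_flight": "the user wants to book a flight ticket",
    "check_balance": "the user wants to check the balance of a bank account",
    "play_music": "the user wants to play a song or some music",
}


def embed(text):
    """Toy stand-in for a text embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(utterance, descriptions=INTENT_DESCRIPTIONS):
    """Return the intent whose description prototype is most similar
    to the utterance, together with all similarity scores."""
    utterance_vec = embed(utterance)
    # Each class prototype is the embedding of its description.
    scores = {
        label: cosine(utterance_vec, embed(description))
        for label, description in descriptions.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    label, scores = classify("please play a song by queen")
    print(label)
    for name, score in sorted(scores.items()):
        print(f"{name}: {score:.3f}")
```

In practice, `embed` would be replaced by a pretrained sentence embedding model. The rest of the sketch stays the same because only the prototype construction and the nearest-prototype decision are shown here.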
