New Intent Discovery with Pre-training and Contrastive Learning
Yuwei Zhang, Haode Zhang, Li-Ming Zhan, Albert Y. S. Lam, Xiao-Ming Wu
TL;DR
This paper addresses new intent discovery (NID), which aims to identify novel intents in unlabeled user utterances and thereby extend the set of predefined intents. It proposes a two-stage framework: Stage 1, multi-task pre-training (MTP), leverages external labeled datasets and internal unlabeled data to learn task-aware utterance representations, and Stage 2, contrastive learning with nearest neighbors (CLNN), uses each utterance's nearest neighbors to produce compact embeddings suitable for clustering. Empirical results on three benchmarks show that MTP significantly outperforms baselines in both unsupervised and semi-supervised NID, and that CLNN provides additional gains, achieving state-of-the-art performance while reducing reliance on domain-specific labels. The approach offers practical value for dialogue systems by enabling effective knowledge transfer with limited labeled data and robust clustering of unknown intents.
Abstract
New intent discovery aims to uncover novel intent categories from user utterances to expand the set of supported intent classes. It is a critical task for the development and service expansion of a practical dialogue system. Despite its importance, this problem remains under-explored in the literature. Existing approaches typically rely on a large amount of labeled utterances and employ pseudo-labeling methods for representation learning and clustering, which makes them label-intensive, inefficient, and inaccurate. In this paper, we provide new solutions to two important research questions for new intent discovery: (1) how to learn semantic utterance representations and (2) how to better cluster utterances. In particular, we first propose a multi-task pre-training strategy that leverages rich unlabeled data along with external labeled data for representation learning. Then, we design a new contrastive loss that exploits self-supervisory signals in unlabeled data for clustering. Extensive experiments on three intent recognition benchmarks demonstrate the effectiveness of our proposed method, which outperforms state-of-the-art methods by a large margin in both unsupervised and semi-supervised scenarios. The source code will be available at https://github.com/zhang-yu-wei/MTP-CLNN.
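To make the idea of a contrastive loss driven by nearest neighbors more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation: the text above does not specify the loss, so the function names `knn_adjacency` and `neighborhood_contrastive_loss`, the use of cosine similarity, the temperature value, and the exact loss form (log-ratio of neighbor similarities to all non-self similarities) are assumptions chosen for illustration. A practical system would compute this on encoder outputs with automatic differentiation rather than on fixed arrays.

```python
import numpy as np


def knn_adjacency(emb, k):
    """Boolean matrix where adj[i, j] is True if j is among i's top-k cosine neighbors.

    Hypothetical helper, not taken from the paper's code.
    """
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)            # a sample is never its own neighbor
    idx = np.argsort(-sim, axis=1)[:, :k]     # indices of the k most similar samples
    adj = np.zeros(sim.shape, dtype=bool)
    rows = np.arange(emb.shape[0])[:, None]
    adj[rows, idx] = True
    return adj


def neighborhood_contrastive_loss(emb, adj, temperature=0.1):
    """Assumed loss form: mean over i of
    -log( sum_{j in N(i)} exp(s_ij / t) / sum_{j != i} exp(s_ij / t) ).

    Every row of `adj` must contain at least one True entry.
    """
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    logits = (z @ z.T) / temperature
    np.fill_diagonal(logits, -np.inf)         # exclude self-similarity
    row_max = logits.max(axis=1, keepdims=True)
    # Log-sum-exp over all non-self samples (denominator).
    log_denom = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))
    # Log-sum-exp over neighbors only (numerator).
    pos_logits = np.where(adj, logits, -np.inf)
    log_num = row_max[:, 0] + np.log(np.exp(pos_logits - row_max).sum(axis=1))
    return float(np.mean(log_denom - log_num))


# Toy embeddings: two tight groups standing in for two intents.
toy_emb = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
toy_adj = knn_adjacency(toy_emb, k=1)
toy_loss = neighborhood_contrastive_loss(toy_emb, toy_adj)
```

In this toy setup the loss is close to zero because each utterance's nearest neighbor lies in the same tight group; pairing utterances across groups instead yields a much larger value, which is the signal that pulls neighbors together and yields more compact clusters.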
