New Intent Discovery with Attracting and Dispersing Prototype

Shun Zhang; Jian Yang; Jiaqi Bai; Chaoran Yan; Tongliang Li; Zhao Yan; Zhoujun Li

TL;DR

New Intent Discovery (NID) seeks to recognize known intents while inferring novel categories from limited labeled and abundant unlabeled data. The paper introduces Robust and Adaptive Prototypical learning (RAP), which combines Robust Prototypical Attracting Learning (RPAL) to tighten within-cluster compactness and Adaptive Prototypical Dispersing Learning (APDL) to expand between-cluster separation, all under a multitask objective with dynamic prototypes. Using a BERT-based encoder, k-means prototype generation, an interpolation training strategy, and an EMA-based prototype update, RAP achieves state-of-the-art results on CLINC, BANKING, and StackOverflow, with an average improvement of about 5.5% over prior methods and competitive performance against large language models under limited supervision. The work demonstrates that explicitly balancing intra- and inter-cluster distances yields robust, cluster-friendly representations suitable for semi-supervised NID and open-world scenarios, with practical implications for scalable intent discovery in real-world dialogue systems.
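
The EMA-based prototype update mentioned above can be illustrated with a minimal sketch. The summary does not give the paper's exact update rule, so everything here is an assumption: the function names `l2_normalize` and `ema_update`, the `momentum` value, the use of the batch mean as the update target, and the re-normalization onto the unit sphere are all hypothetical choices for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def ema_update(prototype, batch_embeddings, momentum=0.9):
    """Move a prototype toward the mean of the embeddings assigned to it.

    Assumed rule (not taken from the paper):
        new = normalize(momentum * old + (1 - momentum) * batch_mean)
    """
    if not batch_embeddings:
        # No instance was assigned to this prototype in the batch: keep it.
        return list(prototype)
    dim = len(prototype)
    count = len(batch_embeddings)
    mean = [sum(e[i] for e in batch_embeddings) / count for i in range(dim)]
    mixed = [momentum * p + (1.0 - momentum) * m for p, m in zip(prototype, mean)]
    return l2_normalize(mixed)


# Usage: a prototype on the x-axis drifts toward instances on the y-axis.
updated = ema_update([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]], momentum=0.9)
```

A momentum close to 1 keeps prototypes stable across batches, while a smaller value lets them follow the encoder more quickly; which setting the authors use is not stated in this summary.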

Abstract

New Intent Discovery (NID) aims to recognize known intent categories and infer new ones with the help of limited labeled and large-scale unlabeled data. The task is addressed as a feature-clustering problem, and recent studies augment instance representation. However, existing methods fail to capture cluster-friendly representations because they cannot effectively control and coordinate within-cluster and between-cluster distances. Tailored to the NID problem, we propose a Robust and Adaptive Prototypical learning (RAP) framework that yields globally distinct decision boundaries for both known and new intent categories. Specifically, a robust prototypical attracting learning (RPAL) method is designed to compel instances to gravitate toward their corresponding prototype, achieving greater within-cluster compactness. To attain larger between-cluster separation, an adaptive prototypical dispersing learning (APDL) method is devised to maximize the between-cluster distance from the prototype-to-prototype perspective. Experiments on three challenging benchmarks (CLINC, BANKING, and StackOverflow) show that RAP, with its more cluster-friendly representations, substantially outperforms current state-of-the-art methods, including large language models, with an average improvement of 5.5%.
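
The attracting and dispersing objectives described in the abstract can be sketched as follows. This is not the paper's formulation: the abstract gives no equations, so the code uses a generic prototypical contrastive form as a stand-in. The function names, the cosine similarity, and the `temperature` value are assumptions, and the robustness mechanism that RPAL uses against noisy pseudo-labels is not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def attracting_loss(embedding, prototypes, target, temperature=0.1):
    """Cross-entropy over instance-to-prototype similarities.

    Minimizing it pulls the instance toward prototypes[target]
    (assumed stand-in for the attracting term).
    """
    logits = [cosine(embedding, p) / temperature for p in prototypes]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


def dispersing_loss(prototypes, temperature=0.1):
    """Average log-mean-exp similarity between distinct prototypes.

    Minimizing it pushes prototypes apart (assumed stand-in for the
    dispersing term). Requires at least two prototypes.
    """
    k = len(prototypes)
    total = 0.0
    for i in range(k):
        sims = [cosine(prototypes[i], prototypes[j]) / temperature
                for j in range(k) if j != i]
        peak = max(sims)
        total += peak + math.log(sum(math.exp(s - peak) for s in sims) / len(sims))
    return total / k


# Usage: the loss is lower when the instance is assigned to its nearest prototype.
protos = [[1.0, 0.0], [0.0, 1.0]]
near = attracting_loss([1.0, 0.0], protos, target=0)
far = attracting_loss([1.0, 0.0], protos, target=1)
```

In a multitask setup such as the one the abstract describes, terms like these would be summed with a cross-entropy loss on labeled data; the weighting between them is not given in this summary.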

Paper Structure (29 sections, 14 equations, 6 figures, 6 tables)

Introduction
Related Work
New Intent Discovery
Prototypical Learning
Approach
Problem Definition
Intent Representation Learning
Categorical Prototypes Generation
Robust Prototypical Attracting
Adaptive Prototypical Dispersing
Dynamic Prototypes Update
Multitask Learning
Experiments
Datasets
Baselines
...and 14 more sections

Figures (6)

Figure 1: Embedding distribution of intent instances and the prototype of each class in a shared sphere semantic space. The circle and star shape denote the instance and the prototype, respectively. The discriminative representations fail to be extracted due to insufficient (a) within-cluster compactness and (b) between-cluster separation.
Figure 2: Overview of RAP. Our method is jointly optimized by $L_{r}$, $L_{a}$, and $L_{ce}$. $L_{r}$ mitigates the effects of noisy pseudo-labels while minimizing the instance-to-prototype distance, and $L_{a}$ maximizes the prototype-to-prototype distance. $L_{ce}$ is a cross-entropy loss that prevents knowledge forgetting.
Figure 3: t-SNE visualization of learned representation.
Figure 4: Sensitivity of the models to the number of initial clusters on three datasets.
Figure 5: Impact of varying the known class ratio on two datasets. The x-axis represents different models and the y-axis denotes their corresponding accuracy values.
...and 1 more figure
