RoNID: New Intent Discovery with Generated-Reliable Labels and Cluster-friendly Representations

Shun Zhang, Chaoran Yan, Jian Yang, Changyu Ren, Jiaqi Bai, Tongliang Li, Zhoujun Li

TL;DR

RoNID tackles open-world New Intent Discovery by coupling two components under EM-style optimization: reliable pseudo-label generation, formulated as an optimal transport (OT) problem, and cluster-friendly representation learning with intra- and inter-cluster contrastive objectives. The method iteratively refines pseudo-labels and representations, breaking the negative feedback loop in which inaccurate pseudo-labels and poor representations degrade each other. Experimental results on three benchmarks show RoNID achieving state-of-the-art performance, with consistent gains across ACC, NMI, and ARI, and remaining robust as the known-class ratio varies. This work provides a principled framework for discovering novel intents while preserving known ones, with practical implications for improving open-domain dialogue systems.

Abstract

New Intent Discovery (NID) aims to identify known intents and to reasonably infer novel intent groups in open-world scenarios. Current methods, however, suffer from inaccurate pseudo-labels and poor representation learning, which create a negative feedback loop that degrades overall model performance, including accuracy and the adjusted Rand index. To address these challenges, we propose a Robust New Intent Discovery (RoNID) framework optimized by an EM-style method, which focuses on constructing reliable pseudo-labels and obtaining cluster-friendly discriminative representations. RoNID comprises two main modules: a reliable pseudo-label generation module and a cluster-friendly representation learning module. Specifically, the pseudo-label generation module assigns reliable synthetic labels by solving an optimal transport problem in the E-step, which provides high-quality supervised signals as input to the cluster-friendly representation learning module. To learn cluster-friendly representations with strong intra-cluster compactness and large inter-cluster separation, the representation learning module combines intra-cluster and inter-cluster contrastive learning in the M-step and feeds more discriminative features back into the generation module. RoNID can be performed iteratively to ultimately yield a robust model with reliable pseudo-labels and cluster-friendly representations. Experimental results on multiple benchmarks demonstrate that our method brings substantial improvements of +1 to +4 points over previous state-of-the-art methods.
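
As a rough illustration of the E-step described above, the sketch below uses Sinkhorn-Knopp iterations with uniform cluster marginals to turn classifier logits into balanced soft pseudo-labels. The uniform marginals, the entropic regulariser `epsilon`, the iteration count, and the names `sinkhorn_pseudo_labels` and `hard_labels` are assumptions made for illustration. The abstract states only that pseudo-labels come from solving an optimal transport problem, so this should not be read as the authors' exact formulation.

```python
import math


def sinkhorn_pseudo_labels(logits, epsilon=0.05, n_iters=50):
    """Map an N x K list of logits to soft assignments with roughly equal cluster sizes.

    Assumption (not from the source): every cluster should receive the same
    total mass, i.e. the OT column marginal is uniform.
    """
    n, k = len(logits), len(logits[0])
    # Gibbs kernel exp(logit / epsilon); subtracting the global maximum only
    # rescales the kernel and avoids overflow.
    top = max(max(row) for row in logits)
    q = [[math.exp((v - top) / epsilon) for v in row] for row in logits]
    for _ in range(n_iters):
        # Column step: each cluster gets total mass 1/K.
        col_sums = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (col_sums[j] * k) for j in range(k)] for i in range(n)]
        # Row step: each sample carries total mass 1/N.
        row_sums = [sum(row) for row in q]
        q = [[v / (row_sums[i] * n) for v in row] for i, row in enumerate(q)]
    # Rescale so that every row is a probability distribution over clusters.
    return [[v * n for v in row] for row in q]


def hard_labels(q):
    """Pick the most likely cluster index for each row of soft assignments."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in q]
```

A plain argmax over logits can send every sample to the same cluster when one output dominates. The balancing constraint counteracts this by spreading samples across clusters, which is one plausible way to obtain labels that do not collapse. Whether an equal-size prior suits a given intent dataset is a separate modelling question.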

Paper Structure

This paper contains 32 sections, 10 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Illustration of the defects of the baselines and the advantages of our method. Compared to the baselines (a), which suffer from clustering degeneracy, our method (b) successfully separates known and novel intents based on reliable pseudo-labels and cluster-friendly representations. S-RL denotes suboptimal representation learning and CF-RL denotes cluster-friendly representation learning.
  • Figure 2: Overview of our RoNID framework, which obtains reliable pseudo-labels by solving an optimal transport problem in the E-step and learns cluster-friendly representations by combining the Intra loss, Inter loss, and CE loss in the M-step.
  • Figure 3: Accuracy curves of pseudo-labels for three datasets during training. The x-axis represents training epochs, and the y-axis represents the accuracy of the pseudo-labels.
  • Figure 4: t-SNE visualizations of the learned intent representations.
  • Figure 5: The effect of the known class ratio on three datasets. The x-axis represents the ratio of known intent classes, and the y-axis represents the accuracy values.
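
Cluster IDs produced by a discovery model are arbitrary, so the predicted clusters have to be matched to ground-truth intents before an accuracy value can be computed. The sketch below shows one common way to compute such a clustering accuracy, together with the adjusted Rand index. The function names are hypothetical, and the brute-force search over permutations is an illustrative simplification that is only practical for a handful of clusters. Evaluation code typically uses the Hungarian algorithm instead, and the source does not say which implementation the authors used.

```python
from collections import Counter
from itertools import permutations
from math import comb


def clustering_accuracy(y_true, y_pred):
    """Best accuracy over all one-to-one maps from predicted cluster IDs to true labels."""
    true_ids = sorted(set(y_true))
    pred_ids = sorted(set(y_pred))
    # Pad with None so that surplus predicted clusters can stay unmatched.
    true_ids += [None] * max(0, len(pred_ids) - len(true_ids))
    pairs = Counter(zip(y_pred, y_true))
    best = 0
    for perm in permutations(true_ids, len(pred_ids)):
        # Count samples whose predicted cluster is mapped to their true label.
        hits = sum(pairs[(p, t)] for p, t in zip(pred_ids, perm))
        best = max(best, hits)
    return best / len(y_true)


def adjusted_rand_index(y_true, y_pred):
    """Adjusted Rand index computed from the contingency table of the two labelings."""
    n = len(y_true)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(y_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(y_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings place everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

Both functions are invariant to renaming the predicted clusters, which is the property that makes them usable when novel intents have no predefined label IDs.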