Pseudo-Label Enhanced Prototypical Contrastive Learning for Uniformed Intent Discovery

Yimin Deng, Yuxia Wu, Guoshuai Zhao, Li Zhu, Xueming Qian

TL;DR

This paper proposes a Pseudo-Label enhanced Prototypical Contrastive Learning (PLPCL) model for uniformed intent discovery, together with a prototype learning method that integrates the supervised and pseudo signals from IND and OOD samples.

Abstract

New intent discovery is a crucial capability for task-oriented dialogue systems. Existing methods focus on transferring in-domain (IND) prior knowledge to out-of-domain (OOD) data through pre-training and clustering stages. They either handle the two stages as a pipeline, which leaves a gap between intent representation and the clustering process, or use typical contrastive clustering, which overlooks the potential supervised signals available from the whole dataset. They also tend to address the open intent discovery setting and the OOD setting separately. To this end, we propose a Pseudo-Label enhanced Prototypical Contrastive Learning (PLPCL) model for uniformed intent discovery. We iteratively use pseudo-labels to identify potential positive and negative samples for contrastive learning, bridging the gap between representation and clustering. To enable better knowledge transfer, we design a prototype learning method that integrates the supervised and pseudo signals from IND and OOD samples. Experiments on three benchmark datasets and two task settings for discovering new intents demonstrate the effectiveness of our approach.
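To make the idea of prototype learning over supervised and pseudo signals more concrete, the following is a minimal sketch of a generic prototypical contrastive loss. It is not the authors' implementation: the function names (`compute_prototypes`, `prototypical_contrastive_loss`), the use of cosine similarity, the mean-based prototype, and the temperature value are all assumptions chosen for illustration. Labels for IND samples are taken as ground truth, and labels for unlabeled samples are assumed to be pseudo-labels produced by some earlier selection step that is not shown here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def compute_prototypes(embeddings, labels):
    """Assumed prototype definition: normalized mean embedding per label.

    `labels` mixes ground-truth labels (IND samples) and pseudo-labels
    (unlabeled samples); this sketch treats both kinds identically.
    """
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in total])
        for label, total in sums.items()
    }


def prototypical_contrastive_loss(embeddings, labels, prototypes, temperature=0.1):
    """Pull each sample toward its own prototype and away from the others.

    Per sample: -log( exp(sim(z, c_y) / t) / sum_k exp(sim(z, c_k) / t) ),
    averaged over all samples, with cosine similarity as `sim`.
    """
    total = 0.0
    for vec, label in zip(embeddings, labels):
        z = l2_normalize(vec)
        logits = {
            k: sum(a * b for a, b in zip(z, proto)) / temperature
            for k, proto in prototypes.items()
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        total += log_denom - logits[label]
    return total / len(embeddings)


# Toy data: the first two labels stand in for ground truth (IND),
# the last two for pseudo-labels assigned to unlabeled samples.
embeddings = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]

prototypes = compute_prototypes(embeddings, labels)
loss = prototypical_contrastive_loss(embeddings, labels, prototypes)

# The same prototypes scored against inconsistent labels give a larger loss.
noisy_loss = prototypical_contrastive_loss(embeddings, [0, 1, 0, 1], prototypes)
```

In this sketch the loss is small when the labels agree with the cluster structure of the embeddings and grows when they do not, which is one way to see why the quality of the selected pseudo-labels matters for this kind of objective.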

Paper Structure

This paper contains 30 sections, 7 equations, 7 figures, 12 tables.

Figures (7)

  • Figure 1: Two basic task settings for uniformed intent discovery. Open-setting: Partially labeled IND data is used for training, and the test data includes both IND and OOD categories. OOD-setting: Fully labeled IND data is used for training, while the test data contains only OOD categories.
  • Figure 2: The overall architecture of the proposed PLPCL. (a) Intent representation is achieved based on disentangled instance-level and cluster-level heads. (b) Stage 1 involves supervised contrastive learning on instance-level representation and classification on cluster-level representation. (c) Stage 2 starts from pseudo-label selecting for unlabeled data followed by semi-supervised and prototypical contrastive learning in an iterative manner.
  • Figure 3: Influence of the labeled ratio on BANKING-open.
  • Figure 4: Influence of the supervisory loss weight on the BANKING dataset.
  • Figure 5: Influence of the labeled ratio and known cluster ratio on BANKING-open.
  • ...and 2 more figures