A Two-Stage Prediction-Aware Contrastive Learning Framework for Multi-Intent NLU

Guanhua Chen, Yutong Yao, Derek F. Wong, Lidia S. Chao

TL;DR

Multi-intent NLU faces challenges from overlapping intents and data scarcity. The authors propose a two-stage Prediction-Aware Contrastive Learning (PACL) framework that combines word-level pre-training with a prediction-aware contrastive fine-tuning stage, including dynamic role assignment and probability-weighted losses, to better exploit shared-intent information. An intent-slot attention module strengthens the coupling between multi-label intent detection and slot filling, yielding a more discriminative embedding space. On MixATIS, MixSNIPS, and StanfordLU, PACL outperforms strong baselines in both low-data and full-data regimes and accelerates convergence, with ablations confirming the contribution of each component, albeit with higher training cost due to contrastive learning.

Abstract

Multi-intent natural language understanding (NLU) presents a formidable challenge due to the model confusion arising from multiple intents within a single utterance. While previous works train the model contrastively to increase the margin between different multi-intent labels, they are less suited to the nuances of multi-intent NLU. They ignore the rich information carried by shared intents, which is beneficial for constructing a better embedding space, especially in low-data scenarios. We introduce a two-stage Prediction-Aware Contrastive Learning (PACL) framework for multi-intent NLU to harness this valuable knowledge. Our approach capitalizes on shared intent information by integrating word-level pre-training and prediction-aware contrastive fine-tuning. We construct a pre-training dataset using a word-level data augmentation strategy. Subsequently, our framework dynamically assigns roles to instances during contrastive fine-tuning while introducing a prediction-aware contrastive loss to maximize the impact of contrastive learning. We present experimental results and empirical analysis conducted on three widely used datasets, demonstrating that our method surpasses the performance of three prominent baselines in both low-data and full-data scenarios.
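
The abstract does not give the loss itself, so the following is only a rough sketch of the general idea of exploiting shared intents in a contrastive objective. It is not the PACL loss: it assumes that the pull between two utterances can be weighted by the Jaccard overlap of their intent sets, and it leaves out the prediction-aware weighting and dynamic role assignment entirely. The function names, the Jaccard weighting, and the temperature value are all illustrative assumptions.

```python
import numpy as np

def jaccard_weights(labels):
    """Pairwise Jaccard overlap between multi-hot intent label sets.

    labels: (n, k) array of 0/1 values. Returns an (n, n) array.
    Assumption: overlap of intent sets stands in for "shared intent" strength.
    """
    labels = np.asarray(labels).astype(float)
    inter = labels @ labels.T                      # |A ∩ B|
    counts = labels.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter  # |A ∪ B|
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)

def weighted_contrastive_loss(emb, labels, temperature=0.1):
    """Supervised contrastive loss with overlap-weighted positives (illustrative).

    Instances with identical intents get weight 1, partially overlapping
    intents get a weight in (0, 1), and disjoint intents act only as negatives.
    """
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize
    sim = z @ z.T / temperature
    n = len(z)
    off_diag = ~np.eye(n, dtype=bool)

    # Log-softmax over all *other* instances in the batch (stable log-sum-exp).
    sim_masked = np.where(off_diag, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim_masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    w = jaccard_weights(labels) * off_diag          # zero out self-pairs
    w_sum = w.sum(axis=1)
    has_pos = w_sum > 0                             # skip anchors with no overlap
    if not has_pos.any():
        return 0.0
    per_anchor = -(w * log_prob).sum(axis=1)[has_pos] / w_sum[has_pos]
    return float(per_anchor.mean())
```

In this sketch, a batch whose embeddings place utterances with overlapping intents close together yields a lower loss than one that places an utterance with disjoint intents next to the anchor, which is the behavior a shared-intent-aware objective is meant to encourage.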

Paper Structure (25 sections, 9 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: An example of a multi-intent NLU task.
  • Figure 2: The overview of our framework. Different shapes indicate completely different samples and different shades of color indicate samples with shared intent.
  • Figure 3: Intent accuracy for different numbers of intents, for models trained on low- and high-data settings of the MixATIS dataset.
  • Figure 4: Learning curves of intent accuracy and overall accuracy on the MixATIS test set, evaluated at fixed validation-step intervals.
  • Figure 5: The distribution of intent embeddings on the MixATIS dataset. The left plot is from the SLIM baseline; the other is from a model trained with our PACL method.
  • ...and 1 more figures