IDALC: A Semi-Supervised Framework for Intent Detection and Active Learning based Correction

Ankan Mullick, Sukannya Purkayastha, Saransh Sharma, Pawan Goyal, Niloy Ganguly

TL;DR

IDALC introduces a semi-supervised framework that fuses Intent Detection with Active Learning based Correction to handle rejected utterances and discover new intents with minimal annotation. It uses a two-stage architecture (ID and ALC) with a DOC-based out-of-domain detector and a majority-voting auto-correction module, enabling iterative retraining on expanded labeled data. Experiments on SNIPS, Facebook multilingual, and ATIS show consistent gains of 5-10% in accuracy and 4-8% in macro-F1, while annotation costs remain at 6-10% of the unlabeled pool. The approach is lightweight, language-agnostic, and suitable for real-time or edge deployment, offering a practical path to scalable, adaptive dialog systems with reduced labeling overhead.
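
The two mechanisms named above, confidence-based rejection and majority-voting correction, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the `min_agreement` parameter are assumptions chosen for illustration, and the per-intent scores stand in for whatever the DOC-based detector would output.

```python
from collections import Counter
from typing import Dict, List, Optional


def detect_or_reject(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the top-scoring intent, or None (reject) if no score reaches the threshold.

    `scores` is assumed to map each known intent to a confidence in [0, 1].
    """
    best_intent = max(scores, key=scores.get)
    return best_intent if scores[best_intent] >= threshold else None


def majority_vote(votes: List[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label that most voters agree on, or None if agreement is too weak."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


# A confident prediction is accepted; a low-confidence one is rejected.
accepted = detect_or_reject({"RateBook": 0.9, "AddToPlayList": 0.2})
rejected = detect_or_reject({"RateBook": 0.3, "AddToPlayList": 0.2})

# Hypothetical voters propose a label for a rejected utterance.
corrected = majority_vote(["PlayMusic", "PlayMusic", "RateBook"])
undecided = majority_vote(["PlayMusic", "RateBook", "AddToPlayList"])
```

In this sketch, an utterance for which `majority_vote` returns `None` would be a natural candidate for human annotation, which is one way the annotation budget could be kept small.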

Abstract

Voice-controlled dialog systems have become immensely popular due to their ability to perform a wide range of actions in response to diverse user queries. These agents possess a predefined set of skills or intents to fulfill specific user tasks, but every system has its limitations. Even for known intents, a model that exhibits low confidence rejects utterances, which then require manual annotation. Additionally, over time these agents may need to be retrained with new intents drawn from the system-rejected queries so that they can carry out additional tasks. Labeling all these emerging intents and rejected utterances over time is impractical, which calls for an efficient mechanism to reduce annotation costs. In this paper, we introduce IDALC (Intent Detection and Active Learning based Correction), a semi-supervised framework designed to detect user intents and rectify system-rejected utterances while minimizing the need for human annotation. Empirical findings on various benchmark datasets demonstrate that our system surpasses baseline methods, achieving 5-10% higher accuracy and a 4-8% improvement in macro-F1. Remarkably, we maintain the overall annotation cost at just 6-10% of the unlabelled data available to the system. The overall framework of IDALC is shown in Fig. 1.

Paper Structure

This paper contains 22 sections, 5 figures, 8 tables, 3 algorithms.

Figures (5)

  • Figure 1: Overview of IDALC Framework: The system processes user utterances, identifies known intents with high confidence, and flags low-confidence or unknown ones for correction.
  • Figure 2: The end-to-end framework of IDALC (Intent Detection and Active Learning-based Correction), illustrated using predicted (P1–P5) and actual (A1–A5) intent labels. Initially, only two intents, RateBook and AddToPlayList, are known. Q1 is correctly classified. Q2–Q5 have unknown intents. In Cycle 0, Q2 is correctly classified during the Intent Detection (ID) step. Q3 is incorrectly predicted as AddToPlayList but belongs to a new intent, PlayMusic. As the system is unsure, it rejects Q3. In the ALC step, Q3 is corrected, and PlayMusic is added to the list of known intents. In the next cycle, Q4 is correctly classified as SearchScreeningTime. A5 is added to the set of known intents, but Q5 is still misclassified. After each ID step, some previously unknown queries are mapped to known intents. After each ALC step, incorrect predictions are corrected. By the end, some queries may remain misclassified or unrecognized.
  • Figure 3: End-to-end framework of IDALC: Intent Detection and Active Learning based Correction.
  • Figure 4: Accuracies (%) for the Final ID, Joint BERT, IDALC, Max Uncertainty-AL, and Min Uncertainty-AL models.
  • Figure 5: Accuracy (%) across different cycles of ALC module, showing iterative improvement and supporting the choice of limiting the process to two cycles for efficiency.
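
The alternating ID and ALC steps described in the Figure 2 caption can be sketched as a small loop. This is a toy illustration under stated assumptions, not the paper's algorithm: `run_cycles`, `toy_classify`, `toy_correct`, the per-cycle `budget`, and the example utterances are all hypothetical, the classifier is a lookup that rejects any utterance whose intent is not yet known, and the correction step is a perfect oracle standing in for the active-learning and majority-voting machinery. The default of two cycles mirrors the limit mentioned in the Figure 5 caption.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def run_cycles(
    utterances: List[str],
    known_intents: Set[str],
    classify: Callable[[str, Set[str]], Optional[str]],
    correct: Callable[[str], str],
    budget: int = 1,
    num_cycles: int = 2,
) -> Tuple[Dict[str, str], Set[str], List[str]]:
    """Alternate an ID step and a budgeted correction step for `num_cycles` cycles."""
    labels: Dict[str, str] = {}
    known = set(known_intents)
    pending = list(utterances)
    for _ in range(num_cycles):
        # ID step: accept confident predictions, collect rejections.
        rejected = []
        for utt in pending:
            pred = classify(utt, known)
            if pred is None:
                rejected.append(utt)
            else:
                labels[utt] = pred
        # Correction step: fix at most `budget` rejections; the rest wait for the next cycle.
        to_correct, pending = rejected[:budget], rejected[budget:]
        for utt in to_correct:
            labels[utt] = correct(utt)
            known.add(labels[utt])  # a corrected label may introduce a new intent
    return labels, known, pending


# Toy ground truth (hypothetical utterances; intent names follow the Figure 2 caption).
TRUE_INTENT = {
    "rate this book five stars": "RateBook",
    "play some jazz": "PlayMusic",
    "play the new album": "PlayMusic",
    "when is the movie showing": "SearchScreeningTime",
}


def toy_classify(utt: str, known: Set[str]) -> Optional[str]:
    """Reject (return None) whenever the true intent is not yet known."""
    intent = TRUE_INTENT[utt]
    return intent if intent in known else None


def toy_correct(utt: str) -> str:
    """Perfect oracle standing in for the correction module."""
    return TRUE_INTENT[utt]


labels, known, pending = run_cycles(
    list(TRUE_INTENT), {"RateBook", "AddToPlayList"}, toy_classify, toy_correct
)
```

In this toy run, correcting one PlayMusic utterance in the first cycle lets the second PlayMusic utterance pass the ID step in the next cycle without further correction, which is the effect the caption describes as previously unknown queries being mapped to known intents.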