Generating Hard-Negative Out-of-Scope Data with ChatGPT for Intent Classification

Zhijian Li; Stefan Larson; Kevin Leach

TL;DR

Classifiers misidentify hard-negative OOS utterances more often than general OOS utterances, and adding hard-negative OOS data to training improves robustness when detecting both hard-negative and general OOS data.

Abstract

Intent classifiers must be able to distinguish when a user's utterance does not belong to any supported intent to avoid producing incorrect and unrelated system responses. Although out-of-scope (OOS) detection for intent classifiers has been studied, previous work has not yet studied changes in classifier performance against hard-negative out-of-scope utterances (i.e., inputs that share common features with in-scope data, but are actually out-of-scope). We present an automated technique to generate hard-negative OOS data using ChatGPT. We use our technique to build five new hard-negative OOS datasets, and evaluate each against three benchmark intent classifiers. We show that classifiers struggle to correctly identify hard-negative OOS utterances more than general OOS utterances. Finally, we show that incorporating hard-negative OOS data for training improves model robustness when detecting hard-negative OOS data and general OOS data. Our technique, datasets, and evaluation address an important void in the field, offering a straightforward and inexpensive way to collect hard-negative OOS data and improve intent classifiers' robustness.

Paper Structure (23 sections, 4 figures, 4 tables)

This paper contains 23 sections, 4 figures, 4 tables.

Introduction
Related Work
Hard-Negative Data
Data Collection for Intent Classification
Adversarial Examples
Data Collection and Annotation with ChatGPT
Methods
Feature-Mining Keywords
Data Generation with ChatGPT
OOS Verification with ChatGPT
Evaluation
Data
In-Scope Data.
Hard-Negative Out-of-Scope Data.
General Out-of-Scope Data.
...and 8 more sections

Figures (4)

Figure 1: Example exchanges between a user (blue, right side) and a task-driven dialog system for personal finance (grey, left side). The system correctly identifies the user’s utterance as in-scope in ①, and correctly identifies the user's utterance as out-of-scope and gives a valid response in ②. In ③, the system incorrectly identifies the hard-negative OOS user utterance as in-scope and provides an incorrect response.
Figure 2: An overview of the hard-negative OOS generation process, including examples. The third generated utterance is filtered out during the two-step OOS verification.
Figure 3: Results for Banking77 evaluated with BERT. (a) shows the distribution of softmax confidence scores. (b) shows the distribution of energy confidence scores. (c) shows the F1 score, based on softmax confidence scores, for separating hard-negative OOS and general OOS utterances from in-scope utterances at different confidence thresholds.
Figure 4: Distribution of softmax confidence scores for Clinc-150 evaluated with RoBERTa.
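
Figures 3 and 4 refer to softmax and energy confidence scores. As a rough illustration of how such scores can be computed from a classifier's logits and thresholded for OOS detection, here is a minimal sketch. It assumes the commonly used definitions (maximum softmax probability, and negative free energy computed as a temperature-scaled log-sum-exp of the logits). The paper's exact formulation is not given here, and the function names, example logits, and the 0.7 threshold are hypothetical.

```python
import math


def softmax_confidence(logits):
    """Maximum softmax probability over the in-scope intent logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def energy_confidence(logits, temperature=1.0):
    """Negative free energy, T * logsumexp(logits / T); higher looks more in-scope.

    Assumed definition, not confirmed by the source text.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def is_out_of_scope(score, threshold):
    """Flag an utterance as OOS when its confidence falls below the threshold."""
    return score < threshold


# Hypothetical logits for a three-intent classifier.
confident = [4.0, 0.5, 0.2]   # one intent clearly dominates
ambiguous = [1.1, 1.0, 0.9]   # no intent stands out

for name, logits in [("confident", confident), ("ambiguous", ambiguous)]:
    score = softmax_confidence(logits)
    print(name, round(score, 3), round(energy_confidence(logits), 3),
          is_out_of_scope(score, threshold=0.7))
```

A hard-negative OOS utterance is one whose logits resemble the "confident" case despite being out-of-scope, which is why a fixed threshold on either score can fail to reject it.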
