One Stone, Four Birds: A Comprehensive Solution for QA System Using Supervised Contrastive Learning

Bo Wang, Tsunenori Mine

TL;DR

A unified SCL-based representation learning method efficiently builds an intra-class compact and inter-class scattered feature space. This space supports known intent classification as well as unknown intent detection and discovery, and the approach achieves new state-of-the-art performance across all tasks.

Abstract

This paper presents a novel and comprehensive solution to enhance both the robustness and efficiency of question answering (QA) systems through supervised contrastive learning (SCL). Training a high-performance QA system has become straightforward with pre-trained language models, requiring only a small amount of data and simple fine-tuning. However, despite recent advances, existing QA systems still exhibit significant deficiencies in functionality and training efficiency. We address the functionality issue by defining four key tasks: user input intent classification, out-of-domain input detection, new intent discovery, and continual learning. We then leverage a unified SCL-based representation learning method to efficiently build an intra-class compact and inter-class scattered feature space, facilitating both known intent classification and unknown intent detection and discovery. Consequently, with minimal additional tuning on downstream tasks, our approach significantly improves model efficiency and achieves new state-of-the-art performance across all tasks.
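To make the idea of an "intra-class compact and inter-class scattered" feature space concrete, here is a minimal sketch of the widely used supervised contrastive (SupCon-style) loss in plain Python. It is an illustration only: the function name `supcon_loss`, the temperature value, and the toy feature vectors are assumptions, and the paper's exact objective may differ from this standard form.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive loss, averaged over anchors
    that have at least one same-label positive in the batch.
    NOTE: illustrative sketch, not necessarily the paper's objective."""
    z = [normalize(v) for v in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Mean negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

# Toy batch: two intents, two utterance embeddings each (hypothetical values).
labels = [0, 0, 1, 1]
compact = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
mixed = [[1.0, 0.0], [0.0, 1.0], [0.99, 0.1], [0.1, 0.99]]
print(supcon_loss(compact, labels) < supcon_loss(mixed, labels))
```

The loss is small when same-intent embeddings sit close together and far from other intents, and large when the classes are interleaved, which is the property the downstream tasks rely on.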

Paper Structure (48 sections, 21 equations, 4 figures, 9 tables, 2 algorithms)

Figures (4)

  • Figure 1: The workflow of the proposed comprehensive and adaptive QA system. First, an initial model is trained on the currently known data (Pre). Then, during daily usage, each inquiry is first judged to be answerable or not (T-2): if so, the system proceeds to intent detection (T-1) and retrieves the answer; if not, the unknown inquiry is temporarily saved and clustered later (T-3). Finally, the initial model is re-trained with the newly discovered data so that it can classify the new intents (T-4).
  • Figure 2: Diagrams of SCL's representation optimization procedure and the solution for each task. Left: the training and optimization effect of SCL, which gathers same-intent text features together and pushes features of different intents further apart, providing the underlying embedding space for the subsequent tasks. Right: the classification/detection processes of T-1 to T-4 after SCL. (Best read together with Figure 1.)
  • Figure 3: Diagram of the training and evaluation workflow
  • Figure 4: t-SNE visualization of the text feature space on BANKING test I: all data (top row) and OOD data only (bottom row)
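
The inference-time routing described in the Figure 1 caption (T-2, then T-1, with unknown inquiries buffered for T-3) can be sketched as follows. This is a hypothetical stand-in, not the paper's method: it assumes nearest-centroid classification with a cosine-similarity threshold for out-of-domain detection, and the names `route`, `centroids`, `threshold`, and `unknown_buffer` are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def route(embedding, centroids, threshold, unknown_buffer):
    """Return the predicted intent, or None if the inquiry looks out-of-domain.
    ASSUMPTION: OOD detection by thresholding similarity to the nearest
    class centroid; the paper may use a different detector."""
    best_intent, best_sim = None, float("-inf")
    for intent, centroid in centroids.items():
        sim = cosine(embedding, centroid)
        if sim > best_sim:
            best_intent, best_sim = intent, sim
    if best_sim < threshold:
        # T-2 says "not answerable": save the inquiry for later clustering (T-3).
        unknown_buffer.append(embedding)
        return None
    # T-2 says "answerable": proceed to intent classification (T-1).
    return best_intent

# Hypothetical 2-D centroids for two known intents.
centroids = {"check_balance": [1.0, 0.0], "transfer_money": [0.0, 1.0]}
buffer = []
print(route([0.9, 0.1], centroids, 0.8, buffer))
print(route([-1.0, -1.0], centroids, 0.8, buffer))
```

In this sketch, the buffered embeddings would later be clustered to propose new intents, and the model re-trained on them (T-4); those steps are omitted here.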