ALVIN: Active Learning Via INterpolation

Michalis Korakakis, Andreas Vlachos, Adrian Weller

TL;DR

Experimental results on six datasets encompassing sentiment analysis, natural language inference, and paraphrase detection demonstrate that ALVIN outperforms state-of-the-art active learning methods in both in-distribution and out-of-distribution generalization.

Abstract

Active Learning aims to minimize annotation effort by selecting the most useful instances from a pool of unlabeled data. However, typical active learning methods overlook the presence of distinct example groups within a class, whose prevalence may vary, e.g., in occupation classification datasets certain demographics are disproportionately represented in specific classes. This oversight causes models to rely on shortcuts for predictions, i.e., spurious correlations between input attributes and labels occurring in well-represented groups. To address this issue, we propose Active Learning Via INterpolation (ALVIN), which conducts intra-class interpolations between examples from under-represented and well-represented groups to create anchors, i.e., artificial points situated between the example groups in the representation space. By selecting instances close to the anchors for annotation, ALVIN identifies informative examples exposing the model to regions of the representation space that counteract the influence of shortcuts. Crucially, since the model considers these examples to be of high certainty, they are likely to be ignored by typical active learning methods. Experimental results on six datasets encompassing sentiment analysis, natural language inference, and paraphrase detection demonstrate that ALVIN outperforms state-of-the-art active learning methods in both in-distribution and out-of-distribution generalization.
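The anchor construction and selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the labeled examples of one class have already been split into under-represented and well-represented groups, that the interpolation weight is drawn from a Beta distribution (a mixup-style choice), and that closeness to an anchor is measured by Euclidean distance. The function names `make_anchors` and `select_near_anchors`, and the parameters `n_per_pair`, `alpha`, and `budget`, are hypothetical.

```python
import numpy as np

def make_anchors(z_under, z_well, n_per_pair=2, alpha=2.0, rng=None):
    """Interpolate each under-represented representation with randomly
    chosen well-represented representations of the same class."""
    rng = np.random.default_rng() if rng is None else rng
    anchors = []
    for z_u in z_under:
        # Pick random well-represented partners (assumption: uniform choice).
        partners = z_well[rng.integers(0, len(z_well), size=n_per_pair)]
        # Assumption: interpolation weights drawn from Beta(alpha, alpha).
        lam = rng.beta(alpha, alpha, size=(n_per_pair, 1))
        # Convex combination, so each anchor lies between the two groups.
        anchors.append(lam * z_u + (1.0 - lam) * partners)
    return np.concatenate(anchors, axis=0)

def select_near_anchors(z_pool, anchors, budget):
    """Return indices of the `budget` unlabeled instances closest to any anchor."""
    # Squared Euclidean distance from every pool point to every anchor.
    dists = ((z_pool[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
    nearest = dists.min(axis=1)
    return np.argsort(nearest)[:budget]

# Toy 2-D representations for a single class.
rng = np.random.default_rng(0)
z_well = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(20, 2))
z_under = rng.normal(loc=[4.0, 0.0], scale=0.1, size=(3, 2))
# Two pool points lie between the groups, two lie far away.
z_pool = np.array([[2.0, 0.0], [0.0, 5.0], [1.0, 0.1], [-6.0, -6.0]])

anchors = make_anchors(z_under, z_well, rng=rng)
picked = select_near_anchors(z_pool, anchors, budget=2)
print(sorted(picked.tolist()))
```

In this toy setup the two pool points lying between the groups are selected, while the distant ones are skipped. A real pipeline would take the representations from the model's encoder and would need its own procedure for identifying the two example groups, which the abstract does not specify.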

Paper Structure

This paper contains 31 sections, 1 equation, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Illustration of ALVIN applied to a binary classification task. Distinct markers denote, respectively, well-represented labeled examples in Class A, under-represented labeled examples in Class A, labeled examples in Class B, unlabeled instances, and the anchors created via intra-class interpolations between under-represented and well-represented examples. Unlike typical active learning methods, ALVIN prioritizes high-certainty instances that integrate representations from different example groups in varying proportions. This approach enables ALVIN to adjust the model's decision boundary and mitigate its reliance on shortcuts.
  • Figure 2: Effects of different components of ALVIN and hyperparameter adjustments on both in-distribution (ID) and out-of-distribution (OOD) performance. Experiments are conducted on the IMDB dataset using 10% of the acquired data.