Learning New Tasks from a Few Examples with Soft-Label Prototypes

Avyav Kumar Singh, Ekaterina Shutova, Helen Yannakoudakis

TL;DR

This work proposes a novel few-shot learning approach based on soft-label prototypes (SLPs), which are designed to collectively capture the distribution of different classes across the input domain space. With only 4, 8, or 16 examples per class, the approach achieves superior performance on the majority of tested tasks while being highly parameter efficient.

Abstract

Existing approaches to few-shot learning in NLP rely on large language models (LLMs) and/or fine-tuning them to generalise to out-of-distribution data. In this work, we propose a novel few-shot learning approach based on soft-label prototypes (SLPs) designed to collectively capture the distribution of different classes across the input domain space. We focus on learning previously unseen NLP tasks from very few examples (4, 8, or 16) per class and experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting while being highly parameter efficient. We also show that our few-shot adaptation method can be integrated into more generalised learning settings, primarily meta-learning, to yield superior performance against strong baselines.

Paper Structure

This paper contains 57 sections, 2 theorems, 21 equations, 9 figures, 6 tables, 4 algorithms.

Key Result

Theorem A.1

The soft-label value of each class within a single soft-label prototype generated using constraintSLP is inversely proportional to its distance from the soft-label prototype along the line connecting all classes captured by it.
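
The inverse-distance relationship in Theorem A.1 can be illustrated with a small numerical sketch. This is not the constraintSLP algorithm itself. It only shows what "inversely proportional to distance" implies for a prototype lying on the line between two class centroids. The function name `inverse_distance_soft_labels`, the normalisation of the soft labels to sum to 1, and the handling of a prototype that coincides with a centroid are all assumptions made for illustration.

```python
import math


def inverse_distance_soft_labels(prototype, centroids):
    """Hypothetical helper: soft labels proportional to 1 / distance.

    Assumes the soft labels are normalised to sum to 1, which the
    theorem statement does not specify.
    """
    distances = {
        label: math.dist(prototype, point) for label, point in centroids.items()
    }

    # Assumed convention: a prototype sitting exactly on one or more
    # centroids puts all of its mass on those classes.
    coincident = [label for label, d in distances.items() if d == 0.0]
    if coincident:
        share = 1.0 / len(coincident)
        return {label: (share if label in coincident else 0.0) for label in distances}

    # Weight each class by the inverse of its distance, then normalise.
    inverse = {label: 1.0 / d for label, d in distances.items()}
    total = sum(inverse.values())
    return {label: weight / total for label, weight in inverse.items()}


# Illustrative coordinates (not taken from the paper): two class
# centroids on a line, with the prototype placed between them.
centroids = {"A": (0.0, 0.0), "B": (4.0, 0.0)}
soft_labels = inverse_distance_soft_labels((1.0, 0.0), centroids)
print(soft_labels)
```

In this example the prototype is three times closer to A than to B, so A receives about three times the soft-label mass of B (roughly 0.75 versus 0.25 after normalisation).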

Figures (9)

  • Figure 1: Learning soft-label prototypes using two trainable linear layers (yellow): example for a 3-class prototype. Dotted lines indicate backpropagation.
  • Figure 2: Training soft-label prototypes in DeepSLP. Class centroids are represented with large circles that lie on a line (Red, Green, Blue), while training set examples are represented with smaller circles of the same colour. Dotted lines represent the backpropagation error, of which the bolded ones represent a larger error per soft-label prototype. Predictions for $x$ are based on the prototypes at each end of the line.
  • Figure 3: Generating and classifying data with soft-label prototypes.
  • Figure 4: Classification example with constraintSLP (figure from ilia-1).
  • Figure 5: Schematic diagram for ascertaining $\theta$ with class centroids $A = (-10,0)$, $B=(5,0)$ and $C=(15,0)$.
  • ...and 4 more figures

Theorems & Definitions (2)

  • Theorem A.1
  • Theorem A.2