Learning New Tasks from a Few Examples with Soft-Label Prototypes

Avyav Kumar Singh; Ekaterina Shutova; Helen Yannakoudakis

TL;DR

This work proposes a novel few-shot learning approach based on soft-label prototypes (SLPs), which are designed to collectively capture the distribution of different classes across the input domain space. With very few examples per class (4, 8, 16), the approach achieves superior performance on the majority of tested tasks while being highly parameter efficient.

Abstract

Existing approaches to few-shot learning in NLP rely on large language models (LLMs) and/or fine-tuning them to generalise to out-of-distribution data. In this work, we propose a novel few-shot learning approach based on soft-label prototypes (SLPs) designed to collectively capture the distribution of different classes across the input domain space. We focus on learning previously unseen NLP tasks from very few examples (4, 8, 16) per class and experimentally demonstrate that our approach achieves superior performance on the majority of tested tasks in this data-lean setting while being highly parameter efficient. We also show that our few-shot adaptation method can be integrated into more generalised learning settings, primarily meta-learning, to yield superior performance against strong baselines.

Paper Structure (57 sections, 2 theorems, 21 equations, 9 figures, 6 tables, 4 algorithms)

Introduction
Related work
Approach: few-shot learning with Soft-Label Prototypes (SLPs)
Generating soft-label prototypes
Finding lines connecting all centroids
Learning soft-label prototypes
Learning via linear constraints (constraintSLP)
Learning via gradient descent (DeepSLP)
Classification with soft-label prototypes
Meta-training DeepSLP (MetaSLP)
Inner-loop training
Outer-loop training
Meta-testing
Experimental settings and datasets
Experimental settings
...and 42 more sections

Key Result

Theorem A.1

The soft-label value of each class within a single soft-label prototype generated using constraintSLP is inversely proportional to its distance from the soft-label prototype along the line connecting all classes captured by it.
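
As a rough illustration of the inverse-distance relationship stated in Theorem A.1, the sketch below assigns each class a soft-label value proportional to the reciprocal of its distance from a prototype on the line. It is not the paper's constraintSLP procedure: the function name soft_label, the prototype position, and the normalisation to a sum of 1 are all assumptions made for this example. The centroid coordinates are the ones given in the Figure 5 caption.

```python
def soft_label(prototype_pos, centroid_pos):
    """Soft-label values inversely proportional to distance along a line.

    prototype_pos: scalar coordinate of the prototype on the line (hypothetical).
    centroid_pos:  dict mapping class name -> scalar coordinate of its centroid.
    Normalising the values so they sum to 1 is an assumption of this sketch.
    """
    inverse = {}
    for name, pos in centroid_pos.items():
        distance = abs(pos - prototype_pos)
        if distance == 0:
            raise ValueError("prototype coincides with centroid " + name)
        inverse[name] = 1.0 / distance
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Centroid x-coordinates from the Figure 5 caption (all three lie on y = 0).
centroids = {"A": -10.0, "B": 5.0, "C": 15.0}

# Hypothetical prototype placed just beyond the A end of the line.
labels = soft_label(-15.0, centroids)

for name, value in labels.items():
    print(name, round(value, 3))
```

With the prototype at distance 5 from A and 20 from B, the value for A is four times the value for B, so the closest class dominates the prototype's soft label while the farther classes still receive a non-zero share.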

Figures (9)

Figure 1: Learning soft-label prototypes using two trainable linear layers (yellow): example for a 3-class prototype. Dotted lines indicate backpropagation.
Figure 2: Training soft-label prototypes in DeepSLP. Class centroids are represented with large circles that lie on a line (Red, Green, Blue), while training set examples are represented with smaller circles of the same colour. Dotted lines represent the backpropagation error, of which the bolded ones represent a larger error per soft-label prototype. Predictions for $x$ are based on the prototypes at each end of the line.
Figure 3: Generating and classifying data with soft-label prototypes.
Figure 4: Classification example with constraintSLP (figure from ilia-1).
Figure 5: Schematic diagram for ascertaining $\theta$ with class centroids $A = (-10,0)$, $B=(5,0)$ and $C=(15,0)$.
...and 4 more figures

Theorems & Definitions (2)

Theorem A.1
Theorem A.2
