Meta-Learning for Semi-Supervised Few-Shot Classification

Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle, Richard S. Zemel

TL;DR

The paper tackles few-shot classification under label scarcity by introducing semi-supervised episodes that include unlabeled data, sometimes with distractors. It extends Prototypical Networks with several prototype-refinement techniques (Soft k-Means, a distractor cluster, and a masking mechanism), trained end-to-end to leverage unlabeled information. Empirical evaluations on Omniglot, miniImageNet, and a newly proposed tieredImageNet show consistent improvements over supervised baselines, with Masked Soft k-Means offering robust performance in the presence of distractors. The work also provides a new ImageNet split (tieredImageNet) with a hierarchical class structure for studying few-shot learning, and demonstrates that unlabeled data can meaningfully improve the predictions of meta-learned models.

Abstract

In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. In this work, we advance this few-shot classification paradigm towards a scenario where unlabeled examples are also available within each episode. We consider two situations: one where all unlabeled examples are assumed to belong to the same set of classes as the labeled examples of the episode, as well as the more challenging situation where examples from other distractor classes are also provided. To address this paradigm, we propose novel extensions of Prototypical Networks (Snell et al., 2017) that are augmented with the ability to use unlabeled examples when producing prototypes. These models are trained in an end-to-end way on episodes, to learn to leverage the unlabeled examples successfully. We evaluate these methods on versions of the Omniglot and miniImageNet benchmarks, adapted to this new framework augmented with unlabeled examples. We also propose a new split of ImageNet, consisting of a large set of classes, with a hierarchical structure. Our experiments confirm that our Prototypical Networks can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would.
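The episode structure described in the abstract (a small labeled set, an unlabeled set that may contain distractor classes, and a test set) can be sketched roughly as follows. This is a minimal illustration, not the authors' data pipeline: the function name `sample_episode`, its parameters, and the assumption that each class has separate labeled and unlabeled pools stored in dictionaries are all hypothetical.

```python
import random


def sample_episode(labeled, unlabeled, n_way, n_shot, n_query, n_unlabeled,
                   n_distractor_classes=0, rng=None):
    """Sample one semi-supervised episode (hypothetical helper).

    labeled / unlabeled: dict mapping class name -> list of examples.
    Returns (support, unlabeled_pool, query); support and query hold
    (example, episode_label) pairs, the unlabeled pool holds bare examples.
    """
    rng = rng or random.Random()
    # Draw the episode classes plus any distractor classes without overlap.
    classes = rng.sample(sorted(labeled), n_way + n_distractor_classes)
    episode_classes, distractors = classes[:n_way], classes[n_way:]

    support, query, pool = [], [], []
    for label, cls in enumerate(episode_classes):
        items = rng.sample(labeled[cls], n_shot + n_query)
        support += [(x, label) for x in items[:n_shot]]
        query += [(x, label) for x in items[n_shot:]]

    # Unlabeled items come from episode classes and distractor classes alike;
    # the model is never told which is which.
    for cls in episode_classes + distractors:
        pool += rng.sample(unlabeled[cls], n_unlabeled)
    rng.shuffle(pool)
    return support, pool, query
```

With `n_distractor_classes=0` this corresponds to the simpler situation in which every unlabeled example belongs to one of the episode's classes.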

Paper Structure

This paper contains 22 sections, 9 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Consider a setup where the aim is to learn a classifier to distinguish between two previously unseen classes, goldfish and shark, given not only labeled examples of these two classes, but also a larger pool of unlabeled examples, some of which may belong to one of these two classes of interest. In this work we aim to move a step closer to this more natural learning framework by incorporating in our learning episodes unlabeled data from the classes we aim to learn representations for (shown with dashed red borders) as well as from distractor classes.
  • Figure 2: Example of the semi-supervised few-shot learning setup. Training involves iterating through training episodes, each consisting of a support set $\mathcal{S}$, an unlabeled set $\mathcal{R}$, and a query set $\mathcal{Q}$. The goal is to use the labeled items (shown with their numeric class label) in $\mathcal{S}$ and the unlabeled items in $\mathcal{R}$ within each episode to generalize well to the corresponding query set. The unlabeled items in $\mathcal{R}$ may either be pertinent to the classes we are considering (shown above with green plus signs) or they may be distractor items which belong to a class that is not relevant to the current episode (shown with red minus signs). Note, however, that the model does not actually have ground truth information as to whether each unlabeled example is a distractor or not; the plus/minus signs are shown only for illustrative purposes. At test time, we are given new episodes consisting of novel classes not seen during training that we use to evaluate the meta-learning method.
  • Figure 3: Left: The prototypes are initialized based on the mean location of the examples of the corresponding class, as in ordinary Prototypical Networks. Support, unlabeled, and query examples have solid, dashed, and white colored borders respectively. Right: The refined prototypes obtained by incorporating the unlabeled examples, which classify all query examples correctly.
  • Figure 4: Model Performance on tieredImageNet with different numbers of unlabeled items during test time.
  • Figure 5: Hierarchy of tieredImageNet categories. Training categories are highlighted in red and test categories in blue. Each category indicates the number of associated classes from ILSVRC-12. Best viewed zoomed in on the electronic version.
  • ...and 2 more figures
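
The prototype refinement illustrated in Figure 3 can be sketched as a single soft k-means step. This is a minimal sketch under stated assumptions, not the authors' implementation: it works on already-embedded points (the embedding network is omitted), assumes squared Euclidean distance with a softmax over negative distances for the soft assignments, and uses plain Python lists to stay dependency-free. The function names are hypothetical.

```python
import math


def class_means(support, labels, num_classes):
    """Initial prototypes: per-class mean of the labeled support embeddings.

    Assumes every class has at least one support example.
    """
    dim = len(support[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for x, y in zip(support, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += x[d]
    prototypes = [[s / counts[c] for s in sums[c]] for c in range(num_classes)]
    return prototypes, counts


def soft_assign(x, prototypes):
    """Softmax over negative squared Euclidean distances to each prototype."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def refine_prototypes(support, labels, unlabeled, num_classes):
    """One refinement step: weighted mean of labeled and unlabeled points."""
    prototypes, counts = class_means(support, labels, num_classes)
    dim = len(support[0])
    # Labeled examples contribute with weight 1 to their own class.
    sums = [[p[d] * counts[c] for d in range(dim)]
            for c, p in enumerate(prototypes)]
    mass = [float(n) for n in counts]
    # Unlabeled examples contribute with their soft assignment weights.
    for x in unlabeled:
        for c, w in enumerate(soft_assign(x, prototypes)):
            mass[c] += w
            for d in range(dim):
                sums[c][d] += w * x[d]
    return [[sums[c][d] / mass[c] for d in range(dim)]
            for c in range(num_classes)]
```

For example, with support points `[0.0, 0.0]` (class 0) and `[4.0, 0.0]` (class 1) and unlabeled points `[1.0, 0.0]` and `[3.0, 0.0]`, each prototype moves toward the unlabeled point nearest to it. Query examples would then be classified by their distance to the refined prototypes.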