Expanding continual few-shot learning benchmarks to include recognition of specific instances

Gideon Kowadlo; Abdelrahman Ahmed; Amir Mayan; David Rawlinson

TL;DR

The paper broadens the CFSL benchmark by scaling to $N_C=200$ classes and introducing an instance test to probe recognition of specific exemplars under corruption, occlusion, and noise. It evaluates baseline CFSL methods (Pretrain+Tune and ProtoNets) and a replay-augmented variant, finding that learning more classes degrades accuracy, ProtoNets typically perform best, and replay substantially boosts performance, especially for the instance test. The work underscores the value of replay-based consolidation in continual few-shot settings and highlights the need for future exploration of weight adaptation and more realistic, high-resolution imagery. Overall, the expanded CFSL framework provides a harder, more realistic benchmark and suggests replay as a viable strategy for improving continual and instance-level recognition in challenging environments.

Abstract

Continual learning and few-shot learning are important frontiers in progress toward broader Machine Learning (ML) capabilities. Recently, there has been intense interest in combining both. One of the first examples to do so was the Continual Few-Shot Learning (CFSL) framework of Antoniou et al. (arXiv:2004.11967). In this study, we extend CFSL in two ways that capture a broader range of challenges, important for intelligent agent behaviour in real-world conditions. First, we increased the number of classes by an order of magnitude, making the results more comparable to standard continual learning experiments. Second, we introduced an 'instance test' which requires recognition of specific instances of classes, a capability of animal cognition that is usually neglected in ML. For an initial exploration of ML model performance under these conditions, we selected representative baseline models from the original CFSL work and added a model variant with replay. As expected, learning more classes is more difficult than in the original CFSL experiments, and interestingly, the way in which image instances and classes are presented affects classification performance. Surprisingly, accuracy in the baseline instance test is comparable to that in other classification tasks, but poor under significant occlusion and noise. The use of replay for consolidation substantially improves performance for both types of tasks, but particularly for the instance test.

Paper Structure (5 sections, 9 figures, 10 tables)

CFSL framework baseline bugfixes
Pretrain+Tune method
Scaling test
Optimised hyperparameters
Fine-tuning steps

Figures (9)

Figure 1: Visual representation of CFSL experiment parameterisation. Reproduced with permission from Antoniou2020.
Figure 2: Wide vs Deep. An illustration of Wide vs Deep experiments. Wide experiments have large support sets and few tasks; Deep experiments have small support sets and many tasks.
Figure 3: Instance test: a) Two configurations with 8 instances. In the first configuration, there are 4 support sets ($S_n$); in the second, there are 2. The test set (T) consists of the same instances that were shown throughout the support sets. For each test sample, the challenge is to identify the matching identical instance from the support sets, amongst other highly similar instances. NSS = number of support sets, and $k$-shot = the number of instances that are shown for a given class. The samples are colour coded to show which ones are identical. b) An example of images with added noise and occlusion (noise affects 30% of the pixels and occlusion covers 30% of the image width).
Figure 4: Pretrain+Tune Training and Evaluation: Different heads are used for pretraining and each task, because the number of classes varies between pretraining and task settings. All trainable parameters in the head and VGG blocks are adapted during pretraining and fine-tuning.
Figure 5: Learning with replay. Complementary Learning Systems (CLS) setup with Long Term Memory (LTM), paired with a circular buffer Short Term Memory (STM). First, in a memorisation step, the STM temporarily stores recent support sets. Second, in a recall step, the memorised data are used in LTM training.
...and 4 more figures
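
The two-step replay procedure described in the Figure 5 caption (memorise recent support sets in a circular-buffer STM, then recall them for LTM training) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ShortTermMemory`, `learn_with_replay`, and `train_ltm`, the buffer capacity, and the choice to train the LTM on the full buffer contents after every support set are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Sequence, Tuple

# Hypothetical sample representation: (image, class label).
Sample = Tuple[object, int]


class ShortTermMemory:
    """Circular buffer of recent support sets (illustrative helper, not from the paper)."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen discards the oldest support set once it is full.
        self._buffer: Deque[List[Sample]] = deque(maxlen=capacity)

    def memorise(self, support_set: Sequence[Sample]) -> None:
        """Memorisation step: temporarily store the latest support set."""
        self._buffer.append(list(support_set))

    def recall(self) -> List[Sample]:
        """Recall step: return all stored samples, oldest support set first."""
        return [sample for support_set in self._buffer for sample in support_set]


def learn_with_replay(
    support_sets: Iterable[Sequence[Sample]],
    stm: ShortTermMemory,
    train_ltm: Callable[[List[Sample]], None],
) -> None:
    """Present support sets in order; after each one, train the LTM on recalled data.

    Assumption: the recalled batch includes the current support set, because it is
    memorised before recall. The actual replay schedule in the paper may differ.
    """
    for support_set in support_sets:
        stm.memorise(support_set)
        train_ltm(stm.recall())
```

In this sketch `train_ltm` stands in for whatever update the long-term model performs (for example, fine-tuning steps in Pretrain+Tune); passing `some_list.append` instead makes it easy to inspect which samples would be replayed at each step.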
