Expanding continual few-shot learning benchmarks to include recognition of specific instances
Gideon Kowadlo, Abdelrahman Ahmed, Amir Mayan, David Rawlinson
TL;DR
The paper broadens the CFSL benchmark by scaling to $N_C=200$ classes and introducing an instance test to probe recognition of specific exemplars under corruption, occlusion, and noise. It evaluates baseline CFSL methods (Pretrain+Tune and ProtoNets) and a replay-augmented variant, finding that learning more classes degrades accuracy, ProtoNets typically perform best, and replay substantially boosts performance, especially for the instance test. The work underscores the value of replay-based consolidation in continual few-shot settings and highlights the need for future exploration of weight adaptation and more realistic, high-resolution imagery. Overall, the expanded CFSL framework provides a harder, more realistic benchmark and suggests replay as a viable strategy for improving continual and instance-level recognition in challenging environments.
Abstract
Continual learning and few-shot learning are important frontiers in progress toward broader Machine Learning (ML) capabilities. Recently, there has been intense interest in combining the two. One of the first examples to do so was the Continual Few-Shot Learning (CFSL) framework of Antoniou et al. (arXiv:2004.11967). In this study, we extend CFSL in two ways that capture a broader range of challenges important for intelligent agent behaviour in real-world conditions. First, we increase the number of classes by an order of magnitude, making the results more comparable to standard continual learning experiments. Second, we introduce an 'instance test', which requires recognition of specific instances of classes, a capability of animal cognition that is usually neglected in ML. For an initial exploration of ML model performance under these conditions, we selected representative baseline models from the original CFSL work and added a model variant with replay. As expected, learning more classes is more difficult than in the original CFSL experiments and, interestingly, the way in which image instances and classes are presented affects classification performance. Surprisingly, accuracy in the baseline instance test is comparable to that in other classification tasks, but poor under significant occlusion and noise. The use of replay for consolidation substantially improves performance for both types of tasks, but particularly for the instance test.
