Frozen Feature Augmentation for Few-Shot Image Classification
Andreas Bär, Neil Houlsby, Mostafa Dehghani, Manoj Kumar
TL;DR
This work shows that augmenting frozen vision-transformer features, rather than input images, can improve few-shot image classification. The approach, frozen feature augmentation (FroFA), maps features to a surrogate image-like space and applies augmentations there, including per-channel and sequential variants. Brightness and other stylistic transformations prove most effective, while geometric transformations often harm performance on ImageNet. Gains are consistent across multiple architectures and large pretraining datasets, especially on small transfer datasets, and per-channel and sequential FroFA yield boosts of up to ~7.7% in the 1-shot setting. These results suggest that simple, computationally light feature-space augmentations are a practical way to improve few-shot transfer with frozen representations, and that they apply broadly to rapid, data-efficient deployment of pretrained vision models.
Abstract
Training a linear classifier or lightweight model on top of pretrained vision model outputs, so-called 'frozen features', leads to impressive performance on a number of downstream few-shot tasks. Currently, frozen features are not modified during training. On the other hand, when networks are trained directly on images, data augmentation is a standard recipe that improves performance with no substantial overhead. In this paper, we conduct an extensive pilot study on few-shot image classification that explores applying data augmentations in the frozen feature space, dubbed 'frozen feature augmentation (FroFA)', covering twenty augmentations in total. Our study demonstrates that adopting a deceptively simple pointwise FroFA, such as brightness, can improve few-shot performance consistently across three network architectures, three large pretraining datasets, and eight transfer datasets.
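To make the idea of a pointwise brightness FroFA concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that features have shape (tokens, channels), that min-max scaling to [0, 1] serves as the surrogate image-like space, and that brightness means adding a uniformly sampled offset followed by clipping. The function name `brightness_frofa` and the parameters `max_delta` and `per_channel` are hypothetical.

```python
import numpy as np

def brightness_frofa(features, max_delta, rng, per_channel=False):
    """Brightness-style shift on frozen features of shape (tokens, channels)."""
    features = np.asarray(features, dtype=np.float64)
    # Assumption: min-max scaling over the whole feature map stands in for
    # the surrogate image-like space mentioned in the text.
    f_min, f_max = features.min(), features.max()
    scale = max(f_max - f_min, 1e-12)  # guard against constant features
    x = (features - f_min) / scale     # values now lie in [0, 1]
    # Sample one offset shared by all channels, or one offset per channel.
    shape = (1, features.shape[1]) if per_channel else (1, 1)
    delta = rng.uniform(-max_delta, max_delta, size=shape)
    x = np.clip(x + delta, 0.0, 1.0)   # stay inside the image-like range
    return x * scale + f_min           # map back to the feature range

rng = np.random.default_rng(0)
frozen = rng.normal(size=(4, 8))       # stand-in for cached frozen features
augmented = brightness_frofa(frozen, max_delta=0.2, rng=rng, per_channel=True)
```

In such a setup, the augmented features would be fed to the linear classifier or lightweight model during training, while the pretrained backbone itself remains untouched.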
