Investigating Self-Supervised Methods for Label-Efficient Learning

Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais

TL;DR

This work conducts a systematic examination of different self-supervised pretext tasks, namely contrastive learning, clustering, and masked image modelling, to assess their low-shot capabilities by comparing different pretrained models, and introduces a framework that combines masked image modelling and clustering as pretext tasks.

Abstract

Vision transformers combined with self-supervised learning have enabled the development of models which scale across large datasets for several downstream tasks like classification, segmentation and detection. The low-shot learning capability of these models, across several low-shot downstream tasks, has been largely underexplored. We perform a system-level study of different self-supervised pretext tasks, namely contrastive learning, clustering, and masked image modelling, for their low-shot capabilities by comparing the pretrained models. In addition, we study the effects of collapse avoidance methods, namely centring, ME-MAX, and Sinkhorn, on these downstream tasks. Based on our detailed analysis, we introduce a framework involving both masked image modelling and clustering as pretext tasks, which performs better across all low-shot downstream tasks, including multi-class classification, multi-label classification and semantic segmentation. Furthermore, when testing the model on full-scale datasets, we show performance gains in multi-class classification, multi-label classification and semantic segmentation.
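
The abstract names three collapse-avoidance mechanisms (centring, ME-MAX, and Sinkhorn) without detailing them. Below is a minimal PyTorch sketch of how these operations are commonly implemented in the self-supervised literature (centring as in DINO, Sinkhorn-Knopp as in SwAV); the function names, temperature, and momentum defaults are illustrative assumptions, not the paper's code.

```python
import torch

def centring(teacher_logits, center, momentum=0.9):
    """Subtract a running mean of the teacher outputs (as in DINO)
    so that no single cluster dominates the assignments."""
    centred = teacher_logits - center
    # Update the running center from the current batch statistics.
    new_center = momentum * center + (1 - momentum) * teacher_logits.mean(dim=0)
    return centred, new_center

def me_max_regulariser(probs, eps=1e-6):
    """ME-MAX: encourage use of all clusters by maximising the entropy
    of the mean assignment distribution. Returned as a loss to minimise
    (negative entropy of the batch-mean probabilities)."""
    mean_probs = probs.mean(dim=0)                 # (num_clusters,)
    return (mean_probs * (mean_probs + eps).log()).sum()

def sinkhorn(logits, n_iters=3, temperature=0.05):
    """Sinkhorn-Knopp normalisation (as in SwAV): alternately normalise
    rows and columns so assignments are softly balanced across clusters."""
    Q = torch.exp(logits / temperature).t()        # (clusters, batch)
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(dim=1, keepdim=True); Q /= K    # balance clusters
        Q /= Q.sum(dim=0, keepdim=True); Q /= B    # balance samples
    return (Q * B).t()                             # (batch, clusters), rows sum to 1
```

In a teacher-student setup such as the one studied here, any of these would typically be applied to the teacher's cluster logits before they are used as targets for the student.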

Paper Structure

This paper contains 11 sections, 5 equations, 2 figures, and 8 tables.

Figures (2)

  • Figure 1: Accuracy of different methods on the low-shot classification task for 1, 2, and 5 images per class, and for 1% of ImageNet-1K.
  • Figure 2: The MaskCluster architecture for low-shot learning generates multiple masked global views. A teacher encoder, employing transformer layers, produces embeddings $\mathbf{Z}^{e, G}_{c}$ (class tokens) and $\mathbf{Z}^{e, G}_{p}$ (patch tokens) from unmasked global views. Teacher clustering layers $CL_t$ and $PL_t$ assign clusters $\mathbf{P}^{G}_{c}$ and $\mathbf{P}^{G}_{p}$ based on these embeddings. The student encoder similarly processes masked global crops to produce embeddings $\mathbf{\bar{Z}}^{e, G}_{c}$ and $\mathbf{\bar{Z}}^{e, G}_{p}$, which are clustered by student layers $CL_s$ and $PL_s$ to generate assignments $\mathbf{\bar{P}}^{G}_{c}$ and $\mathbf{\bar{P}}^{G}_{p}$. Additionally, $\mathbf{\bar{Z}}^{e, G}_{p}$ is reconstructed to the pixel space using the student's reconstruction head. The losses are a cross-entropy between the teacher and student cluster assignments, and an $\ell_1$ loss between the reconstruction and the original view.
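
For concreteness, here is a rough PyTorch sketch of one training step as implied by the Figure 2 caption: the teacher clusters unmasked global views, the student matches those assignments from masked views, and a reconstruction head regresses pixels under an $\ell_1$ loss. Every interface below (`encode`, `cl_head`, `pl_head`, `reconstruction_head`, `lambda_rec`) is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def training_step(teacher, student, views, masks, lambda_rec=1.0):
    """One MaskCluster-style step over multiple masked global views.
    `views` are the unmasked global crops; `masks` are the corresponding
    patch masks applied on the student side. All module names are assumed."""
    loss_cluster, loss_rec = 0.0, 0.0
    for view, mask in zip(views, masks):
        with torch.no_grad():
            # Teacher sees the unmasked global view.
            z_c, z_p = teacher.encode(view)             # class / patch tokens
            p_c = teacher.cl_head(z_c).softmax(dim=-1)  # assignments P^G_c
            p_p = teacher.pl_head(z_p).softmax(dim=-1)  # assignments P^G_p

        # Student sees the masked version of the same view.
        zs_c, zs_p = student.encode(view, mask=mask)
        logp_c = student.cl_head(zs_c).log_softmax(dim=-1)
        logp_p = student.pl_head(zs_p).log_softmax(dim=-1)

        # Cross-entropy between teacher and student cluster assignments,
        # for both class tokens and patch tokens.
        loss_cluster += -(p_c * logp_c).sum(-1).mean()
        loss_cluster += -(p_p * logp_p).sum(-1).mean()

        # l1 reconstruction of the original view from the masked patch tokens.
        recon = student.reconstruction_head(zs_p)       # back to pixel space
        loss_rec += F.l1_loss(recon, view)

    return loss_cluster + lambda_rec * loss_rec
```

In practice the teacher would be an exponential moving average of the student, and a collapse-avoidance step (centring, ME-MAX, or Sinkhorn, as sketched earlier) would be applied to the teacher assignments before computing the cross-entropy.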