All4One: Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction

Imanol G. Estepa, Ignacio Sarasúa, Bhalaji Nagarajan, Petia Radeva

TL;DR

All4One tackles the inefficiency of neighbour-based SSL when more than one neighbour is used. It summarizes contextual information from multiple neighbours into centroid representations via a self-attention transformer and combines this with a feature redundancy reduction objective. The method unifies three objectives (Neighbour Contrast, Centroid Contrast, and Feature Contrast) into a single All4One loss, enabling efficient yet rich representation learning. Empirical results show that All4One achieves state-of-the-art linear evaluation accuracy on CIFAR-10/100 and ImageNet-100 and competitive performance on full ImageNet, and that it remains robust to low embedding dimensionality and weaker augmentations. In practice, this improves generalization in low-data and low-dimensional settings and provides a flexible framework for extending SSL to other backbones and downstream tasks.

Abstract

Nearest-neighbour-based methods have proved to be among the most successful self-supervised learning (SSL) approaches due to their high generalization capabilities. However, their computational efficiency decreases when more than one neighbour is used. In this paper, we propose a novel contrastive SSL approach, which we call All4One, that reduces the distance between neighbour representations using "centroids" created through a self-attention mechanism. We use a Centroid Contrasting objective along with single Neighbour Contrasting and Feature Contrasting objectives. Centroids help in learning contextual information from multiple neighbours, whereas the neighbour contrast enables learning representations directly from the neighbours and the feature contrast allows learning representations unique to the features. This combination enables All4One to outperform popular instance discrimination approaches by more than 1% on linear classification evaluation for popular benchmark datasets and to obtain state-of-the-art (SoTA) results. Finally, we show that All4One is robust towards embedding dimensionalities and augmentations, surpassing NNCLR and Barlow Twins by more than 5% in low-dimensionality and weak-augmentation settings. The source code will be made available soon.
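
The centroid idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract only says that centroids are created through a self-attention mechanism, so the single attention head, the identity query/key/value projections, and the mean-pooling step below are all simplifying assumptions, and the function names (`k_nearest`, `attention_centroid`) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0  # guard against the zero vector
    return [x / norm for x in v]


def k_nearest(query, support_set, k):
    """Return the k support embeddings with the highest cosine similarity to the query."""
    q = l2_normalize(query)
    ranked = sorted(support_set, key=lambda s: dot(q, l2_normalize(s)), reverse=True)
    return ranked[:k]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_centroid(neighbours):
    """Summarize k neighbours into one vector.

    ASSUMPTION: single-head scaled dot-product self-attention with identity
    projections, followed by mean-pooling. A learned transformer encoder
    would replace this in a real model.
    """
    dim = len(neighbours[0])
    scale = math.sqrt(dim)
    attended = []
    for q in neighbours:
        weights = softmax([dot(q, key) / scale for key in neighbours])
        attended.append([sum(w * v[j] for w, v in zip(weights, neighbours)) for j in range(dim)])
    return [sum(row[j] for row in attended) / len(attended) for j in range(dim)]


# Toy example: a 2-D support set and one query embedding (illustrative values).
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]
neighbours = k_nearest(query, support, k=2)
centroid = attention_centroid(neighbours)
```

Because every attention weight is non-negative and each row of weights sums to one, the centroid in this sketch always lies inside the convex hull of the selected neighbours. One centroid thus stands in for all k neighbours in a single contrastive term, which is the efficiency argument the abstract makes.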

Paper Structure (14 sections, 4 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Simplified architecture of All4One. All4One uses three different objective functions that contrast different representations: the Centroid objective contrasts the contextual information extracted from multiple neighbours, while the Neighbour objective ensures diversity (Dwibedi et al., 2021). Additionally, the Feature contrast objective measures the correlation of the generated features and increases their independence.
  • Figure 2: Neighbour contrast comparison. While common neighbour contrastive approaches only contrast the first neighbour, we create representations that contain contextual information from the k NNs and contrast them in a single objective computation.
  • Figure 3: Top-1 NN retrieval accuracy comparison.
  • Figure 4: NN extractions performed by All4One. More examples in Appendix D.
  • Figure 5: Complete architecture of All4One framework. Feature, Centroid and Neighbour contrast objective functions are indicated by red, purple, and green respectively.
  • ...and 4 more figures
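
The Figure 1 caption says the Feature contrast objective measures the correlation of the generated features and increases their independence. A minimal sketch of that idea follows, written in the style of the Barlow Twins redundancy reduction loss that the abstract compares against. It may differ from the exact form of the paper's objective. The function names and the `off_diag_weight` value are assumptions chosen for illustration.

```python
import math


def standardize_columns(batch):
    """Give each feature (column) zero mean and unit variance across the batch."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n) or 1.0  # constant column guard
        for i in range(n):
            out[i][j] = (col[i] - mean) / std
    return out


def cross_correlation(z_a, z_b):
    """d x d matrix of correlations between features of view A and features of view B."""
    n, d = len(z_a), len(z_a[0])
    a, b = standardize_columns(z_a), standardize_columns(z_b)
    return [[sum(a[i][p] * b[i][q] for i in range(n)) / n for q in range(d)] for p in range(d)]


def redundancy_reduction_loss(z_a, z_b, off_diag_weight=0.005):
    """Barlow Twins-style stand-in (ASSUMPTION, not the paper's exact loss).

    The diagonal is pulled towards 1 (matching features agree across views);
    off-diagonal entries are pulled towards 0 (different features decorrelate).
    """
    c = cross_correlation(z_a, z_b)
    d = len(c)
    on_diag = sum((1.0 - c[p][p]) ** 2 for p in range(d))
    off_diag = sum(c[p][q] ** 2 for p in range(d) for q in range(d) if p != q)
    return on_diag + off_diag_weight * off_diag


# Toy batches of four 2-D embeddings; both views are identical for simplicity.
decorrelated = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
redundant = [[1.0, 1.0], [0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
loss_decorrelated = redundancy_reduction_loss(decorrelated, decorrelated)
loss_redundant = redundancy_reduction_loss(redundant, redundant)
```

In the toy data, the two features of `decorrelated` are uncorrelated, so only the diagonal of the cross-correlation matrix is non-zero and the loss vanishes. In `redundant`, the two features are copies of each other, so the off-diagonal penalty makes the loss strictly larger.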