From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning

Jiansong Zhang, Linlin Shen, Peizhong Liu

TL;DR

The proposed method, via dimensionality reduction and reconstruction of batch data, enables formerly isolated individual data to partake in intra-batch communication through the Embedding Layer, and adaptively amplifies the self-supervised feature encoding capability as the training progresses.

Abstract

In recent years, self-supervised contrastive learning has emerged as a prominent paradigm in artificial intelligence. It enables unsupervised feature learning through instance-level contrast. However, designing an effective self-supervised paradigm remains a central challenge in this field. This paper examines two crucial factors affecting self-supervised contrastive learning, batch size and pretext tasks, and, from a data processing standpoint, proposes an adaptive batch fusion technique. The proposed method, via dimensionality reduction and reconstruction of batch data, enables formerly isolated individual data to partake in intra-batch communication through the Embedding Layer. Moreover, it adaptively amplifies the self-supervised feature encoding capability as training progresses. We conducted a linear classification test of this method on ImageNet-1k, based on the classic contrastive learning framework. The empirical findings show that our approach achieves state-of-the-art performance under equitable comparisons. Benefiting from its "plug-and-play" characteristics, we further applied it to other contrastive learning methods. On ImageNet-100, top-1 accuracy improves by up to 1.25% over the original performance. We suggest that the proposed method may contribute to the advancement of data-driven self-supervised learning research, bringing a fresh perspective to this community.

Paper Structure (22 sections, 5 equations, 2 figures, 5 tables, 1 algorithm)

This paper contains 22 sections, 5 equations, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: Overview of the proposed method. Negative samples are converted from 3-channel images to matrices by Patch Partition (blue clipped head). The batch is then loaded in channel form into the Embedding (orange clipped head) for fusion. The output low-level matrix is remapped to a 3-channel image via Patch Restore (yellow clipped head) before being loaded into the encoder on the negative-sample side.
  • Figure 2: Specific structure of Conv Embedding (CE).
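
The data flow described in the Figure 1 caption (Patch Partition, batch-wise Embedding, Patch Restore) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the patch size, and the tensor layouts are hypothetical, and the learned Conv Embedding of Figure 2 is replaced here by a plain B x B mixing matrix with a residual connection, chosen only to show how samples in a batch could exchange information when the batch axis is treated as the channel axis.

```python
import numpy as np


def patch_partition(images, patch):
    """Flatten each image into a matrix of patches.

    (B, C, H, W) -> (B, N, C * patch * patch), where N is the number of patches.
    """
    b, c, h, w = images.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be divisible by patch"
    gh, gw = h // patch, w // patch
    x = images.reshape(b, c, gh, patch, gw, patch)
    x = x.transpose(0, 2, 4, 1, 3, 5)  # (B, gh, gw, C, patch, patch)
    return x.reshape(b, gh * gw, c * patch * patch)


def batch_fusion(matrices, mix):
    """Mix information across the batch axis (assumed stand-in for Conv Embedding).

    out[i] = matrices[i] + sum_j mix[i, j] * matrices[j]
    `mix` is a hypothetical (B, B) weight matrix; in a real model it would be learned.
    """
    return matrices + np.einsum("ij,jnd->ind", mix, matrices)


def patch_restore(matrices, channels, height, width, patch):
    """Inverse of patch_partition: (B, N, C * patch * patch) -> (B, C, H, W)."""
    b = matrices.shape[0]
    gh, gw = height // patch, width // patch
    x = matrices.reshape(b, gh, gw, channels, patch, patch)
    x = x.transpose(0, 3, 1, 4, 2, 5)  # (B, C, gh, patch, gw, patch)
    return x.reshape(b, channels, height, width)


def fuse_batch(images, mix, patch):
    """Full pipeline: partition, fuse across the batch, restore to images."""
    _, c, h, w = images.shape
    fused = batch_fusion(patch_partition(images, patch), mix)
    return patch_restore(fused, c, h, w, patch)
```

With an all-zero `mix`, `fuse_batch` returns the input unchanged, because partition and restore are exact inverses and only the residual path remains. Non-zero off-diagonal entries blend other samples of the batch into each output image before it would reach the encoder.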