SigCLR: Sigmoid Contrastive Learning of Visual Representations

Ömer Veysel Çağatan

TL;DR

SigCLR uses a logistic loss that operates only on pairs and does not require the global view needed by the cross-entropy loss used in SimCLR. This makes it a promising replacement for SimCLR, which has shown tremendous success in various domains.

Abstract

We propose SigCLR: Sigmoid Contrastive Learning of Visual Representations. SigCLR uses a logistic loss that operates only on pairs and does not require the global view needed by the cross-entropy loss used in SimCLR. We show that the logistic loss achieves competitive performance on CIFAR-10, CIFAR-100, and Tiny-IN compared to other established SSL objectives. Our findings confirm the importance of a learnable bias, as in SigLIP; however, SigCLR requires a fixed temperature, as in SimCLR, to excel. Overall, SigCLR is a promising replacement for SimCLR, which is ubiquitous and has shown tremendous success in various domains.

Paper Structure

This paper contains 10 sections, 2 equations, 1 figure, 4 tables, 1 algorithm.

Figures (1)

  • Figure 1: SigCLR follows a setup highly similar to SimCLR (Chen et al., 2020). We randomly sample two data augmentation operators, $t\sim \mathcal{T}$ and $t'\sim \mathcal{T}$, from the same family of augmentations used by Ozsoy et al. (2022). The operators are applied independently to each data example, yielding two correlated views. A base encoder network $f(\cdot)$ and a projection head $g(\cdot)$ are trained to maximize agreement between the representations of the augmented views by minimizing a sigmoid contrastive loss. After training, the projection head is discarded and only the encoder embeddings are used for evaluation.
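
The pairwise loss described in the abstract and in Figure 1 can be sketched as follows. This is a minimal NumPy illustration and not the paper's implementation: it assumes a SigLIP-style formulation in which the projected views `z1` and `z2` (hypothetical names for the outputs of $g(f(\cdot))$ on the two augmentations) are L2-normalized, scaled by a fixed temperature, and shifted by a scalar bias. The function name and the default values of `temperature` and `bias` are illustrative assumptions, and in actual training the bias would be a learnable parameter.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)


def sigmoid_contrastive_loss(z1, z2, temperature=10.0, bias=-10.0):
    """Pairwise logistic loss over two batches of projected views.

    z1, z2: arrays of shape (n, d); row i of each is a view of example i.
    temperature: fixed scale on the cosine similarities (assumed value).
    bias: scalar offset, learnable in practice (assumed value).
    """
    # L2-normalize so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = temperature * (z1 @ z2.T) + bias
    # +1 on the diagonal (positive pairs), -1 elsewhere (negative pairs).
    labels = 2.0 * np.eye(z1.shape[0]) - 1.0
    # Each pair contributes an independent binary term, so no softmax
    # normalization across the batch is required.
    return -np.sum(log_sigmoid(labels * logits)) / z1.shape[0]
```

Because every entry of the similarity matrix is scored on its own, the loss can be computed in chunks without first gathering all similarities, which is the practical meaning of not needing a global view. The negative bias offsets the fact that negative pairs far outnumber positive pairs in each batch.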