Rethinking Positive Pairs in Contrastive Learning
Jiantao Wu, Sara Atito, Zhenhua Feng, Shentong Mo, Josef Kittler, Muhammad Awais
TL;DR
Rethinking Positive Pairs in Contrastive Learning introduces SimLAP, a universal contrastive learning framework that learns visual representations from arbitrary class pairs by discovering pair-specific subspaces. A feature filter generates gates for each pair, and the gated (subspace) activations ensure that the contrastive loss operates only on the features the pair shares; an additional Gate Penalty encourages binary-like gates. The method applies InfoNCE within these subspaces and demonstrates strong transfer performance across six tasks, scales to ViT architectures, and resists dimensional collapse, as corroborated by embedding visualizations and Grad-CAM analyses. The approach broadens the design space of positives in contrastive learning, enabling robust, transferable representations for large, diverse label sets. It still faces challenges in interpretability and in scaling to very large label spaces, because the number of class pairs grows quadratically with the number of classes.
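To make the quadratic pair growth concrete, the short sketch below counts class pairs for a few label-set sizes. It assumes unordered pairs of distinct classes, which is our reading rather than a definition given in the summary, and `num_class_pairs` is a hypothetical helper name.

```python
from math import comb


def num_class_pairs(num_classes: int) -> int:
    """Count unordered pairs of distinct classes: C * (C - 1) / 2.

    Assumption: pairs are unordered and a class is not paired with itself.
    """
    return comb(num_classes, 2)


for c in (10, 100, 1000):
    print(c, num_class_pairs(c))
# 10 45
# 100 4950
# 1000 499500
```

Under this assumption, a 1000-class label set already yields roughly half a million pair-specific subspaces.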
Abstract
Training methods in AI do involve semantically distinct pairs of samples. However, their role is typically to enhance between-class separability. The actual notion of similarity is normally learned from semantically identical pairs. This paper presents SimLAP: a simple framework for learning visual representation from arbitrary pairs. SimLAP explores the possibility of learning similarity from semantically distinct sample pairs. The approach is motivated by the observation that, for any pair of classes, there exists a subspace in which semantically distinct samples exhibit similarity. This phenomenon can be exploited for a novel method of learning, which optimises the similarity of an arbitrary pair of samples while simultaneously learning the enabling subspace. The feasibility of the approach is demonstrated experimentally and its merits are discussed.
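The idea of optimising pair similarity inside a learned subspace can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The filter (a single linear map on the element-wise product of the pair), the penalty form g(1 - g), and all names (`gated_info_nce`, `filter_w`, `tau`, `penalty_weight`) are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def l2_normalize(x, eps=1e-8):
    # Normalize along the last (feature) axis.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def gated_info_nce(anchors, positives, filter_w, tau=0.2, penalty_weight=0.1):
    """InfoNCE computed in a pair-specific gated subspace (illustrative sketch).

    anchors, positives: (N, D) embeddings; row i of each forms a positive pair.
    filter_w: (D, D) weights of a hypothetical linear feature filter.
    """
    # Assumed filter: gate each feature from the pair's element-wise product.
    gates = sigmoid((anchors * positives) @ filter_w)            # (N, D)
    # Project the anchor into its pair-specific subspace.
    a = l2_normalize(gates * anchors)                            # (N, D)
    # Project every candidate into each anchor's subspace.
    c = l2_normalize(gates[:, None, :] * positives[None, :, :])  # (N, N, D)
    logits = np.einsum("nd,nkd->nk", a, c) / tau                 # (N, N)
    # Numerically stable log-softmax over candidates.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nce = -np.mean(np.diag(log_prob))
    # Assumed penalty: g * (1 - g) is zero only when a gate is exactly 0 or 1.
    penalty = np.mean(gates * (1.0 - gates))
    return nce + penalty_weight * penalty, gates


rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))
z_p = rng.normal(size=(8, 16))
w = rng.normal(size=(16, 16)) * 0.1
loss, gates = gated_info_nce(z_a, z_p, w)
print(float(loss), gates.shape)
```

In this sketch the gate is shared by the anchor and all of its candidates, so the contrastive comparison happens entirely within the subspace selected for that pair. A real model would learn `filter_w` jointly with the encoder by gradient descent.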
