Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning
Sungmin Cha, Kyunghyun Cho, Taesup Moon
TL;DR
This work tackles forgetting in continual self-supervised learning (CSSL) by proposing Pseudo-Negative Regularization (PNR), which uses pseudo-negatives derived from both the current and past models to regularize SSL losses. For InfoNCE-based methods, PNR defines two symmetric losses that incorporate pseudo-negatives, improving both plasticity and stability; for non-contrastive methods, PNR regularizes with a pseudo-negative taken from the past model's output for a differently augmented version of the anchor sample. Across experiments on CIFAR-100, ImageNet-100, DomainNet, and ImageNet-1k, PNR consistently improves representation quality and downstream linear-probe performance, achieving state-of-the-art results in several CSSL scenarios while balancing stability and plasticity. The approach is also robust across multiple SSL backbones and tasks, and the paper includes ablations and an analysis of queue-size effects. Limitations include the current focus on CNNs and the vision domain; extending PNR to transformers and NLP settings is left to future work.
Abstract
We introduce a novel Pseudo-Negative Regularization (PNR) framework for effective continual self-supervised learning (CSSL). PNR leverages pseudo-negatives obtained through model-based augmentation so that newly learned representations do not contradict what has been learned in the past. Specifically, for InfoNCE-based contrastive learning methods, we define symmetric pseudo-negatives obtained from the current and previous models and use them in both the main and regularization loss terms. Furthermore, we extend this idea to non-contrastive learning methods, which do not inherently rely on negatives. For these methods, a pseudo-negative is defined as the output of the previous model for a differently augmented version of the anchor sample and is applied asymmetrically to the regularization term. Extensive experimental results demonstrate that our PNR framework achieves state-of-the-art performance in representation learning during CSSL by effectively balancing the trade-off between plasticity and stability.
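
To make the InfoNCE-based case more concrete, the following is a minimal sketch of how pseudo-negatives from a second model could enter both a main and a regularization loss term. It is an illustration under stated assumptions, not the authors' exact formulation: the function names (`info_nce_with_pseudo_negatives`, `pnr_style_loss`), the cosine-similarity logits, the temperature of 0.2, the unweighted sum of the two terms, and the choice of which embeddings act as positives and negatives are all hypothetical, since the abstract does not specify them.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_with_pseudo_negatives(anchor, positive, negatives,
                                   pseudo_negatives, temperature=0.2):
    """InfoNCE for a single anchor; pseudo-negatives are simply appended
    to the ordinary negatives in the denominator (assumed form)."""
    anchor = normalize(anchor)
    pos_logit = dot(anchor, normalize(positive)) / temperature
    neg_logits = [dot(anchor, normalize(n)) / temperature
                  for n in list(negatives) + list(pseudo_negatives)]
    logits = [pos_logit] + neg_logits
    # Numerically stable log-sum-exp over positive and all negatives.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - pos_logit


def pnr_style_loss(i, cur_view_a, cur_view_b, prev_view_a, temperature=0.2):
    """Hypothetical symmetric combination for sample i.

    cur_view_a / cur_view_b: current-model embeddings of two augmented views.
    prev_view_a: previous (frozen) model embeddings of the first view.
    """
    others = [j for j in range(len(cur_view_a)) if j != i]
    cur_negs = [cur_view_b[j] for j in others]
    prev_negs = [prev_view_a[j] for j in others]
    # Main term: current-model positive, previous-model pseudo-negatives.
    main = info_nce_with_pseudo_negatives(
        cur_view_a[i], cur_view_b[i], cur_negs, prev_negs, temperature)
    # Regularization term: previous-model positive, current-model pseudo-negatives.
    reg = info_nce_with_pseudo_negatives(
        cur_view_a[i], prev_view_a[i], prev_negs, cur_negs, temperature)
    return main + reg
```

In this sketch the two terms mirror each other: each takes its positive and ordinary negatives from one model and its pseudo-negatives from the other. Appending extra negatives can only increase the loss for a fixed anchor, so the anchor is pushed away from other samples' embeddings under both models.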
