Overfitting In Contrastive Learning?
Zachary Rabin, Jim Davis, Benjamin Lewis, Matthew Scherreik
TL;DR
This paper investigates overfitting in unsupervised contrastive learning, focusing on the SimCLR framework. It demonstrates that, with sufficiently long training, the method can overfit, and that this overfitting is predominantly driven by the positive similarity component of the loss. Through training on a CIFAR-10 subset and tracking both training and validation trajectories, the authors show that continued optimization eventually harms generalization as positive similarity on validation data increases while negative similarity continues to decrease. The work suggests practical early-stopping strategies based on the positive similarity signal and provides insights into shaping the feature space in unsupervised contrastive learning for better generalization.
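To make the "positive similarity component" concrete, the sketch below splits a SimCLR-style NT-Xent loss into a positive term and a log-sum-exp term, so the two can be tracked separately on training and validation batches. This is only an illustration under stated assumptions: the function names (`nt_xent_terms`, `should_stop`), the pairing convention (views of the same image sit at indices 2k and 2k+1), the temperature of 0.5, and the patience-based stopping rule are all hypothetical choices, and the authors' exact decomposition and stopping criterion may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_terms(z, temperature=0.5):
    """Split the NT-Xent loss into two parts, averaged over all anchors.

    Assumption: z holds 2N embeddings, and z[2k], z[2k+1] are the two
    augmented views of the same image.

    Returns (positive_term, negative_term) with loss = positive + negative:
      positive_term = -sim(anchor, its pair) / temperature
      negative_term = log-sum-exp over all other samples in the batch
                      (as in SimCLR, this sum also includes the pair itself)
    """
    n = len(z)
    pos_total, neg_total = 0.0, 0.0
    for i in range(n):
        j = i ^ 1  # index of the paired view: 0<->1, 2<->3, ...
        sims = [cosine(z[i], z[k]) / temperature for k in range(n) if k != i]
        pos_total += -cosine(z[i], z[j]) / temperature
        m = max(sims)  # subtract the max for a numerically stable log-sum-exp
        neg_total += m + math.log(sum(math.exp(s - m) for s in sims))
    return pos_total / n, neg_total / n


def should_stop(val_pos_history, patience=3):
    """Hypothetical early-stopping rule on the validation positive term.

    Returns True when none of the last `patience` epochs improved on
    (went below) the best value seen before them.
    """
    if len(val_pos_history) <= patience:
        return False
    best_before = min(val_pos_history[:-patience])
    return min(val_pos_history[-patience:]) >= best_before
```

With this convention a lower positive term means the paired views are more similar, so a validation positive term that stops decreasing while the training one keeps falling is the kind of signal an early-stopping rule could watch.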
Abstract
Overfitting describes a machine learning phenomenon where the model fits too closely to the training data, resulting in poor generalization. While this occurrence is thoroughly documented for many forms of supervised learning, it is not well examined in the context of unsupervised learning. In this work we examine the nature of overfitting in unsupervised contrastive learning. We show that overfitting can indeed occur and identify the mechanism behind it.
