Contrastive Learning with Negative Sampling Correction
Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
TL;DR
The paper tackles negative sampling bias in contrastive learning by recasting negatives as unlabeled data within a Positive-Unlabeled (PU) learning framework. It derives a debiased contrastive loss, DeCL, by expressing the negative distribution as a mixture of unlabeled and positive components under the single-training-set and Selected Completely At Random (SCAR) assumptions, and shows that the gap to the ideal unbiased loss vanishes as the negative sample size $N$ grows: $|\mathcal{L}_{IdealCL}-\mathcal{L}_{DeCL}| \le \tfrac{1}{2\sqrt{N}}(e^{2}-1)$. Empirically, PUCL consistently improves on the SimCLR, CMC, MoCo, and InfoGraph baselines on image and graph representation tasks, with substantial gains in several settings and robustness to the hyperparameters α and c. The approach provides a principled, broadly applicable correction for negative sampling bias that can be integrated with existing CL frameworks to improve downstream classification performance across data modalities.
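To get a feel for how quickly the bound above shrinks, the short sketch below evaluates its right-hand side, $\tfrac{1}{2\sqrt{N}}(e^{2}-1)$, for a few negative sample sizes. The function name `decl_gap_bound` and the chosen values of $N$ are illustrative assumptions, not taken from the paper.

```python
import math


def decl_gap_bound(num_negatives: int) -> float:
    """Right-hand side of the quoted bound: (e^2 - 1) / (2 * sqrt(N)).

    `num_negatives` plays the role of N; the function name is hypothetical.
    """
    if num_negatives <= 0:
        raise ValueError("num_negatives must be a positive integer")
    return (math.e ** 2 - 1) / (2 * math.sqrt(num_negatives))


# Illustrative sample sizes only; the paper's experimental N may differ.
for n in (16, 256, 4096, 65536):
    print(f"N={n:>6}  bound={decl_gap_bound(n):.4f}")
```

Because the bound scales as $1/\sqrt{N}$, quadrupling the number of negatives halves it.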
Abstract
As one of the most effective self-supervised representation learning methods, contrastive learning (CL) relies on multiple negative pairs to contrast against each positive pair. In standard practice, data augmentation is used to generate both positive and negative pairs. While existing work has focused on improving positive sampling, the negative sampling process is often overlooked. In fact, the generated negative samples are often polluted by positive samples, which leads to a biased loss and degraded performance. To correct this negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL). PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct the bias in the contrastive loss. We prove that the corrected loss used in PUCL incurs only a negligible bias compared to the unbiased contrastive loss. PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks. The code of PUCL is provided in the supplementary file.
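The idea of treating negatives as unlabeled data can be illustrated with a generic mixture identity. If the unlabeled distribution is assumed to be $p_u = \pi\, p^{+} + (1-\pi)\, p^{-}$, then any expectation under the true negatives satisfies $\mathbb{E}_{-}[f] = (\mathbb{E}_{u}[f] - \pi\,\mathbb{E}_{+}[f])/(1-\pi)$. The sketch below checks this on toy scores. It is not the PUCL loss itself; the mixture weight `prior` (standing in for $\pi$), the function name, and the toy values are all assumptions made for illustration.

```python
def corrected_negative_mean(unlabeled_scores, positive_scores, prior):
    """Estimate the mean score over true negatives from unlabeled and positive scores.

    Assumes unlabeled = prior * positive + (1 - prior) * negative, where
    `prior` is a hypothetical mixture weight with 0 <= prior < 1.
    """
    if not 0.0 <= prior < 1.0:
        raise ValueError("prior must lie in [0, 1)")
    mean_unlabeled = sum(unlabeled_scores) / len(unlabeled_scores)
    mean_positive = sum(positive_scores) / len(positive_scores)
    return (mean_unlabeled - prior * mean_positive) / (1.0 - prior)


# Toy scores: hidden positives score 2.0, true negatives score 0.5.
unlabeled = [2.0] * 25 + [0.5] * 75  # 25% of the "negatives" are really positives
positives = [2.0] * 10

naive = sum(unlabeled) / len(unlabeled)  # biased upward by the hidden positives
corrected = corrected_negative_mean(unlabeled, positives, prior=0.25)
print(f"naive={naive:.3f}  corrected={corrected:.3f}")
```

In this toy setup the naive average over unlabeled samples overstates the negative score, while the corrected estimate recovers the true-negative value when `prior` matches the actual contamination rate.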
