PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning

Xiaoyu Liu; Beitong Zhou; Zuogong Yue; Cheng Cheng

TL;DR

PLReMix is an end-to-end framework built on a Pseudo-Label Relaxed (PLR) contrastive loss. The loss constructs a reliable negative set for each sample by filtering out its inappropriate negative pairs, which alleviates the optimization conflicts that arise when contrastive and supervised losses are trivially combined.

Abstract

Recently, using Contrastive Representation Learning (CRL) as a pre-training technique has improved the performance of learning with noisy labels (LNL) methods. However, when the CRL loss is instead trivially combined with LNL methods in an end-to-end framework, experiments show severe performance degradation. We verify experimentally that this issue is caused by optimization conflicts between the losses and propose an end-to-end PLReMix framework that introduces a Pseudo-Label Relaxed (PLR) contrastive loss. The PLR loss constructs a reliable negative set for each sample by filtering out its inappropriate negative pairs, alleviating the conflicts that arise from trivially combining these losses. The proposed PLR loss is pluggable, and we have integrated it into other LNL methods, observing improved performance. Furthermore, a two-dimensional Gaussian Mixture Model is adopted to distinguish clean from noisy samples by leveraging semantic information and model outputs simultaneously. Experiments on multiple benchmark datasets demonstrate the effectiveness of the proposed method. Code is available at https://github.com/lxysl/PLReMix.
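
The clean/noisy split with a two-dimensional Gaussian Mixture Model can be illustrated with a minimal sketch. It assumes each sample is described by two normalized loss values and that the component with the smaller mean corresponds to clean samples. The function name `clean_probabilities`, the nearest-seed initialization, and the `ridge` term are hypothetical choices for illustration, not taken from the released code.

```python
import math

def clean_probabilities(points, iters=20, ridge=1e-4):
    """Fit a two-component 2-D GMM by EM; return P(clean | point) per point.

    `points` is a list of (loss_a, loss_b) pairs. Assumption: the component
    with the smaller mean loss models the clean samples.
    """
    # Initialization (an assumption): seed with the smallest- and largest-loss
    # points by coordinate sum, then hard-assign each point to the nearer seed.
    ordered = sorted(points, key=lambda p: p[0] + p[1])
    seeds = [ordered[0], ordered[-1]]
    resp = []
    for x, y in points:
        d = [(x - sx) ** 2 + (y - sy) ** 2 for sx, sy in seeds]
        resp.append([1.0, 0.0] if d[0] <= d[1] else [0.0, 1.0])

    n = len(points)
    params = []
    for _ in range(iters):
        # M-step: weight, mean, and full 2x2 covariance of each component.
        params = []
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            mx = sum(r[k] * p[0] for r, p in zip(resp, points)) / nk
            my = sum(r[k] * p[1] for r, p in zip(resp, points)) / nk
            sxx = sum(r[k] * (p[0] - mx) ** 2 for r, p in zip(resp, points)) / nk + ridge
            syy = sum(r[k] * (p[1] - my) ** 2 for r, p in zip(resp, points)) / nk + ridge
            sxy = sum(r[k] * (p[0] - mx) * (p[1] - my) for r, p in zip(resp, points)) / nk
            params.append((nk / n, mx, my, sxx, sxy, syy))
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x, y in points:
            dens = []
            for w, mx, my, sxx, sxy, syy in params:
                det = sxx * syy - sxy * sxy  # positive thanks to the ridge
                dx, dy = x - mx, y - my
                maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
                dens.append(w * math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det)))
            total = dens[0] + dens[1]
            resp.append([d / total for d in dens] if total > 0.0 else [0.5, 0.5])

    # The component with the smaller mean (summed over both losses) is "clean".
    clean = 0 if params[0][1] + params[0][2] <= params[1][1] + params[1][2] else 1
    return [r[clean] for r in resp]

# Toy data: four low-loss points followed by four high-loss points.
losses = [(0.05, 0.10), (0.10, 0.05), (0.12, 0.15), (0.08, 0.09),
          (0.80, 0.85), (0.90, 0.80), (0.85, 0.95), (0.75, 0.90)]
posteriors = clean_probabilities(losses)
is_clean = [p > 0.5 for p in posteriors]
```

Thresholding the posterior at 0.5 is one possible decision rule; a library mixture model could replace the hand-written EM loop, which is kept here only so the sketch has no dependencies.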

Paper Structure (15 sections, 14 equations, 5 figures, 7 tables)

Introduction
Related Work
Learning with noisy labels
Contrastive representation learning
Proposed Method
Overview
Joint sample selection
Pseudo-Label Relaxed contrastive representation learning
Semi-supervised training
Experiments
Experimental settings
Experimental results
PLR as a pluggable component
Analysis
Conclusion and Future Work

Figures (5)

Figure 1: Left: Model performance suffers when contrastive representation learning is trivially combined with supervised learning. DivideMix [li2019dividemix] (a supervised LNL method) benefits from SimCLR [chen2020simple] pre-trained weights ($\mathrm{DivideMix}^\dag$) but degrades when trivially combined with the SimCLR contrastive loss ($\mathrm{DivideMix \ w/ \ SimCLR}$). DivideMix with our proposed PLR contrastive loss ($\mathrm{DivideMix \ w/ \ PLR}$) achieves results comparable to $\mathrm{DivideMix}^\dag$. Right: Our proposed PLR contrastive loss selects a set of reliable negative pairs for each sample. A reliable negative must share no class index with the sample among their respective $\mathrm{top}_\kappa$ predicted-probability indices ($\kappa = 2$ in the figure). For instance, $x_k$ is a reliable negative pair of $x_i$ because $\mathrm{top}_2^{(i)} \cap \mathrm{top}_2^{(k)} = \emptyset$, while $x_j$ is an inappropriate negative pair of $x_i$ because $\mathrm{top}_2^{(i)} \cap \mathrm{top}_2^{(j)} \neq \emptyset$.
Figure 2: Left: Overview of the proposed method. (i) We perform joint sample selection on the first network to divide clean and noisy samples with a 2d GMM. (ii) We train the second network with the proposed Pseudo-Label Relaxed contrastive loss $\mathcal{L}^{\mathrm{PLR}}$, whose constructed reliable negative set $\mathcal{N}$ alleviates the conflict between supervised learning and CRL. (iii) We apply semi-supervised training to the second network on the clean and noisy samples divided by the 2d GMM separately. The final learning objective $\mathcal{L}$ comprises two components, namely $\mathcal{L}^{\mathrm{PLR}}$ and $\mathcal{L}^{\mathrm{SST}}$. Right: DivideMix and the LNL frameworks that build on it with CRL. (a) DivideMix [li2019dividemix] iteratively performs sample selection (1d GMM) and semi-supervised training (SST). (b) The two-stage methods [zheltonozhskii2022contrast, ghosh2021contrastive, zhang2020decoupling] perform self-supervised pretraining before LNL. (c) The three-stage method [sachdeva2023scanmix] performs self-supervised pretraining and clustering pretraining before LNL. (d) By proposing the PLR loss, our method unifies SST and CRL in an end-to-end PLReMix framework.
Figure 3: Gradient conflicts of two contrastive representation learning methods (vanilla SimCLR and our proposed PLR) when they are jointly trained with semi-supervised learning in the presence of noisy labels.
Figure 4: Visualization of the 2d GMM fitted on the normalized loss distribution $\left\{\left(\boldsymbol{l}_{cls}, \boldsymbol{l}_{proto}\right)\right\}$. The first and second rows are obtained from experiments on the CIFAR-10 and CIFAR-100, respectively.
Figure 5: Negative sample selection when training with FlatPLR. Left: Negative pairs selected ratio. Right: Correct negative pairs selected ratio.
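
The reliable-negative rule described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, assuming per-sample class probabilities are available as plain lists; the helper names `top_k_indices` and `reliable_negative_mask` are hypothetical and do not come from the released code.

```python
def top_k_indices(probs, kappa):
    """Return the set of class indices with the kappa largest probabilities."""
    order = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    return set(order[:kappa])

def reliable_negative_mask(batch_probs, kappa=2):
    """mask[i][k] is True when sample k is a reliable negative for sample i,
    i.e. their top-kappa class-index sets are disjoint (and i != k)."""
    tops = [top_k_indices(p, kappa) for p in batch_probs]
    n = len(tops)
    return [[i != k and tops[i].isdisjoint(tops[k]) for k in range(n)]
            for i in range(n)]

# Toy predictions for three samples over four classes, playing the roles of
# x_i, x_j, x_k in the caption: top-2 sets are {0, 1}, {1, 2}, {2, 3}.
batch = [
    [0.70, 0.20, 0.05, 0.05],  # x_i
    [0.10, 0.60, 0.25, 0.05],  # x_j shares class 1 with x_i
    [0.02, 0.03, 0.50, 0.45],  # x_k shares nothing with x_i
]
mask = reliable_negative_mask(batch, kappa=2)
print(mask[0])  # [False, False, True]
```

Only the entries marked True would contribute as negatives for the corresponding anchor; how they enter the contrastive loss itself is not shown here.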
