Pre-train to Gain: Robust Learning Without Clean Labels

David Szczecina; Nicholas Pellegrino; Paul Fieguth

TL;DR

This paper addresses the challenge of training deep networks with noisy labels without relying on a clean data subset. It proposes a two-stage approach where a feature extractor is pre-trained with self-supervised learning on the noisy target dataset (using SimCLR or Barlow Twins) and then fine-tuned with supervised learning on the same noisy data. Across CIFAR-10/100 with synthetic and real-world noise, SSL pre-training improves both conventional accuracy and label-error detection (F1 and Balanced Accuracy), with larger benefits as noise increases; it also remains competitive with ImageNet pre-training at low noise and outperforms it at high noise. The findings suggest that domain-aligned SSL pre-training yields robust representations that mitigate memorization of corrupted labels and persist under extended supervised training, offering a simple, scalable method for robust learning in noisy-label settings.

Abstract

Training deep networks with noisy labels leads to poor generalization and degraded accuracy due to overfitting to label noise. Existing approaches for learning with noisy labels often rely on the availability of a clean subset of data. By pre-training a feature extractor backbone without labels using self-supervised learning (SSL), followed by standard supervised training on the noisy dataset, we can train a more noise-robust model without requiring a subset with clean labels. We evaluate SimCLR and Barlow Twins as SSL methods on CIFAR-10 and CIFAR-100 under synthetic and real-world noise. Across all noise rates, self-supervised pre-training consistently improves classification accuracy and enhances downstream label-error detection (F1 and Balanced Accuracy). The performance gap widens as the noise rate increases, demonstrating improved robustness. Notably, our approach achieves comparable results to ImageNet pre-trained models at low noise levels, while substantially outperforming them under high noise conditions.
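To make the evaluation setup concrete, the following is a minimal sketch of how synthetic label noise and the two label-error detection metrics could be computed. It is not the authors' code. It assumes symmetric (uniform) label flipping as the synthetic noise model and assumes a sample is flagged as a label error when a model's prediction disagrees with its given label; neither choice is specified in the abstract. The function names `inject_symmetric_noise` and `detection_metrics` are hypothetical.

```python
import random


def inject_symmetric_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a different, uniformly chosen class with
    probability noise_rate (assumed noise model, not from the paper)."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from the other num_classes - 1 classes, skipping y itself.
            y_new = rng.randrange(num_classes - 1)
            if y_new >= y:
                y_new += 1
            noisy.append(y_new)
        else:
            noisy.append(y)
    return noisy


def detection_metrics(is_error, flagged):
    """Return (F1, balanced accuracy) for binary label-error detection.

    is_error[i] is True if sample i truly has a wrong label;
    flagged[i] is True if the detector marked sample i as mislabeled.
    """
    tp = fp = fn = tn = 0
    for e, f in zip(is_error, flagged):
        if e and f:
            tp += 1
        elif not e and f:
            fp += 1
        elif e and not f:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    balanced_accuracy = 0.5 * (recall + specificity)
    return f1, balanced_accuracy


# Usage: 10 classes, 40% symmetric noise, and a stand-in "model" whose
# predictions equal the clean labels (purely illustrative).
clean = [i % 10 for i in range(1000)]
noisy = inject_symmetric_noise(clean, num_classes=10, noise_rate=0.4)
is_error = [c != n for c, n in zip(clean, noisy)]
predictions = clean  # hypothetical perfect classifier
flagged = [p != n for p, n in zip(predictions, noisy)]
f1, bal_acc = detection_metrics(is_error, flagged)
```

Balanced accuracy averages recall on mislabeled samples with recall on correctly labeled samples, so it stays informative when label errors are a small or large fraction of the dataset, which is why it pairs naturally with F1 across different noise rates.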
