Learning from Noisy Labels with Contrastive Co-Transformer

Yan Han; Soumava Kumar Roy; Mehrtash Harandi; Lars Petersson

TL;DR

This work addresses learning from noisy image labels by integrating a contrastive loss into a Co-Training framework with two transformer encoders (CCT). It leverages all mini-batch samples through an unsupervised contrastive objective $L_{con}$ alongside the supervised cross-entropy losses $L_{ce}^1$ and $L_{ce}^2$, giving the total losses $L_1 = L_{ce}^1 + \lambda L_{con}$ and $L_2 = L_{ce}^2 + \lambda L_{con}$ with $\lambda = 0.0001$. The approach demonstrates strong empirical performance across six datasets, including Clothing1M, often surpassing state-of-the-art noisy-label methods while using fewer parameters and less computation. This indicates that transformers can provide improved robustness to label noise when combined with contrastive learning in a Co-Training setup, with practical impact for real-world noisy-data scenarios.
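
To make the loss combination concrete, the following is a minimal pure-Python sketch of how $L_1$ and $L_2$ could be assembled for one mini-batch. Only the form $L_k = L_{ce}^k + \lambda L_{con}$ and the value $\lambda = 0.0001$ come from the text above. The contrastive term shown here is an InfoNCE-style loss between the two encoders' embeddings, which is an assumption made for illustration, as are the temperature value and all function names. The sample-selection step usually found in Co-Training methods is omitted.

```python
import math

LAMBDA = 1e-4  # weight on the contrastive term, as stated in the text


def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(z1, z2, temperature=0.5):
    """ASSUMPTION: InfoNCE-style loss, not necessarily the paper's exact form.

    z1[i] and z2[i] are the embeddings of sample i from encoders 1 and 2.
    The pair (z1[i], z2[i]) is the positive; all other z2[j] are negatives.
    No labels are used, so noisy and clean samples contribute alike.
    """
    n = len(z1)
    total = 0.0
    for i in range(n):
        sims = [cosine(z1[i], z2[j]) / temperature for j in range(n)]
        m = max(sims)
        log_z = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_z - sims[i]
    return total / n


def total_losses(logits1, logits2, labels, z1, z2, lam=LAMBDA):
    """Return (L_1, L_2) with L_k = L_ce^k + lam * L_con."""
    n = len(labels)
    l_ce1 = sum(cross_entropy(l, y) for l, y in zip(logits1, labels)) / n
    l_ce2 = sum(cross_entropy(l, y) for l, y in zip(logits2, labels)) / n
    l_con = contrastive_loss(z1, z2)  # shared by both encoders
    return l_ce1 + lam * l_con, l_ce2 + lam * l_con
```

With $\lambda$ this small, the cross-entropy terms dominate the magnitude of each total loss, and the contrastive term is the only part that does not depend on the (possibly noisy) labels.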

Abstract

Deep learning with noisy labels is an interesting challenge in weakly supervised learning. Despite their significant learning capacity, CNNs tend to overfit to samples with noisy labels. To alleviate this issue, we build on the well-known Co-Training framework. In this paper, we introduce the Contrastive Co-Transformer framework, which is simple and fast, yet improves performance by a large margin over state-of-the-art approaches. We argue that transformers are robust to label noise. Our Contrastive Co-Transformer approach can utilize all samples in the dataset, irrespective of whether they are clean or noisy. The transformers are trained with a combination of a contrastive loss and a classification loss. Extensive experimental results on corrupted data from six standard benchmark datasets, including Clothing1M, demonstrate that our Contrastive Co-Transformer is superior to existing state-of-the-art methods.
