Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning

Zijun Long, Lipeng Zhuang, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson

TL;DR

The paper investigates how human-labelling errors affect supervised contrastive learning (SCL) and shows that most incorrect learning signals arise from false positives, which stem from high visual similarity. It introduces SCL-RHE, a robust SCL objective that down-weights easy positives and uses sampling inspired by positive-unlabelled learning to emphasize positives that share a latent class, while also mitigating mislabelled negatives, all without extra computational overhead. Experiments covering both training from scratch and transfer learning, on benchmarks such as CIFAR and ImageNet-1K, show that SCL-RHE achieves state-of-the-art accuracy and robustness to realistic labelling noise, even when test labels are corrected. Overall, the work offers a practical and efficient approach to robust representation learning under real-world human-labelling noise, with broad applicability to vision tasks.

Abstract

Human-annotated vision datasets inevitably contain a fraction of mislabelled examples. While the detrimental effects of such mislabelling on supervised learning are well-researched, their influence on Supervised Contrastive Learning (SCL) remains largely unexplored. In this paper, we show that human-labelling errors not only differ significantly from synthetic label errors, but also pose unique challenges in SCL, distinct from those in traditional supervised learning methods. Specifically, our results indicate that ~99% of the cases in which they adversely impact the learning process arise when they occur as false positive samples. Existing noise-mitigating methods primarily focus on synthetic label errors and tackle the unrealistic setting of very high synthetic noise rates (40-80%), but they often underperform on common image datasets due to overfitting. To address this issue, we introduce SCL-RHE, a novel SCL objective that is robust to human-labelling errors. SCL-RHE is designed to mitigate the effects of real-world mislabelled examples, which typically occur at much lower noise rates (<5%). We demonstrate that SCL-RHE consistently outperforms state-of-the-art representation learning and noise-mitigating methods across various vision benchmarks by offering improved resilience against human-labelling errors.

Paper Structure

This paper contains 20 sections, 12 equations, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: Comparison of the impact of labelling errors on different learning approaches. AL denotes 'Assigned Label' and LL denotes 'Latent Label'. Entries marked red in AL are human-labelling errors. Note that a positive pair is not adversely affected as long as both samples share the same latent label. Similarly, a negative pair is unaffected if the latent labels differ.
  • Figure 2: Panels (a) and (c) show the log-scaled distribution of cosine similarities for different pair types (true positive pairs, true negative pairs, and human-labelling errors) on the CIFAR-10 and ImageNet-1k datasets, respectively. Panels (b) and (d) show the same distributions for synthetic label errors.
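
The pair categories behind the two captions above can be illustrated with a small sketch. This is not the authors' code: the function names, the toy embeddings, and the toy labels are all hypothetical, and the latent labels are assumed to be known here, which is rarely the case in practice. The sketch assigns each pair a type from its assigned labels (AL) and latent labels (LL), following the rule stated for Figure 1, and collects the cosine similarities per type, the quantity plotted in Figure 2.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_type(assigned_i, assigned_j, latent_i, latent_j):
    """Classify a pair by comparing assigned labels (AL) and latent labels (LL).

    A pair is treated as positive when the assigned labels match. It is only
    a *true* positive when the latent labels match as well, and only a *true*
    negative when the latent labels differ as well.
    """
    same_assigned = assigned_i == assigned_j
    same_latent = latent_i == latent_j
    if same_assigned:
        return "true_positive" if same_latent else "false_positive"
    return "false_negative" if same_latent else "true_negative"


def similarities_by_pair_type(embeddings, assigned, latent):
    """Group the cosine similarity of every unordered pair by its pair type."""
    groups = {
        "true_positive": [],
        "false_positive": [],
        "false_negative": [],
        "true_negative": [],
    }
    for i, j in combinations(range(len(embeddings)), 2):
        kind = pair_type(assigned[i], assigned[j], latent[i], latent[j])
        groups[kind].append(cosine_similarity(embeddings[i], embeddings[j]))
    return groups


# Hypothetical toy data: the third sample is annotated "cat", but its latent
# class is "lynx", i.e. a human-labelling error on a visually similar image.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
assigned = ["cat", "cat", "cat", "dog"]
latent = ["cat", "cat", "lynx", "dog"]

groups = similarities_by_pair_type(embeddings, assigned, latent)
for kind, sims in groups.items():
    print(kind, [round(s, 3) for s in sims])
```

In this toy setup the mislabelled sample forms false positive pairs with the two correctly labelled "cat" samples, and those pairs have high cosine similarity. This mirrors the TL;DR's observation that false positives arise from high visual similarity, although the numbers themselves are invented for illustration.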