Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification

Lucas Dedieu, Nicolas Nerrienet, Adrien Nivaggioli, Clara Simmat, Marceau Clavel, Arnaud Gauthier, Stéphane Sockeel, Rémy Peyret

TL;DR

This work tackles label noise in histopathology image classification by leveraging contrastive self-supervised learning to build deep embeddings from foundation models. A frozen, contrastive-pretrained backbone is used to generate embeddings on which a linear head is trained, enabling fast, dataset-specific adaptation while promoting robustness to both uniform and asymmetric label noise. Across six public datasets, contrastive embeddings consistently outperform non-contrastive embeddings and several image-based baselines, with k-NN and t-SNE analyses supporting a representation-level resilience to noisy labels. The study highlights the practical value of contrastive SSL for robust histopathology classification and provides public code to enable replication and further exploration.
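The two noise types named above can be sketched as follows. This is a minimal illustration, not the authors' protocol: the function names, the choice to always flip to a different class, and the class-0-to-class-1 confusion map are all assumptions made for the example.

```python
import random


def add_uniform_noise(labels, num_classes, noise_ratio, seed=0):
    """Reassign a fraction of labels to a different class chosen uniformly.

    Assumption: a corrupted label never keeps its original value.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_ratio * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        others = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(others)
    return noisy


def add_asymmetric_noise(labels, flip_map, noise_ratio, seed=0):
    """Within each source class of flip_map, relabel a fraction as its target class."""
    rng = random.Random(seed)
    noisy = list(labels)
    for src, dst in flip_map.items():
        idx = [i for i, y in enumerate(labels) if y == src]
        for i in rng.sample(idx, round(noise_ratio * len(idx))):
            noisy[i] = dst
    return noisy


# Toy labels for 4 classes, 250 samples each.
labels = [i % 4 for i in range(1000)]
uniform = add_uniform_noise(labels, num_classes=4, noise_ratio=0.3)
# Hypothetical confusion: 40% of class 0 is mislabeled as class 1.
asym = add_asymmetric_noise(labels, flip_map={0: 1}, noise_ratio=0.4)
print(sum(a != b for a, b in zip(labels, uniform)))  # count of corrupted labels
```

Uniform noise spreads errors evenly over all classes, whereas asymmetric noise concentrates them on specific class pairs, which is closer to how visually similar tissue types get confused in practice.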

Abstract

Recent advancements in deep learning have proven highly effective in medical image classification, notably within histopathology. However, noisy labels represent a critical challenge in histopathology image classification, where accurate annotations are vital for training robust deep learning models. Indeed, deep neural networks can easily overfit label noise, leading to severe degradation in model performance. While numerous public pathology foundation models have emerged recently, none has been evaluated for its resilience to label noise. Through thorough empirical analyses across multiple datasets, we demonstrate the label noise resilience of embeddings extracted from foundation models trained in a self-supervised contrastive manner. We show that training with such embeddings substantially enhances label noise robustness compared with both non-contrastive embeddings and commonly used noise-resilient methods. Our results unequivocally underline the superiority of contrastive learning in effectively mitigating the label noise challenge. Code is publicly available at https://github.com/LucasDedieu/NoiseResilientHistopathology.
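To make "training with such embeddings" concrete, here is a minimal sketch of a linear head (softmax regression) fitted on precomputed embeddings. It assumes the embeddings have already been extracted from a frozen backbone. The helper names, the hyperparameters, and the 2-D toy embeddings are illustrative and are not taken from the paper or its repository.

```python
import math


def _logits(W, b, x):
    """Linear scores for one embedding x."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def train_linear_head(embeddings, labels, num_classes, lr=0.5, epochs=200):
    """Full-batch gradient descent on the softmax cross-entropy loss."""
    dim, n = len(embeddings[0]), len(embeddings)
    W = [[0.0] * dim for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(num_classes)]
        gb = [0.0] * num_classes
        for x, y in zip(embeddings, labels):
            z = _logits(W, b, x)
            m = max(z)  # subtract the max for numerical stability
            exps = [math.exp(v - m) for v in z]
            total = sum(exps)
            for c in range(num_classes):
                err = exps[c] / total - (1.0 if c == y else 0.0)
                gb[c] += err
                for j in range(dim):
                    gW[c][j] += err * x[j]
        for c in range(num_classes):
            b[c] -= lr * gb[c] / n
            for j in range(dim):
                W[c][j] -= lr * gW[c][j] / n
    return W, b


def predict(W, b, x):
    """Return the class with the highest linear score."""
    z = _logits(W, b, x)
    return max(range(len(z)), key=lambda c: z[c])


# Toy "embeddings": two perfectly separated clusters, 20% of labels flipped.
emb = [[1.0, 0.0]] * 10 + [[0.0, 1.0]] * 10
noisy_labels = [0] * 8 + [1] * 2 + [1] * 8 + [0] * 2
W, b = train_linear_head(emb, noisy_labels, num_classes=2)
print(predict(W, b, [1.0, 0.0]), predict(W, b, [0.0, 1.0]))
```

In this toy setting the clusters are well separated, so the head cannot memorize individual flipped labels and instead settles on the majority label of each cluster. This matches the intuition behind the abstract's claim, although the toy example proves nothing about real histopathology embeddings.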

Paper Structure

This paper contains 19 sections, 1 equation, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Average test accuracies (4 runs) of linear classifiers trained on deep embeddings, across label noise ratios. Shaded areas represent standard deviation. iBOT and MoCo backbones (dashed curves) are pre-trained on ImageNet.
  • Figure 2: Average test accuracies (4 runs) of k-NN classifiers (k=5) fitted on 10% of each training set, across label noise ratios. Shaded areas represent standard deviation.
  • Figure 3: t-SNE representations of NCT-CRC-HE-100k deep embeddings extracted from various histopathology foundation models.
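
The k-NN evaluation behind Figure 2 can be illustrated with a minimal sketch. The Euclidean distance, the helper name `knn_predict`, and the 1-D toy clusters are assumptions made for the example. The source states only that k=5 and that 10% of each training set is used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(y for _, y in nearest[:k])
    return votes.most_common(1)[0][0]


# Two tight, well-separated clusters standing in for embeddings of two classes.
train_x = [(0.01 * i, 0.0) for i in range(10)] + [(10 + 0.01 * i, 0.0) for i in range(10)]
train_y = [0] * 10 + [1] * 10

# Corrupt 20% of the labels in each cluster.
for i in (0, 5):
    train_y[i] = 1
for i in (12, 17):
    train_y[i] = 0

print(knn_predict(train_x, train_y, (0.05, 0.0)))        # vote over 5 neighbours
print(knn_predict(train_x, train_y, (0.05, 0.0), k=1))   # single nearest neighbour
```

With k=5, the single flipped neighbour is outvoted by the four clean ones, whereas k=1 simply inherits the wrong label of the nearest point. When embeddings of the same class cluster tightly, this voting effect is one plausible reading of why k-NN accuracy degrades slowly as noise increases.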

Theorems & Definitions (1)

  • Remark 1