Hard Negative Sample Mining for Whole Slide Image Classification

Wentao Huang, Xiaoling Hu, Shahira Abousamra, Prateek Prasanna, Chao Chen

TL;DR

The paper tackles weakly supervised whole slide image (WSI) classification where only slide-level labels are available, and MIL-based approaches must infer instance-level information. It introduces hard negative mining during fine-tuning via supervised contrastive learning and a patch-wise multiple instance ranking loss defined as $\mathcal{L}_{MIRank} = \max(0, 1 - \frac{1}{K} \sum_{top_K} \hat{s}_{i}^p + \frac{1}{K} \sum_{top_K} \hat{s}_{i}^n)$ to separately optimize top positive and hard negative patches, integrated in an iterative MIL training loop. Experiments on Camelyon16 and TCGA-LUAD demonstrate state-of-the-art performance and notable reductions in training time when using a small fraction of hard negatives (e.g., 5%). The combined approach improves both instance-level and slide-level predictions, offering practical benefits for MIL-based WSI analysis.
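
The ranking loss above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `mi_rank_loss`, the `margin` parameter, and the reading that $\hat{s}^p$ and $\hat{s}^n$ are patch scores from a positive and a negative slide respectively are assumptions made for this example.

```python
def mi_rank_loss(pos_scores, neg_scores, k, margin=1.0):
    """Hinge-style ranking loss between two bags of patch scores.

    pos_scores: patch scores from a positive slide (assumed interpretation).
    neg_scores: patch scores from a negative slide (assumed interpretation).
    k:          number of top-scoring patches averaged in each bag.
    margin:     hypothetical parameter; the formula in the text uses 1.
    """
    if k < 1 or k > min(len(pos_scores), len(neg_scores)):
        raise ValueError("k must be between 1 and the size of the smaller bag")

    # Mean of the K highest scores in each bag.
    top_pos = sorted(pos_scores, reverse=True)[:k]
    top_neg = sorted(neg_scores, reverse=True)[:k]
    mean_pos = sum(top_pos) / k
    mean_neg = sum(top_neg) / k

    # max(0, margin - mean top-K positive + mean top-K negative)
    return max(0.0, margin - mean_pos + mean_neg)


# Example: the top-2 positive mean is 0.85 and the top-2 negative mean is 0.25,
# so the loss is 1 - 0.85 + 0.25, approximately 0.4.
loss = mi_rank_loss([0.9, 0.8, 0.1], [0.3, 0.2, 0.05], k=2)
```

The loss reaches zero only once the top positive patches outscore the hardest negative patches by at least the margin, which is what pushes the two groups apart.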

Abstract

Weakly supervised whole slide image (WSI) classification is challenging due to the lack of patch-level labels and high computational costs. State-of-the-art methods use self-supervised patch-wise feature representations for multiple instance learning (MIL). Recently, methods have been proposed to fine-tune the feature representation on the downstream task using pseudo labeling, but they mostly focus on selecting high-quality positive patches. In this paper, we propose to mine hard negative samples during fine-tuning. This allows us to obtain better feature representations and reduce the training cost. Furthermore, we propose a novel patch-wise ranking loss in MIL to better exploit these hard negative samples. Experiments on two public datasets demonstrate the efficacy of the proposed ideas. Our code is available at https://github.com/winston52/HNM-WSI

Paper Structure (5 sections, 5 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Feature representation tuning. Previous methods (Liu et al., 2023; Qu et al., 2023) perform contrastive learning between top-ranked positive patches and all negative patches. Our method (highlighted in red) selects hard negatives for supervised contrastive learning.
  • Figure 2: Overview of our hard negative sample mining framework: WSIs are cut into patches. The encoder generates instance-level features which are aggregated into bag-level features and a pseudo label is assigned to each instance. The multiple instance ranking loss is employed to enhance the accuracy of the pseudo labels. Finally, negative and positive patches are selected based on enhanced pseudo labels to fine-tune the encoder and the process is repeated iteratively.
  • Figure 3: Visualization of instance prediction probabilities on the Camelyon16 dataset. Patches with probabilities below 0.3 are rendered transparent.
  • Figure 4: Magnified version of Figure 3. The first column displays the instance ground truth label. The second, third, and fourth columns visualize the instance prediction probabilities generated by DSMIL, ItS2CLR, and our method, respectively. Patches with probabilities below the 0.3 threshold are rendered transparent.
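
The negative-patch selection step described in Figure 2 can be illustrated with a short sketch. It assumes that a "hard negative" is a patch from a negative slide that receives one of the highest predicted scores. That definition, the function name `select_hard_negatives`, and the rounding rule are assumptions for illustration and are not taken from the paper's code.

```python
import math


def select_hard_negatives(neg_scores, fraction=0.05):
    """Return indices of the highest-scoring negative patches.

    neg_scores: predicted scores for patches from negative slides.
    fraction:   share of negative patches to keep (e.g. 0.05 for 5%).
    Assumption: a higher score on a negative patch means a harder negative.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not neg_scores:
        return []

    # Keep at least one patch; round the count up (an illustrative choice).
    n_keep = max(1, math.ceil(fraction * len(neg_scores)))

    # Rank patch indices by score, highest first.
    ranked = sorted(range(len(neg_scores)),
                    key=lambda i: neg_scores[i], reverse=True)
    return ranked[:n_keep]


# Example: keep the top half of four negative patches.
hard_idx = select_hard_negatives([0.1, 0.7, 0.4, 0.9], fraction=0.5)
```

Fine-tuning the encoder on only this subset rather than on all negative patches is what would reduce the training cost mentioned in the TL;DR.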