"No negatives needed": weakly-supervised regression for interpretable tumor detection in whole-slide histopathology images

Marina D'Amato, Jeroen van der Laak, Francesco Ciompi

TL;DR

This work rethinks weakly-supervised tumor detection in whole-slide histopathology by reframing the problem as regression to predict tumor percentage rather than binary presence. It develops a MIL-based regression framework with four instance-based variants (MeanPool, ABMIL, CLAM, WeSEG), analyzes robustness to synthetic target noise, and introduces a fifth-root target amplification to improve detection of small lesions. Across five diverse datasets, the approach achieves strong correlations between predicted and true tumor percentages and competitive tumor-detection performance, with ABMIL and CLAM particularly benefiting from amplification and showing robust performance under noise. The study provides interpretable insights via instance-level logits and attention heatmaps, discusses limitations of attention in regression tasks, and offers practical implications for scalable tumor detection without requiring negative examples or precise pixel-level annotations.
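
The fifth-root target amplification mentioned above can be illustrated with a minimal sketch. It assumes that tumor percentages are first rescaled to fractions in [0, 1] and that predictions are mapped back by raising them to the fifth power; both choices, and the function names `amplify` and `deamplify`, are illustrative assumptions rather than details taken from the paper.

```python
def amplify(fraction: float, root: int = 5) -> float:
    """Map a tumor fraction in [0, 1] to its root-th root (fifth root by default).

    Assumption: targets are fractions, not percentages in [0, 100].
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return fraction ** (1.0 / root)


def deamplify(amplified: float, root: int = 5) -> float:
    """Invert amplify(): map an amplified value back to a tumor fraction.

    Assumption: the inverse is simply the root-th power.
    """
    if not 0.0 <= amplified <= 1.0:
        raise ValueError("amplified value must lie in [0, 1]")
    return amplified ** root


if __name__ == "__main__":
    # Small tumor percentages are stretched toward larger target values,
    # while 0% and 100% stay fixed.
    for pct in (0.1, 1.0, 10.0, 50.0, 100.0):
        frac = pct / 100.0
        print(f"{pct:6.1f}% -> amplified target {amplify(frac):.3f}")
```

Because the fifth root is monotonic on [0, 1], the ordering of slides by tumor content is preserved; only the spacing between low values changes, which is the property that would make small lesions contribute a larger regression target.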

Abstract

Accurate tumor detection in digital pathology whole-slide images (WSIs) is crucial for cancer diagnosis and treatment planning. Multiple Instance Learning (MIL) has emerged as a widely used approach for weakly-supervised tumor detection on large-scale data, without the need for manual annotations. However, traditional MIL methods often depend on classification tasks that require tumor-free cases as negative examples, which are challenging to obtain in real-world clinical workflows, especially for surgical resection specimens. We address this limitation by reformulating tumor detection as a regression task that estimates tumor percentages from WSIs, a target that is clinically available across multiple cancer types. In this paper, we analyze the proposed weakly-supervised regression framework by applying it to multiple organs, specimen types, and clinical scenarios. We characterize the robustness of our framework to noise in the tumor percentage used as the regression target, and introduce an amplification technique that improves tumor detection sensitivity when learning from small tumor regions. Finally, we provide interpretable insights into the model's predictions by analyzing visual attention and logit maps. Our code is available at https://github.com/DIAGNijmegen/tumor-percentage-mil-regression.
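
To make the regression formulation concrete, the following is a minimal sketch of a mean-pooling, instance-based predictor: each patch receives a logit, the logits are squashed to [0, 1], and their average is read as the slide-level tumor fraction. The sigmoid activation, the squared-error loss, and the function names are assumptions made for illustration; the linked repository should be consulted for the authors' actual implementation.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def meanpool_tumor_fraction(instance_logits: List[float]) -> Tuple[float, List[float]]:
    """Aggregate per-patch logits into a slide-level tumor fraction.

    Returns the mean of the per-patch scores together with the scores
    themselves, which can be drawn as a patch-level heatmap.
    Assumption: a sigmoid turns each logit into a score in [0, 1].
    """
    if not instance_logits:
        raise ValueError("a slide (bag) must contain at least one patch")
    scores = [sigmoid(z) for z in instance_logits]
    return sum(scores) / len(scores), scores


def squared_error(predicted: float, target: float) -> float:
    """Illustrative regression loss between predicted and reported fraction."""
    return (predicted - target) ** 2


if __name__ == "__main__":
    # Hypothetical logits for a slide with four patches, two of them tumor-like.
    logits = [-6.0, -5.0, 4.0, 7.0]
    fraction, scores = meanpool_tumor_fraction(logits)
    print(f"predicted tumor fraction: {fraction:.3f}")
    print(f"loss against a reported 50%: {squared_error(fraction, 0.5):.5f}")
```

In this setup the only supervision is the slide-level fraction, so no tumor-free slides are needed; the per-patch scores fall out of the aggregation and can be inspected directly.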

Paper Structure

This paper contains 29 sections, 10 figures, 1 table.

Figures (10)

  • Figure 1: Visual example of a procedure to extract an approximate tumor percentage area from a slide used in molecular diagnostics procedures. From a coarse annotation of the tumor area, often provided with a pen marker, we can derive the tumor percentage via simple image analysis steps: 1) segmentation of foreground tissue versus background, 2) identification of the pen marker and area filling, 3) intersection of tissue and marker area, 4) calculation of the tumor percentage.
  • Figure 2: Examples of WSIs paired with manual annotations (when available) or tumor segmentation masks, illustrating the computed tumor percentages. The first column presents the input WSIs, the second column displays either manual annotations (for CAM16) or the raw output of the segmentation algorithm, and the third column showcases the binarized maps highlighting the tumor regions.
  • Figure 3: Overview of the instance-based multiple instance learning (MIL) framework. The instance-based approach processes individual patches from the input image independently, making predictions at the instance level before aggregating them.
  • Figure 4: The first panel illustrates the fifth root transformation, with annotated points demonstrating the effect of amplification on selected values, particularly enhancing lower percentages. The other five panels display the distribution of tumor percentages across various cohorts, shown only for tumorous slides (excluding negative cases), before and after the amplification technique.
  • Figure 5: Comparison of bar plots and ROC curves across methods and datasets. (a) Pearson correlation coefficients for different methods across the five datasets, indicating the strength of linear relationships between predicted and true tumor percentages. (b) Spearman correlation coefficients across the five datasets, showcasing the rank-based correlation between predictions and ground truth. (c-e) Receiver Operating Characteristic (ROC) curves for tumor detection with AUC scores for CAM16 (c), ExaMode (d), and COBRA (e), illustrating the trade-off between true positive and false positive rates. The shaded areas represent the confidence intervals over the 5-fold cross-validation, and the mean AUC values with standard deviations are reported for each dataset. These results are based on models trained using true tumor percentages.
  • ...and 5 more figures
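
The last two steps of the procedure in Figure 1 (intersecting the tissue and marker areas, then computing the percentage) can be sketched as follows. The sketch assumes that both masks are already available as equally sized binary grids and that the percentage is the marked tissue area divided by the total tissue area; the function name and the denominator choice are illustrative assumptions, not details confirmed by the caption.

```python
from typing import Sequence


def tumor_percentage(
    tissue_mask: Sequence[Sequence[int]],
    marker_mask: Sequence[Sequence[int]],
) -> float:
    """Percentage of tissue pixels that fall inside the filled marker area.

    tissue_mask: 1 where foreground tissue was segmented, 0 for background.
    marker_mask: 1 inside the filled pen-marker region, 0 elsewhere.
    Assumption: the denominator is the tissue area, not the whole slide.
    """
    if len(tissue_mask) != len(marker_mask):
        raise ValueError("masks must have the same number of rows")

    tissue_pixels = 0
    tumor_pixels = 0
    for tissue_row, marker_row in zip(tissue_mask, marker_mask):
        if len(tissue_row) != len(marker_row):
            raise ValueError("mask rows must have the same length")
        for is_tissue, is_marked in zip(tissue_row, marker_row):
            if is_tissue:
                tissue_pixels += 1
                if is_marked:  # intersection of tissue and marker area
                    tumor_pixels += 1

    if tissue_pixels == 0:
        raise ValueError("no tissue found in the tissue mask")
    return 100.0 * tumor_pixels / tissue_pixels


if __name__ == "__main__":
    # Toy 4x4 slide: 8 tissue pixels, 2 of which lie inside the marker area.
    tissue = [
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
    ]
    marker = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(f"tumor percentage: {tumor_percentage(tissue, marker):.1f}%")
```

Marker pixels that fall on background are ignored by construction, which matches the intent of intersecting the two areas before computing the ratio.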