EvoPS: Evolutionary Patch Selection for Whole Slide Image Analysis in Computational Pathology

Saya Hashemian, Azam Asilian Bidgoli

TL;DR

EvoPS tackles the scalability challenge of whole-slide image analysis by recasting patch selection as a multi-objective optimization that minimizes the number of training patches while preserving diagnostic accuracy. It employs an evolutionary algorithm to generate a Pareto front of optimal patch subsets, evaluated via a k-NN-based, weighted F1-score on a validation set and then tested on held-out data. The approach is validated across four TCGA cancer cohorts using five diverse backbones, achieving large data reductions (often >85%) with maintained or improved F1 performance, though backbone sensitivity can occur. By producing explicit trade-off curves, EvoPS enables flexible, efficient, and interpretable WSI representations suitable for rapid screening or detailed diagnostics in computational pathology.
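As a rough illustration of the two objectives described above, the sketch below scores one candidate patch subset by (a) how many training embeddings it keeps and (b) the weighted F1-score of a k-NN classifier on a validation set. This is a simplified sketch under stated assumptions, not the authors' implementation: the function names (`knn_predict`, `weighted_f1`, `evaluate`), the Euclidean distance, the default `k=3`, and the treatment of each embedding as an individually labeled point are all assumptions made for illustration. The summary does not say how patch embeddings are aggregated into slide-level predictions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = len(y_true)
    score = 0.0
    for label in set(y_true):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * (tp + fn) / total
    return score


def evaluate(mask, train_x, train_y, val_x, val_y, k=3):
    """Return (number of kept training points, validation weighted F1)."""
    kept_x = [x for x, bit in zip(train_x, mask) if bit]
    kept_y = [y for y, bit in zip(train_y, mask) if bit]
    if not kept_x:
        # An empty subset cannot classify anything.
        return 0, 0.0
    k_eff = min(k, len(kept_x))
    preds = [knn_predict(kept_x, kept_y, q, k=k_eff) for q in val_x]
    return len(kept_x), weighted_f1(val_y, preds)


# Toy data (made up for illustration): two well-separated classes.
train_x = [(0, 0), (0, 1), (5, 5), (5, 6), (0.1, 0.2), (5.1, 5.2)]
train_y = [0, 0, 1, 1, 0, 1]
val_x = [(0, 0.5), (5, 5.5)]
val_y = [0, 1]

full = evaluate([1, 1, 1, 1, 1, 1], train_x, train_y, val_x, val_y, k=3)
small = evaluate([1, 0, 1, 0, 0, 0], train_x, train_y, val_x, val_y, k=1)
print(full, small)
```

On this toy data, keeping one point per class classifies the validation set as well as keeping all six. That is the kind of trade-off the search is meant to expose.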

Abstract

In computational pathology, the gigapixel scale of Whole-Slide Images (WSIs) necessitates their division into thousands of smaller patches. Analyzing these high-dimensional patch embeddings is computationally expensive and risks diluting key diagnostic signals with many uninformative patches. Existing patch selection methods often rely on random sampling or simple clustering heuristics and typically fail to explicitly manage the crucial trade-off between the number of selected patches and the accuracy of the resulting slide representation. To address this gap, we propose EvoPS (Evolutionary Patch Selection), a novel framework that formulates patch selection as a multi-objective optimization problem. It leverages an evolutionary search to simultaneously minimize the number of selected patch embeddings and maximize the performance of a downstream similarity search task, generating a Pareto front of optimal trade-off solutions. We validated our framework across four major cancer cohorts from The Cancer Genome Atlas (TCGA) using five pretrained deep learning models to generate patch embeddings, including both supervised CNNs and large self-supervised foundation models. The results demonstrate that EvoPS can reduce the required number of training patch embeddings by over 90% while consistently maintaining or even improving the final classification F1-score compared to a baseline that uses the embeddings of all available patches produced by a standard extraction pipeline. The EvoPS framework provides a robust and principled method for creating efficient, accurate, and interpretable WSI representations, empowering users to select an optimal balance between computational cost and diagnostic performance.
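To make the notion of a Pareto front concrete, the following sketch filters a set of candidate solutions down to the non-dominated ones, where each solution is a pair (number of selected patches, F1-score), the first minimized and the second maximized. The helper names `dominates` and `pareto_front` and all numbers are hypothetical, chosen only for illustration. The evolutionary algorithm that would generate such candidates is not shown.

```python
def dominates(a, b):
    """True if solution a is at least as good as b on both objectives
    (fewer or equal patches, higher or equal F1) and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep only solutions that no other solution dominates, sorted by size."""
    front = [s for s in solutions if not any(dominates(o, s) for o in solutions)]
    return sorted(set(front))


# Made-up (n_patches, f1) pairs, not results from the paper.
candidates = [(1000, 0.90), (120, 0.91), (80, 0.88), (300, 0.89), (60, 0.80)]
front = pareto_front(candidates)
print(front)
```

Each surviving pair is a distinct trade-off: a user who wants the smallest representation would pick the first entry, while one who wants the highest score would pick the last.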

Paper Structure

This paper contains 12 sections, 1 equation, 6 figures, 3 tables, 2 algorithms.

Figures (6)

  • Figure 1: An overview of the three-stage EvoPS pipeline, consisting of (A) WSI preprocessing and feature extraction, (B) the EvoPS framework for multi-objective patch selection, and (C) the generation of a final Pareto front of optimal patch-subset solutions.
  • Figure 2: Conceptual illustration of an individual solution as a binary vector. Each segment (WSI$_1$, WSI$_2$, etc.) represents all patches from a single WSI. A '1' indicates a selected patch, while a '0' indicates a discarded patch.
  • Figure 3: A comprehensive performance comparison between the Baseline and EvoPS across all five feature backbones. The top row of plots details the number of training patches, illustrating the significant reduction achieved by EvoPS. The bottom row presents the corresponding Test $\mathrm{F}_{1}$-scores for each cohort. Each column is dedicated to a specific deep learning model, from left to right: KimiaNet, DenseNet, Phikon V2, Virchow2, and UNI2H.
  • Figure 4: Confusion matrix comparison for the UNI2H model. The performance of the baseline model is detailed in the top row, whereas the performance of the EvoPS framework is illustrated in the bottom row. The results are categorized by tissue type, with each column representing Gastro, Liver, Mesenchymal, and Pulmonary, respectively.
  • Figure 5: t-SNE visualization comparing the patch embedding distributions for the Baseline versus EvoPS frameworks. The left column displays the feature space using all available patches (Baseline), while the right column displays the feature space using only the patch subset selected by EvoPS. Each row corresponds to a different cancer cohort, from top to bottom: Gastro, Pulmonary, Liver, and Mesenchymal, respectively.
  • ...and 1 more figure
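The binary-vector encoding described in the Figure 2 caption can be sketched as follows: one flat vector of 0/1 flags covers every patch from every WSI, and contiguous segments of that vector belong to individual slides. The function names (`split_mask_by_slide`, `selected_indices`) and the tiny example sizes are assumptions for illustration only. Real slides yield thousands of patches each.

```python
def split_mask_by_slide(mask, patches_per_slide):
    """Cut one flat binary vector into per-WSI segments.

    patches_per_slide[i] is the number of patches extracted from slide i.
    """
    if sum(patches_per_slide) != len(mask):
        raise ValueError("mask length must equal the total number of patches")
    segments, start = [], 0
    for count in patches_per_slide:
        segments.append(mask[start:start + count])
        start += count
    return segments


def selected_indices(segments):
    """For each slide, list the within-slide indices of patches flagged 1."""
    return [[i for i, bit in enumerate(seg) if bit] for seg in segments]


# Hypothetical individual: WSI_1 has 3 patches, WSI_2 has 4 patches.
mask = [1, 0, 0, 1, 1, 0, 1]
segments = split_mask_by_slide(mask, [3, 4])
kept = selected_indices(segments)
reduction = 1 - sum(mask) / len(mask)
print(segments, kept, round(reduction, 3))
```

Here `sum(mask)` is the patch-count objective for this individual, and `reduction` is the fraction of patches discarded relative to using all of them.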