EvoPS: Evolutionary Patch Selection for Whole Slide Image Analysis in Computational Pathology
Saya Hashemian, Azam Asilian Bidgoli
TL;DR
EvoPS tackles the scalability challenge of whole-slide image (WSI) analysis by recasting patch selection as a multi-objective optimization problem: minimize the number of training patches while preserving diagnostic accuracy. An evolutionary algorithm generates a Pareto front of optimal patch subsets, each scored by a k-NN-based weighted F1-score on a validation set and then tested on held-out data. The approach is validated on four TCGA cancer cohorts with five diverse backbones, achieving large data reductions (often >85%) with maintained or improved F1 performance, although results can be sensitive to the choice of backbone. By producing explicit trade-off curves, EvoPS enables flexible, efficient, and interpretable WSI representations suitable for rapid screening or detailed diagnostics in computational pathology.
Abstract
In computational pathology, the gigapixel scale of Whole-Slide Images (WSIs) necessitates their division into thousands of smaller patches. Analyzing the resulting high-dimensional patch embeddings is computationally expensive and risks diluting key diagnostic signals with many uninformative patches. Existing patch selection methods often rely on random sampling or simple clustering heuristics and typically fail to explicitly manage the crucial trade-off between the number of selected patches and the accuracy of the resulting slide representation. To address this gap, we propose EvoPS (Evolutionary Patch Selection), a novel framework that formulates patch selection as a multi-objective optimization problem. EvoPS uses an evolutionary search to simultaneously minimize the number of selected patch embeddings and maximize the performance of a downstream similarity search task, generating a Pareto front of optimal trade-off solutions. We validated our framework across four major cancer cohorts from The Cancer Genome Atlas (TCGA) using five pretrained deep learning models to generate patch embeddings, including both supervised CNNs and large self-supervised foundation models. The results demonstrate that EvoPS can reduce the required number of training patch embeddings by over 90% while consistently maintaining or even improving the final classification F1-score compared to a baseline that uses the embeddings of all patches produced by a standard extraction pipeline. The EvoPS framework provides a robust and principled method for creating efficient, accurate, and interpretable WSI representations, empowering users to select an optimal balance between computational cost and diagnostic performance.
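The two-objective formulation described above can be illustrated with a minimal Python sketch of Pareto dominance over candidate patch subsets. This is not the EvoPS implementation: it omits the evolutionary search entirely and only shows how a set of candidates is reduced to its non-dominated trade-offs. The binary-mask encoding, the function names, and `toy_score` (a stand-in for the k-NN-based weighted F1-score) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Mask = Tuple[int, ...]          # assumed encoding: 1 = patch kept, 0 = patch dropped
Objectives = Tuple[int, float]  # (number of selected patches, score)


def objectives(mask: Mask, score_fn: Callable[[Mask], float]) -> Objectives:
    """Return both objectives: patch count (minimize) and score (maximize)."""
    return sum(mask), score_fn(mask)


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in both objectives and better in one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(
    masks: Sequence[Mask], score_fn: Callable[[Mask], float]
) -> List[Tuple[Mask, Objectives]]:
    """Keep only non-dominated candidates, sorted by number of patches."""
    scored = [(m, objectives(m, score_fn)) for m in masks]
    front = [
        (m, obj)
        for m, obj in scored
        if not any(dominates(other, obj) for _, other in scored)
    ]
    return sorted(front, key=lambda item: item[1][0])


def toy_score(mask: Mask) -> float:
    """Hypothetical stand-in for the validation F1-score.

    Pretends that only patches 0 and 3 carry diagnostic signal.
    """
    return (mask[0] + mask[3]) / 2


candidates: List[Mask] = [
    (1, 1, 1, 1, 1, 1),  # baseline: all patches
    (1, 0, 0, 1, 0, 0),  # both informative patches only
    (1, 0, 0, 0, 0, 0),  # a single informative patch
    (0, 1, 1, 0, 0, 0),  # uninformative patches only
    (0, 0, 0, 1, 1, 0),  # one informative, one uninformative
]

front = pareto_front(candidates, toy_score)
for mask, (n_patches, score) in front:
    print(n_patches, score, mask)
```

In this toy setting the all-patches baseline is dominated by the two-patch subset, which reaches the same score with fewer patches. A user would then pick a point on the resulting front according to the available compute budget.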
