Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping
Jack Breen, Katie Allen, Kieran Zucker, Nicolas M. Orsi, Nishant Ravikumar
TL;DR
The study tackles the challenge of ovarian cancer subtyping by moving beyond single-resolution patch analysis to a multi-resolution patch graph approach that captures cross-magnification spatial context. It employs a graph network with GATv2 layers and SAGPool pooling, using features from the UNI histopathology foundation model, and evaluates seven model variants using five-fold cross-validation, a hold-out test set, and an external validation set from the Transcanadian Study. The best externally validated model (10x+20x magnifications with UNI features) achieved a balanced accuracy of 99.0% and AUROC of 1.000, marking the highest reported performance for this task, though performance gains were not uniform across all validation schemes and depend on the feature encoder. While promising for clinical translation as a rapid second opinion, the work emphasizes the need for broader validation, uncertainty estimation, and attention to practicality given the data size and computational demands.
Abstract
Computer vision models are increasingly capable of classifying ovarian epithelial cancer subtypes, but they differ from pathologists by processing small tissue patches at a single resolution. Multi-resolution graph models leverage the spatial relationships of patches at multiple magnifications, learning the context for each patch. In this study, we conduct the most thorough validation of a graph model for ovarian cancer subtyping to date. Seven models were tuned and trained using five-fold cross-validation on a set of 1864 whole slide images (WSIs) from 434 patients treated at Leeds Teaching Hospitals NHS Trust. The cross-validation models were ensembled and evaluated using a balanced hold-out test set of 100 WSIs from 30 patients, and an external validation set of 80 WSIs from 80 patients in the Transcanadian Study. The best-performing model, a graph model using 10x+20x magnification data, gave balanced accuracies of 73%, 88%, and 99% in cross-validation, hold-out testing, and external validation, respectively. However, this only exceeded the performance of attention-based multiple instance learning in external validation, where the latter achieved a 93% balanced accuracy. Graph models benefitted greatly from using the UNI foundation model rather than an ImageNet-pretrained ResNet50 for feature extraction, and this choice had a much greater effect on performance than changing the subsequent classification approach. The accuracy of the combined foundation model and multi-resolution graph network offers a step towards the clinical applicability of these models, with a new highest-reported performance for this task, though further validation is still required to ensure the robustness and usability of the models.
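To make the evaluation terms above concrete, the following is a minimal sketch of how predictions from five cross-validation models might be ensembled and scored with balanced accuracy (the mean of per-class recall). The abstract does not state how the ensemble was formed, so averaging class probabilities across folds is an assumption here, and the function names `ensemble_predict` and `balanced_accuracy` are hypothetical, as is the toy data.

```python
from collections import defaultdict


def ensemble_predict(fold_probs):
    """Average class probabilities over folds and return one label per slide.

    fold_probs[f][s][c] is the probability that fold model f assigns to
    class c for slide s. Mean-probability ensembling is an assumption,
    not necessarily the scheme used in the study.
    """
    n_folds = len(fold_probs)
    n_slides = len(fold_probs[0])
    preds = []
    for s in range(n_slides):
        n_classes = len(fold_probs[0][s])
        mean = [
            sum(fold_probs[f][s][c] for f in range(n_folds)) / n_folds
            for c in range(n_classes)
        ]
        # Predicted class is the index of the largest averaged probability.
        preds.append(max(range(n_classes), key=mean.__getitem__))
    return preds


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each subtype counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example (made-up numbers): two fold models, two slides, two classes.
toy_fold_probs = [
    [[0.6, 0.4], [0.2, 0.8]],
    [[0.3, 0.7], [0.4, 0.6]],
]
toy_preds = ensemble_predict(toy_fold_probs)

# Three slides of class 0 and one of class 1, with one class-0 slide missed.
toy_score = balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1])
```

Balanced accuracy differs from plain accuracy whenever classes are imbalanced: in the toy case, plain accuracy is 0.75, while balanced accuracy is (2/3 + 1)/2, roughly 0.83, because the single class-1 slide carries as much weight as the three class-0 slides together.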
