Leveraging Medical Foundation Model Features in Graph Neural Network-Based Retrieval of Breast Histopathology Images
Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza Mahbod
TL;DR
This work tackles breast histopathology image retrieval, where high variability in tissue and cell patterns makes accurate matching difficult, with a graph-based approach built on medical foundation-model features. It introduces an attention-based adversarially regularized variational graph autoencoder (A-ARVGAE) that learns robust image embeddings over a kNN graph, constructed with FLANN, from features extracted by medical foundation models such as UNI, BioMedCLIP, and CCL. On BreakHis and BACH, foundation-model features, especially UNI, outperform CNN baselines, with the best UNI-based model reaching average mAP/mMV scores of 96.7%/91.5% on BreakHis and 97.6%/94.2% on BACH; A-ARVGAE further improves over other GNN variants by notable margins. The proposed pipeline could assist pathologists by quickly surfacing visually or clinically similar cases, and the authors suggest future directions including transformer-based GNNs, alternative ANN libraries, and additional foundation models to further boost performance.
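To make the graph-construction step concrete, here is a minimal sketch of building an undirected kNN graph over feature vectors. It is only an illustration under stated assumptions: it uses exact brute-force cosine similarity in pure Python as a stand-in for FLANN's approximate search, the function names `cosine` and `knn_graph` are hypothetical, and the toy 2-D vectors stand in for real foundation-model features.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_graph(features, k):
    """Return a sorted list of undirected edges (i, j), i < j.

    Each node is linked to its k most similar other nodes. This is an exact
    brute-force search used for illustration; it is NOT FLANN, which performs
    approximate nearest-neighbour search.
    """
    n = len(features)
    edges = set()
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        # Highest similarity first; break ties by the lower node index.
        sims.sort(key=lambda t: (-t[0], t[1]))
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# Toy feature vectors: two loose clusters (hypothetical data).
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_edges = knn_graph(toy_features, k=1)
```

With `k=1`, each toy vector links only to its closest neighbour, so the two clusters stay disconnected. In practice the edge list would be passed to a graph autoencoder as the adjacency structure.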
Abstract
Breast cancer is the most common cancer type in women worldwide. Early detection and appropriate treatment can significantly reduce its impact. While histopathology examinations play a vital role in rapid and accurate diagnosis, they often require experienced medical experts for proper recognition and cancer grading. Automated image retrieval systems have the potential to assist pathologists in identifying cancerous tissues, thereby accelerating the diagnostic process. Nevertheless, developing an accurate image retrieval model is challenging due to the considerable variability among tissue and cell patterns in histological images. In this work, we leverage features from foundation models in a novel attention-based adversarially regularized variational graph autoencoder model for breast histological image retrieval. Our results confirm the superior performance of models trained with foundation model features compared to those using pre-trained convolutional neural networks (up to 7.7% and 15.5% for mAP and mMV, respectively), with the pre-trained general-purpose self-supervised model for computational pathology (UNI) delivering the best overall performance. Evaluated on two publicly available breast cancer histology image datasets, our top-performing model, trained with UNI features, achieved average mAP/mMV scores of 96.7%/91.5% and 97.6%/94.2% for the BreakHis and BACH datasets, respectively. Our proposed retrieval model has the potential to be used in clinical settings to enhance diagnostic performance and ultimately benefit patients.
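The two retrieval metrics quoted above can be illustrated with a short sketch. The definitions here are assumptions, since the abstract does not spell them out: average precision at k is taken as the mean of precision values at each rank where a retrieved label matches the query label, and the majority vote at k scores 1 when the most frequent label among the top k matches the query label. All function names and the toy labels are hypothetical.

```python
from collections import Counter

def average_precision_at_k(query_label, retrieved_labels, k):
    """Mean of precision@rank over the ranks (<= k) where a relevant item appears.

    Assumed definition; returns 0.0 when no relevant item is in the top k.
    """
    hits = 0
    precisions = []
    for rank, label in enumerate(retrieved_labels[:k], start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def majority_vote_at_k(query_label, retrieved_labels, k):
    """1.0 if the most frequent label among the top k equals the query label.

    Assumes k >= 1 and a non-empty result list. Ties go to the label that
    appears first in the ranking.
    """
    winner, _ = Counter(retrieved_labels[:k]).most_common(1)[0]
    return 1.0 if winner == query_label else 0.0

def mean_metrics(queries, k):
    """Average both metrics over (query_label, retrieved_labels) pairs."""
    aps = [average_precision_at_k(q, r, k) for q, r in queries]
    mvs = [majority_vote_at_k(q, r, k) for q, r in queries]
    return sum(aps) / len(aps), sum(mvs) / len(mvs)

# Toy queries with made-up labels (not data from the paper).
toy_queries = [
    ("malignant", ["malignant", "benign", "malignant", "malignant", "benign"]),
    ("benign", ["malignant", "benign", "malignant", "benign", "malignant"]),
]
toy_map, toy_mmv = mean_metrics(toy_queries, k=5)
```

For the first toy query the relevant hits fall at ranks 1, 3, and 4, giving an average precision of (1 + 2/3 + 3/4) / 3; the second query has hits at ranks 2 and 4, giving 0.5, and loses the majority vote. Choosing an odd k with two classes avoids ties in the vote.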
