Morpho-Genomic Deep Learning for Ovarian Cancer Subtype and Gene Mutation Prediction from Histopathology
Gabriela Fernandes
TL;DR
Ovarian cancer's heterogeneity complicates diagnosis and treatment. This paper predicts subtypes and gene mutations directly from H&E images using a morpho-genomic pipeline that fuses a ResNet-50 encoder and a Vision Transformer with handcrafted nuclear morphometry, trained on about 45,000 patches from TCGA-OV and Kaggle. The model achieves 84.2% subtype accuracy and AUCs of 0.82 for TP53, 0.76 for BRCA1, and 0.73 for ARID1A; morphometry alone fails to predict TP53, underscoring the value of deep features. Feature importance analysis links nuclear solidity and eccentricity to TP53 and identifies other morphometric cues for BRCA1 and ARID1A, with Grad-CAM supporting interpretability. These results point to a cost-effective path for precision histopathology and molecular prescreening, enabling faster triage and better-informed sequencing decisions.
Abstract
Ovarian cancer remains one of the most lethal gynecological malignancies, largely due to late diagnosis and extensive heterogeneity across subtypes. Current diagnostic methods are limited in their ability to reveal the underlying genomic variations that are essential for precision oncology. This study introduces a hybrid deep learning pipeline that integrates quantitative nuclear morphometry with deep convolutional image features to perform ovarian cancer subtype classification and gene mutation inference directly from Hematoxylin and Eosin (H&E) histopathological images. Using $\sim 45{,}000$ image patches sourced from The Cancer Genome Atlas (TCGA) and public datasets, a fusion model combining a ResNet-50 Convolutional Neural Network (CNN) encoder and a Vision Transformer (ViT) was developed. The model captured both local morphological texture and global tissue context. The pipeline achieved an overall subtype classification accuracy of $84.2\%$ (macro AUC of $0.87 \pm 0.03$). The model also inferred gene mutation status with moderate-to-high discrimination: $\mathrm{AUC}_{\mathrm{TP53}} = 0.82 \pm 0.02$, $\mathrm{AUC}_{\mathrm{BRCA1}} = 0.76 \pm 0.04$, and $\mathrm{AUC}_{\mathrm{ARID1A}} = 0.73 \pm 0.05$. Feature importance analysis identified nuclear solidity and eccentricity as the dominant predictors of TP53 mutation. These findings indicate that quantifiable histological phenotypes encode measurable genomic signals, paving the way for cost-effective precision histopathology in ovarian cancer triage and diagnosis.
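
For readers unfamiliar with the two shape descriptors named above, the following is a minimal sketch of how nuclear solidity and eccentricity are conventionally defined for a binary nucleus mask. It is an illustration under assumptions, not the study's implementation: the function name `nuclear_shape_features`, the list-of-rows mask format, and the choice to build the convex hull over pixel corners are all hypothetical, and the abstract does not state which segmentation or morphometry tooling was used. Solidity is taken as the mask area divided by the area of its convex hull, and eccentricity as that of the ellipse sharing the mask's second central moments.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain: return the convex hull vertices of 2D points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def nuclear_shape_features(mask):
    """Return (solidity, eccentricity) for a binary mask given as rows of 0/1.

    Hypothetical helper for illustration only.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    area = len(pixels)

    # Solidity: pixel area over convex hull area. The hull is built over pixel
    # corners (an assumption) so that a filled rectangle has solidity exactly 1.
    corners = [(r + dr, c + dc) for r, c in pixels for dr in (0, 1) for dc in (0, 1)]
    solidity = area / polygon_area(convex_hull(corners))

    # Eccentricity: from the eigenvalues of the 2x2 covariance of pixel coordinates.
    mean_r = sum(r for r, _ in pixels) / area
    mean_c = sum(c for _, c in pixels) / area
    s_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / area
    s_cc = sum((c - mean_c) ** 2 for _, c in pixels) / area
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area
    half_trace = (s_rr + s_cc) / 2.0
    root = math.sqrt(((s_rr - s_cc) / 2.0) ** 2 + s_rc ** 2)
    lam_max, lam_min = half_trace + root, half_trace - root
    if lam_max <= 0:
        return solidity, 0.0  # a single pixel has no measurable elongation
    ratio = min(1.0, max(0.0, lam_min / lam_max))  # clamp floating-point noise
    return solidity, math.sqrt(1.0 - ratio)


if __name__ == "__main__":
    l_shaped = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]
    print(nuclear_shape_features(l_shaped))
```

Under these definitions a compact, convex outline gives solidity near 1 and eccentricity near 0, an elongated outline pushes eccentricity toward 1, and a concave or irregular outline lowers solidity.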
