Cross-Domain Validation of a Resection-Trained Self-Supervised Model on Multicentre Mesothelioma Biopsies

Farzaneh Seyedshahi, Francesca Damiola, Sylvie Lantuejoul, Ke Yuan, John Le Quesne

TL;DR

This study tests whether a self-supervised encoder trained on resection mesothelioma tissue can be transferred to small biopsy slides collected across multiple French centers. By deriving biopsy-specific histomorphological phenotype clusters via Leiden clustering and representing slides through compositional CLR features, the approach enables downstream tasks of epithelioid vs non-epithelioid classification and patient survival prediction using Cox modeling. Despite domain shifts in tissue type and staining, the method identifies robust biopsy-relevant phenotypes and achieves strong subtype classification (AUC ≈ 0.92) with reasonable survival prognostication (c-index ≈ 0.60–0.64), demonstrating practical clinical potential for AI-assisted mesothelioma diagnosis and prognosis on routine biopsies. The work provides a reproducible pipeline and supports cross-institutional validation, while acknowledging limitations like subtype rarity and the absence of multimodal data, suggesting future multimodal integration and broader prospective validation.
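The compositional CLR (centred log-ratio) representation mentioned above can be illustrated with a minimal sketch. It assumes each tile of a slide has already been assigned an integer cluster label, and it uses a pseudocount of 1 to avoid taking the log of zero for clusters absent from a slide. The function name `clr_features` and the pseudocount value are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def clr_features(tile_clusters, n_clusters, pseudocount=1.0):
    """Centred log-ratio (CLR) features for one slide.

    tile_clusters: iterable of integer cluster labels, one per tile.
    n_clusters:    total number of clusters (e.g. 53 in one fold).
    pseudocount:   assumed smoothing constant; the paper's choice is not stated here.
    """
    counts = Counter(tile_clusters)
    # Smooth the counts so clusters missing from this slide do not give log(0).
    smoothed = [counts.get(k, 0) + pseudocount for k in range(n_clusters)]
    total = sum(smoothed)
    # Log of each cluster's proportion of the slide.
    logs = [math.log(c / total) for c in smoothed]
    # Subtracting the mean log-proportion equals dividing by the geometric mean.
    mean_log = sum(logs) / n_clusters
    return [value - mean_log for value in logs]


# Toy slide with six tiles spread over four hypothetical clusters.
features = clr_features([0, 0, 1, 2, 2, 2], n_clusters=4)
```

CLR features sum to zero by construction, and each one expresses a cluster's abundance relative to the slide's geometric-mean abundance. Such a vector can then be passed to a linear classifier or a Cox model.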

Abstract

Accurate subtype classification and outcome prediction in mesothelioma are essential for guiding therapy and patient care. Most computational pathology models are trained on large tissue images from resection specimens, limiting their use in real-world settings where small biopsies are common. We show that a self-supervised encoder trained on resection tissue can be applied to biopsy material, capturing meaningful morphological patterns. Using these patterns, the model can predict patient survival and classify tumor subtypes. This approach demonstrates the potential of AI-driven tools to support diagnosis and treatment planning in mesothelioma.

Paper Structure

This paper contains 16 sections, 9 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: (a) Overview of the analysis pipeline. From a dataset of 538,026 tiles, features were extracted using a frozen, pretrained encoder to obtain embeddings. A random subsample of 250,000 embeddings was clustered using the Leiden algorithm, yielding 53 clusters. (b) Each cluster exhibits a distinct and internally consistent morphological pattern, illustrated by representative tile sets. The accompanying bar plot shows the number of tiles assigned to each cluster within one fold of the cross-validation scheme.
  • Figure 2: Downstream WSI/Patient-level Tasks Performance. (a) Survival analysis results across the full cohort. The Cox proportional hazards model identifies several cluster-derived features with significant prognostic value ($p < 0.01$). Four representative tiles are shown for both high-risk and low-risk features to illustrate their underlying morphology. (b) Performance of the epithelioid vs. non-epithelioid classifier. The model achieves robust 5-fold cross-validated ROC–AUC, F1, recall, and precision on both training and test sets. Representative tiles from the clusters with the most negative and positive odds ratio and statistically significant contributions ($p < 0.05$) are also displayed.
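
The survival results are summarised elsewhere in this summary by a concordance index (c-index). As a reading aid, the following is a minimal sketch of Harrell's c-index for right-censored data. It is a generic textbook formulation, not the authors' evaluation code. The function name and the toy values are hypothetical, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored survival data.

    times:  observed follow-up time per patient.
    events: 1 if death was observed, 0 if the patient was censored.
    risks:  predicted risk score per patient (higher means worse prognosis).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot serve as the earlier-event member of a pair.
            continue
        for j in range(n):
            # A pair is comparable when patient i died before patient j's observed time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # Higher risk for the earlier death.
                elif risks[i] == risks[j]:
                    concordant += 0.5  # Tied predictions count as half.
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (made-up values): four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives context for a reported c-index of roughly 0.60 to 0.64.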