BiTro: Bidirectional Transfer Learning Enhances Bulk and Spatial Transcriptomics Prediction in Cancer Pathological Images

Jingkun Yu, Guangkai Shang, Changtao Li, Xun Gong, Tianrui Li, Yazhou He, Zhipeng Luo

Abstract

Cancer pathological analysis requires modeling tumor heterogeneity across multiple modalities, primarily transcriptomics and whole slide imaging (WSI), along with their spatial relations. On the one hand, bulk transcriptomics and WSIs are widely available but lack spatial mapping; on the other hand, spatial transcriptomics (ST) data offer high spatial resolution but suffer from high cost, low sequencing depth, and limited sample sizes. Each data source alone is therefore limited in how accurately it can support learning the mapping between the two modalities. To this end, we propose BiTro, a bidirectional transfer learning framework that enhances bulk and spatial transcriptomics prediction from pathological images. Our contributions are twofold. First, we design a universal and transferable model architecture that works for both bulk+WSI and ST data. A major highlight is that we model WSIs at the cellular level to better capture cells' visual features, morphological phenotypes, and spatial relations; to map cells' features to the transcriptomics measured in bulk or ST, we adopt multiple instance learning. Second, by using LoRA, our model can be efficiently transferred between bulk and ST data to exploit their complementary information. To test our framework, we conducted comprehensive experiments on five cancer datasets. The results demonstrate that 1) our base model achieves better or competitive performance compared to existing models on bulk or spatial transcriptomics prediction, and 2) transfer learning further improves the base model's performance.
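
To make the LoRA-based transfer mentioned in the abstract concrete, the following is a minimal sketch of the generic low-rank update, in which a frozen weight matrix W is adapted as W + (alpha / r) · B · A. It illustrates the general LoRA technique only; the function names, shapes, and the scaling convention are assumptions for illustration and are not taken from the BiTro implementation.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(W: Matrix, A: Matrix, B: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A, the generic LoRA-adapted weight.

    Hypothetical shapes: W is d_out x d_in (frozen), A is r x d_in and
    B is d_out x r (the only trainable parts when transferring).
    """
    r = len(A)  # rank of the low-rank update
    delta = matmul(B, A)
    scale = alpha / r
    return [
        [w + scale * d for w, d in zip(row_w, row_d)]
        for row_w, row_d in zip(W, delta)
    ]


# Toy example: a 2 x 2 frozen weight with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # r = 1, d_in = 2
B = [[1.0], [2.0]]      # d_out = 2, r = 1
W_adapted = lora_effective_weight(W, A, B, alpha=2.0)
```

With B initialised to zeros, the adapted weight equals the frozen weight, so training on the second data type starts from the behaviour learned on the first.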

Paper Structure (42 sections, 21 equations, 4 figures, 8 tables)

This paper contains 42 sections, 21 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Overview of the BiTro framework. There are four sequential phases: (a) WSI preprocessing: segment a WSI into cells with HoverNet or CellViT; (b) Cell feature extraction: extract cells' visual features via DINOv3, capture cell phenotypes by K-means clustering, and add spatial coordinates; (c) Spatial enhancement: enhance cell features with both local and global spatial relations; (d) MIL and bidirectional transfer learning: map cellular features to their bulk or spot-level transcriptomics by multiple instance learning. The learning process is enhanced by both levels of data via bidirectional transfer learning.
  • Figure 2: Model architecture. Inputs are cellular visual features $\mathbf{H}$ and their local spatial graph $\mathcal{G}$. The GAT module and the Transformer encoder enhance $\mathbf{H}$ with local and global spatial relations, respectively. The enhanced features $\mathbf{H}_{\mathrm{cell}}$ are used to predict gene expression $\hat{\mathbf{y}}$ by multiple instance pooling.
  • Figure 3: Gene expression visualization on the SPA148 sample of BRCA for gene in log1p space.
  • Figure 4: Visualization of BiTro's inference on super-resolved spatial gene expression for gene EPCAM in the COAD sample TENX139 at both spot and cellular levels in the log1p space.
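
The multiple instance pooling step described in the Figure 2 caption can be sketched as follows: each cell's enhanced feature vector is mapped to per-gene values, and the per-cell values in a bag (a whole slide for bulk data, or a spot for ST data) are aggregated into a single prediction. This is a minimal sketch under stated assumptions: the linear head, the choice of mean pooling, and all names below are hypothetical and are not confirmed by the source, which does not specify the pooling operator here.

```python
from typing import List

Matrix = List[List[float]]


def linear_head(features: Matrix, weights: Matrix, bias: List[float]) -> Matrix:
    """Map each cell's feature vector to per-gene values.

    features: n_cells x d, weights: n_genes x d, bias: n_genes.
    (Hypothetical head; stands in for whatever predictor the model uses.)
    """
    return [
        [sum(f * w for f, w in zip(cell, w_g)) + b_g
         for w_g, b_g in zip(weights, bias)]
        for cell in features
    ]


def mil_mean_pool(cell_preds: Matrix) -> List[float]:
    """Aggregate per-cell predictions into one bag-level prediction.

    Mean pooling is assumed here as one common MIL aggregator.
    """
    if not cell_preds:
        raise ValueError("a bag must contain at least one cell")
    n_cells = len(cell_preds)
    n_genes = len(cell_preds[0])
    return [sum(cell[g] for cell in cell_preds) / n_cells for g in range(n_genes)]


# Toy bag: three cells with 2-dimensional features, two genes.
H_cell = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[2.0, 0.0], [0.0, 4.0]]
bias = [0.0, 0.0]

per_cell = linear_head(H_cell, weights, bias)
y_hat = mil_mean_pool(per_cell)
```

Because the bag-level output is an aggregate of per-cell values, the same per-cell predictor can in principle be read out at cellular resolution, which is the kind of inference the Figure 4 caption refers to.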