Accurate Spatial Gene Expression Prediction by integrating Multi-resolution features

Youngmin Chung, Ji Hun Ha, Kyeong Chan Im, Joo Sang Lee

TL;DR

TRIPLEX predicts spatial gene expression from Whole Slide Images (WSIs) using features at three resolutions: the target spot, its surrounding neighborhood, and the global tissue context. Each resolution has a dedicated encoder, and a cross-attention-based fusion layer trained with a fusion loss integrates the resulting tokens, aided by the APEG positional encoding designed for irregular WSIs. On three Spatial Transcriptomics (ST) datasets and external Visium data, TRIPLEX outperforms prior models in MSE, MAE, and PCC, with notable gains on highly predictive genes and robust generalization to unseen tissue types. Its predictions align with tumor annotations, which suggests potential for improving cancer diagnostics, and its computational cost remains practical.

Abstract

Recent advancements in Spatial Transcriptomics (ST) technology have facilitated detailed gene expression analysis within tissue contexts. However, the high costs and methodological limitations of ST necessitate a more robust predictive model. In response, this paper introduces TRIPLEX, a novel deep learning framework designed to predict spatial gene expression from Whole Slide Images (WSIs). TRIPLEX uniquely harnesses multi-resolution features, capturing cellular morphology at individual spots, the local context around these spots, and the global tissue organization. By integrating these features through an effective fusion strategy, TRIPLEX achieves accurate gene expression prediction. Our comprehensive benchmark study, conducted on three public ST datasets and supplemented with Visium data from 10X Genomics, demonstrates that TRIPLEX outperforms current state-of-the-art models in Mean Squared Error (MSE), Mean Absolute Error (MAE), and Pearson Correlation Coefficient (PCC). The model's predictions align closely with ground truth gene expression profiles and tumor annotations, underscoring TRIPLEX's potential in advancing cancer diagnosis and treatment.
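
The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes one gene evaluated across a list of spots; the function names and the toy values are hypothetical, and how the paper aggregates scores across genes, samples, or folds is not stated here.

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean Absolute Error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pcc(y_true, y_pred):
    """Pearson Correlation Coefficient; NaN if either input is constant."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        return float("nan")
    return cov / math.sqrt(var_t * var_p)

# Toy values (hypothetical): expression of one gene at four spots.
truth = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]

print(mse(truth, pred), mae(truth, pred), pcc(truth, pred))
```

In this toy case the prediction is offset from the truth by a constant, so the PCC is perfect while MSE and MAE are non-zero. This is one reason to report error metrics and correlation together.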

Paper Structure (33 sections, 16 equations, 13 figures, 14 tables)

Figures (13)

  • Figure 1: Schematic representation of TRIPLEX. The global encoder processes the global view, while separate encoders handle the target spot image and the neighbor view. A fusion layer, trained with a fusion loss, integrates the resulting tokens to predict gene expression levels.
  • Figure 2: Tumor region annotations by pathologists, ground-truth GNAS expression levels, and GNAS expression levels predicted by HisToGene, Hist2ST, ST-Net, EGN, BLEEP, and TRIPLEX, for samples from the BC1 and BC2 datasets. The PCC between the ground truth and the predicted values is displayed for each model.
  • Figure 3: An example of input data for TRIPLEX from BC1 dataset. (Top) Difference between the input data used in NEM and the 25 adjacent spot images around the target spot image. The pre-defined spot image is marked with a blue boundary, while the input data for the NEM model is marked with a red boundary. The '+' within each image indicates the center coordinates. (Bottom) All input data for the same sample. The input data for TEM is marked with a blue boundary, the input data for NEM is marked with a red boundary, and the input data for GEM is marked with a green boundary. (The spot marked with the blue boundary is the target spot image.)
  • Figure 4: Overview of proposed positional encoding for histology images (APEG). We utilize the coordinates of each spot to reposition the feature token to its original location, apply convolution, and then restore it to its original shape.
  • Figure 5: Dataset summary of ST data used for cross-validation. (Left) Number of spots per sample in each dataset. The x-axis label represents each patient, with multiple samples existing for every patient. (Right) Log-transformed count values for each gene in the datasets. The 250 genes utilized in this study correspond to the top genes within the blue region.
  • ...and 8 more figures
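
The three steps in the Figure 4 caption (reposition tokens by spot coordinates, convolve, restore the original shape) can be sketched in plain Python. This is an illustrative sketch and not the authors' implementation: the function name `apeg_sketch`, the 3x3 kernel shared across channels, the zero padding, and the residual addition are all assumptions.

```python
def apeg_sketch(tokens, coords, kernel):
    """Illustrative positional encoding over irregularly placed spots.

    tokens: list of N feature vectors (each a list of D floats)
    coords: list of N (row, col) integer grid positions, one per token
    kernel: 3x3 list of floats, shared by all channels (assumption)
    """
    dim = len(tokens[0])
    height = max(r for r, _ in coords) + 1
    width = max(c for _, c in coords) + 1

    # Step 1: place each token at its spot location on a zero-filled grid.
    grid = [[[0.0] * dim for _ in range(width)] for _ in range(height)]
    for (r, c), tok in zip(coords, tokens):
        grid[r][c] = list(tok)

    # Step 2: 3x3 convolution with zero padding. Only occupied positions
    # are gathered afterwards, so the convolution is evaluated only there.
    out = []
    for (r, c), tok in zip(coords, tokens):
        conv = [0.0] * dim
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    w = kernel[dr + 1][dc + 1]
                    for d in range(dim):
                        conv[d] += w * grid[rr][cc][d]
        # Step 3: return to the original (N, D) token order, adding the
        # convolved signal to the input token (residual is an assumption).
        out.append([t + v for t, v in zip(tok, conv)])
    return out

# Toy input (hypothetical): three spots on a sparse 3x2 grid.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
coords = [(0, 0), (0, 1), (2, 1)]
kernel = [[0.0, 1.0, 0.0],
          [1.0, 0.0, 1.0],
          [0.0, 1.0, 0.0]]  # sums the four direct neighbours

encoded = apeg_sketch(tokens, coords, kernel)
print(encoded)
```

With this kernel, the two adjacent spots in the top row exchange features, while the isolated spot at (2, 1) is unchanged. Each token's update therefore depends on which neighbouring grid positions hold tissue, which is how the scheme carries spatial information even when the tissue outline is irregular.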