High-Resolution Spatial Transcriptomics from Histology Images using HisToSGE

Zhiceng Shi, Shuailin Xue, Fangfang Zhu, Wenwen Min

TL;DR

This work tackles the challenge of generating high-resolution spatial gene expression from histology by introducing HisToSGE, a model that fuses rich histology features from a Pathology Image Large Model with a learnable spatial feature encoder. The architecture comprises a two-module design: a feature extraction block that creates multimodal patch features and a feature learning block based on multi-head attention to integrate spot coordinates, followed by gene projection heads to produce expression profiles. Empirically, HisToSGE achieves state-of-the-art performance across four ST datasets, improving $PCC$ by up to ~32% and reducing $MSE$ and $MAE$ relative to strong baselines, while better preserving spatial domains and enhancing marker-gene patterns. The results highlight the practical impact of leveraging large-scale histology representations and attention-based fusion for accurate, high-resolution spatial transcriptomics analyses.
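The three metrics named above ($PCC$, $MSE$, $MAE$) can be illustrated with a minimal sketch. It assumes the standard textbook definitions, applied to one gene's measured and predicted expression across spots; the function names and the toy values are hypothetical and are not taken from the HisToSGE code base, whose exact evaluation protocol (for example, per-gene averaging) is not described here.

```python
import math


def pearson_cc(pred, true):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(true)
    if n != len(pred) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    if var_p == 0 or var_t == 0:
        # Correlation is undefined when one vector is constant.
        return float("nan")
    return cov / math.sqrt(var_p * var_t)


def mse(pred, true):
    """Mean squared error."""
    if len(pred) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def mae(pred, true):
    """Mean absolute error."""
    if len(pred) != len(true) or not true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


if __name__ == "__main__":
    # Hypothetical expression values for one gene at four spots.
    measured = [0.0, 1.0, 2.0, 3.0]
    predicted = [0.5, 1.0, 1.5, 3.5]
    print("PCC:", pearson_cc(predicted, measured))
    print("MSE:", mse(predicted, measured))
    print("MAE:", mae(predicted, measured))
```

A higher $PCC$ and lower $MSE$ and $MAE$ indicate predictions closer to the measured expression, which is the direction of improvement reported for HisToSGE.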

Abstract

Spatial transcriptomics (ST) is a groundbreaking genomic technology that enables spatial localization analysis of gene expression within tissue sections. However, it is significantly limited by high costs and sparse spatial resolution. An alternative, more cost-effective strategy is to use deep learning methods to predict high-density gene expression profiles from histological images. However, existing methods struggle to capture rich image features effectively or rely on low-dimensional positional coordinates, making it difficult to accurately predict high-resolution gene expression profiles. To address these limitations, we developed HisToSGE, a method that employs a Pathology Image Large Model (PILM) to extract rich image features from histological images and utilizes a feature learning module to robustly generate high-resolution gene expression profiles. We evaluated HisToSGE on four ST datasets, comparing its performance with five state-of-the-art baseline methods. The results demonstrate that HisToSGE excels in generating high-resolution gene expression profiles and performing downstream tasks such as spatial domain identification. All code and public datasets used in this paper are available at https://github.com/wenwenmin/HisToSGE and https://zenodo.org/records/12792163.

Paper Structure (18 sections, 12 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: The network architecture of our proposed HisToSGE method. (A) The backbone network of HisToSGE aims to predict high-resolution gene expression. During training, this network processes histological images through feature extraction and learning modules to associate spatial gene expression patterns at the original resolution. During testing, the histological images are downsampled, and the trained backbone network predicts high-resolution gene expression patterns. (B) The feature extraction module generates multimodal feature maps that include RGB, positional, and histological features. (C) The feature learning module utilizes a multi-head attention mechanism to integrate spot coordinates and learn features from the multimodal feature maps, thereby enhancing feature representation.
  • Figure 2: HisToSGE excels in generating gene expression profiles in tissues. (A) Visualization of the marker genes PCP4, FABP4, and MBP in the downsampled data and in data generated by HisToSGE, STAGE, THItoGene, HisToGene, DeepSpaCE, and STNet on slice 151676 of the DLPFC dataset. (B) Visualization of the marker genes Prkcd and Mrgn in the downsampled data and in data generated by the same methods on the Mouse Brain dataset.
  • Figure 3: (A) Visualization of the marker genes MGP and CRISP3 in the downsampled data and in data generated by HisToSGE, STAGE, THItoGene, HisToGene, DeepSpaCE, and STNet on the BC1 dataset. (B) Visualization of the marker genes AZGP1 and APOE in the downsampled data and in data generated by the same methods on the BC2 dataset.
  • Figure 4: HisToSGE enhances gene expression patterns in the DLPFC dataset. Spatial visualization of the marker genes MOBP, SCGB2A2, HBB, and PCP4 in the raw data and in the generated data.
  • Figure 5: Accuracy of spatial domains identified from raw data and from data generated by HisToSGE, STAGE, THItoGene, HisToGene, DeepSpaCE, and ST-Net on the DLPFC dataset. Identification accuracy of HisToSGE and the other methods when clustering with K-means (A), STAGATE (B), and STmask (C). (D) Spatial visualization of the spatial domains identified by HisToSGE and the other methods for slices 151673 and 151674.
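
The Figure 5 caption does not say which score is used for spatial domain accuracy. As an illustration only, the sketch below assumes the Adjusted Rand Index (ARI), a common choice for comparing predicted domains against annotated layers. The function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same spots.

    1.0 means identical partitions (up to label renaming); values near 0
    mean agreement no better than chance.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # Fewer than two spots: partitions trivially agree.

    # Contingency counts: spots sharing a (true, predicted) label pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # Degenerate case, e.g. a single cluster in both labelings.
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Hypothetical layer annotations and cluster assignments for six spots.
    annotated = ["L1", "L1", "L2", "L2", "WM", "WM"]
    clustered = [0, 0, 1, 1, 1, 2]
    print("ARI:", adjusted_rand_index(annotated, clustered))
```

Because ARI ignores the label names themselves, it can compare the output of K-means, or of any other clustering method, directly against manual annotations.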