HyperST: Hierarchical Hyperbolic Learning for Spatial Transcriptomics Prediction

Chen Zhang, Yilu An, Ying Chen, Hao Li, Xitong Ling, Lihao Liu, Junjun He, Yuxiang Lin, Zihui Wang, Rongshan Yu

TL;DR

HyperST tackles the challenge of predicting spatially resolved gene expression from histology by modeling intrinsic hierarchical structure with hyperbolic geometry. It introduces Multi-Level Representation Extractors and Hierarchical Hyperbolic Alignment (HCA and HEA) to fuse spot- and niche-level image and gene features in the Lorentz model, guiding representations with a hierarchical entailment prior. The approach achieves state-of-the-art performance across four tissues, validated by extensive ablations and a zero-shot MSI-status downstream task, and demonstrates improved localization of key biomarkers. Overall, HyperST highlights the potential of geometric deep learning to capture complex biological hierarchies in spatial omics.

Abstract

Spatial Transcriptomics (ST) merges the benefits of pathology images and gene expression, linking molecular profiles with tissue structure to analyze spot-level function comprehensively. Predicting gene expression from histology images is a cost-effective alternative to expensive ST technologies. However, existing methods mainly focus on spot-level image-to-gene matching but fail to leverage the full hierarchical structure of ST data, especially on the gene expression side, leading to incomplete image-gene alignment. Moreover, a challenge arises from the inherent information asymmetry: gene expression profiles contain more molecular details that may lack salient visual correlates in histological images, demanding a sophisticated representation learning approach to bridge this modality gap. We propose HyperST, a framework for ST prediction that learns multi-level image-gene representations by modeling the data's inherent hierarchy within hyperbolic space, a natural geometric setting for such structures. First, we design Multi-Level Representation Extractors to capture both spot-level and niche-level representations from each modality, providing context-aware information beyond individual spot-level image-gene pairs. Second, we introduce a Hierarchical Hyperbolic Alignment module to unify these representations, performing spatial alignment while hierarchically structuring image and gene embeddings. This alignment strategy enriches the image representations with molecular semantics, significantly improving cross-modal prediction. HyperST achieves state-of-the-art performance on four public datasets from different tissues, paving the way for more scalable and accurate spatial transcriptomics prediction.
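
The spot-versus-niche distinction in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a spot's niche is the spot together with its k nearest spatial neighbours and that the niche-level profile is their mean expression. The abstract does not say how HyperST constructs niches, so this is not the authors' method, and the names `k_nearest` and `niche_expression` are hypothetical.

```python
import math


def k_nearest(coords, i, k):
    """Return indices of the k spots closest to spot i (excluding i itself)."""
    others = [j for j in range(len(coords)) if j != i]
    others.sort(key=lambda j: math.dist(coords[i], coords[j]))
    return others[:k]


def niche_expression(coords, expr, k=2):
    """Assumed niche profile: mean expression over a spot and its k nearest neighbours.

    coords: list of (x, y) spot positions
    expr:   list of per-spot gene expression vectors (same length for every spot)
    """
    niches = []
    for i in range(len(coords)):
        members = [i] + k_nearest(coords, i, k)
        n_genes = len(expr[i])
        niches.append(
            [sum(expr[m][g] for m in members) / len(members) for g in range(n_genes)]
        )
    return niches


# Toy usage: three spots on a line, one gene per spot.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
expr = [[1.0], [3.0], [100.0]]
niche = niche_expression(coords, expr, k=1)
```

Under this assumption, a spot-level profile describes one location, while the niche-level profile smooths it with local context, which is the kind of extra signal the abstract says goes beyond individual spot-level image-gene pairs.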

Paper Structure

This paper contains 32 sections, 11 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: ST data characteristics. (a) A whole-slide image (WSI) contains hierarchical structures, and visually similar patterns may correspond to different gene expression profiles. (b) Other works mainly model ST data in Euclidean space, which neglects niche-level gene expression and can lead to biased biological insights. (c) Our hyperbolic approach models hierarchies based on information specificity, where a general concept (image/spot) entails its more specific, information-rich counterpart (gene/niche), enabling more informative representation learning.
  • Figure 2: Overview of HyperST. HyperST consists of three components. (a) Multi-Level Representation Extractors capture spot- and niche-level features from both images and gene expression. (b) Hierarchical Hyperbolic Alignment module projects these features into a shared hyperbolic latent space. It uses contrastive alignment for corresponding image-gene pairs and entailment alignment to structurally regularize the latent space according to information hierarchies. (c) Gene Decoder uses the resulting aligned and context-aware image representations to predict spot-level gene expression.
  • Figure 3: Visualization of the spatial distribution of the UMOD gene (top) and the PODXL gene (bottom) in the NCBI704 sample. The color scale ranges from purple/blue (low expression) to yellow/green (high expression).
  • Figure 4: Results of the ablation study on the choice of the last layers of LoRA.
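
The Figure 2 caption describes projecting features into a shared hyperbolic latent space and aligning matched image-gene pairs contrastively. A minimal sketch of that idea in the Lorentz model follows, using the standard exponential map at the origin and the standard geodesic distance. The loss is a generic InfoNCE over negative distances in the image-to-gene direction only; the exact form of HyperST's contrastive and entailment losses is not given in this summary, so the function names, the temperature, and the curvature parameter `c` are assumptions for illustration.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v, c=1.0):
    """Lift a Euclidean vector v (tangent at the origin) onto the hyperboloid
    with curvature -c, returned as [time, space...]."""
    sqrt_c = math.sqrt(c)
    norm = math.sqrt(sum(a * a for a in v))
    if norm < 1e-9:
        return [1.0 / sqrt_c] + [0.0] * len(v)
    scale = math.sinh(sqrt_c * norm) / (sqrt_c * norm)
    return [math.cosh(sqrt_c * norm) / sqrt_c] + [scale * a for a in v]


def lorentz_distance(x, y, c=1.0):
    """Geodesic distance between two points on the hyperboloid."""
    arg = max(1.0, -c * lorentz_inner(x, y))  # clamp against rounding error
    return math.acosh(arg) / math.sqrt(c)


def contrastive_loss(img_pts, gene_pts, c=1.0, temperature=0.1):
    """Assumed InfoNCE loss: image i should be closest to gene i."""
    total = 0.0
    for i, x in enumerate(img_pts):
        logits = [-lorentz_distance(x, g, c) / temperature for g in gene_pts]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denominator - logits[i]
    return total / len(img_pts)


# Toy usage: two image embeddings and their matched gene embeddings.
img = [exp_map_origin([1.0, 0.0]), exp_map_origin([-1.0, 0.0])]
gene = [exp_map_origin([0.9, 0.0]), exp_map_origin([-0.9, 0.0])]
loss = contrastive_loss(img, gene)
```

By construction, the distance from the origin to `exp_map_origin(v)` equals the Euclidean norm of `v`, so points lifted from larger feature vectors sit farther from the origin. Hyperbolic entailment objectives typically exploit this by placing general concepts near the origin and specific ones farther out, which matches the general-to-specific ordering sketched in Figure 1(c).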