Towards Universal Spatial Transcriptomics Super-Resolution: A Generalist Physically Consistent Flow Matching Framework

Xinlei Huang, Weihao Dai, Zijun Qin, Xin Yu, Di Wang, Yanran Liu, Lixin Cheng, Xubin Zheng

TL;DR

This work addresses two limitations of existing spatial transcriptomics super-resolution methods: neglect of biological heterogeneity and physical inconsistency. It introduces SRast, a two-component framework. Structure-Aware Semantic Alignment (SASA) learns canonical gene semantics and applies Latent Normalization, while Physically Constrained Flow Matching (PCFM) performs ratio prediction on the simplex via an optimal-transport flow that preserves local mass conservation. By reformulating the super-resolution task as ratio allocation on the simplex and using a flow-based transformer with boundary-aware regularization, SRast achieves state-of-the-art zero-shot generalization across species and platforms while guaranteeing physical plausibility. The approach demonstrates linear inference scalability and robust performance across diverse tissues, enabling reliable, high-resolution insights from low-resolution spatial transcriptomics data.

Abstract

Spatial transcriptomics provides an unprecedented perspective for deciphering tissue spatial heterogeneity. However, high-resolution spatial transcriptomic technology remains constrained by limited gene coverage, technical complexity, and high cost. Existing methods for super-resolving low-resolution spatial transcriptomics data suffer from two fundamental limitations: poor out-of-distribution generalization stemming from a neglect of inherent biological heterogeneity, and a lack of physical consistency. To address these challenges, we propose SRast, a novel physically constrained generalist framework designed for robust spatial transcriptomics super-resolution. To tackle heterogeneity, SRast employs an architecture that explicitly decouples gene semantics representation from spatial geometry deconvolution, using self-supervised learning to align latent distributions and mitigate cross-sample shifts. Regarding physical priors, SRast reformulates the task as ratio prediction on the simplex, using a flow matching model to learn optimal transport-based geometric transformations that strictly enforce local mass conservation. Extensive experiments across diverse species, tissues, and platforms demonstrate that SRast achieves state-of-the-art performance, exhibiting superior zero-shot generalization capabilities and ensuring physical consistency in recovering fine-grained biological structures.
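
To make the "ratio prediction on the simplex" idea concrete, the following is a minimal sketch of why allocating each low-resolution spot's counts by simplex-valued ratios conserves mass by construction. It is not the authors' implementation: the softmax parameterization, the function name `allocate_counts`, the array shapes, and the choice of 9 sub-spots per spot are all assumptions made for illustration (SRast generates the ratios with a flow matching model rather than a plain softmax).

```python
import numpy as np

def allocate_counts(spot_counts, logits):
    """Split each low-resolution spot's counts across k sub-spots.

    spot_counts: (n_spots, n_genes) observed low-resolution counts.
    logits:      (n_spots, k, n_genes) unconstrained scores (hypothetical
                 stand-in for a model's output).
    Returns an array of shape (n_spots, k, n_genes).
    """
    # Softmax over the sub-spot axis: ratios are non-negative and sum to 1,
    # i.e. they lie on the simplex for every (spot, gene) pair.
    shifted = logits - logits.max(axis=1, keepdims=True)
    ratios = np.exp(shifted)
    ratios /= ratios.sum(axis=1, keepdims=True)
    # Multiplying by the observed counts distributes mass without creating
    # or destroying any of it.
    return ratios * spot_counts[:, None, :]

rng = np.random.default_rng(0)
spot_counts = rng.poisson(5.0, size=(3, 4)).astype(float)  # 3 spots, 4 genes
logits = rng.normal(size=(3, 9, 4))                        # 9 sub-spots per spot
hi_res = allocate_counts(spot_counts, logits)
```

Summing `hi_res` over the sub-spot axis recovers `spot_counts`, which is the local mass conservation property the abstract refers to. Predicting absolute high-resolution counts directly would give no such guarantee.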

Paper Structure (18 sections, 26 equations, 3 figures, 4 tables)

This paper contains 18 sections, 26 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of our proposed SRast framework. (Left) Structure-Aware Semantic Alignment: The model employs a sample-specific GraphVAE to generate sample representations based on the Dual-Topology (DT) Graph and utilizes Latent Norm to eliminate batch effects across multi-source data, constructing a unified General Feature Pool. (Right) Physically Constrained Flow Matching: Built upon the DiT architecture, the framework injects general features and their neighborhood features as conditions. It learns the optimal transport flow from noise to high-resolution ratios via flow matching and leverages a KL divergence constraint to ensure the generated distribution aligns with the target high-resolution distribution.
  • Figure 2: UMAP visualization comparing PCA features (a), SASA-stage encoded features (b), and features after Latent Norm (c) across multiple datasets.
  • Figure 3: Comparison of running time between SRast and baseline methods as the number of inference samples increases.
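
The flow matching and KL components named in the Figure 1 caption can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's training code: it assumes the commonly used straight-line (optimal-transport) conditional path between a Gaussian noise sample and the target ratios, and the helper names `interpolate`, `flow_matching_loss`, and `kl_divergence`, as well as the one-step endpoint estimate, are hypothetical.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Straight-line path from noise x0 to target ratios x1 at time t in [0, 1]."""
    t = t[:, None]
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0  # constant velocity along the straight path
    return x_t, v_target

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((v_pred - v_target) ** 2))

def kl_divergence(p, q, eps=1e-8):
    """Mean KL(p || q) over rows; inputs are clipped for numerical safety."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

rng = np.random.default_rng(0)
x1 = rng.dirichlet(np.ones(9), size=5)  # target ratios over 9 sub-spots
x0 = rng.normal(size=x1.shape)          # Gaussian noise samples
t = rng.uniform(size=5)
x_t, v_target = interpolate(x0, x1, t)

# With an oracle velocity, a single Euler step from x_t lands on x1, so both
# losses vanish; a trained network would replace v_target as the prediction.
x1_hat = x_t + (1.0 - t)[:, None] * v_target
fm_zero = flow_matching_loss(v_target, v_target)
kl_zero = kl_divergence(x1_hat, x1)
```

In an actual model, the velocity prediction would come from the conditioned DiT-style network described in the caption, and the KL term would compare the generated ratio distribution with the high-resolution target.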