SpaDiT: Diffusion Transformer for Spatial Gene Expression Prediction using scRNA-seq
Xiaoyu Li, Fangfang Zhu, Wenwen Min
TL;DR
SpaDiT addresses the recovery of undetected genes in spatial transcriptomics (ST) by casting ST gene prediction as a conditional diffusion problem guided by scRNA-seq data. It uses a Transformer-based diffusion backbone with latent and conditional embeddings to generate ST expression values while preserving spatial patterns. Across ten paired ST/scRNA-seq datasets and five evaluation metrics, SpaDiT achieves state-of-the-art performance and retains both spatial and gene-level structure, which points to its potential to improve the resolution and interpretability of spatial transcriptomics. The results suggest that diffusion-based generative modeling offers a scalable way to integrate scRNA-seq and ST data and enrich spatial gene expression analyses.
Abstract
The rapid development of spatial transcriptomics (ST) technologies is revolutionizing our understanding of the spatial organization of biological tissues. Current ST methods fall into two categories, next-generation sequencing-based (seq-based) and fluorescence in situ hybridization-based (image-based), and both offer insights into the functional dynamics of biological tissues. However, these methods are limited in cellular resolution and in the number of genes they can detect. To address these limitations, we propose SpaDiT, a deep learning method that uses a diffusion generative model to integrate scRNA-seq and ST data and predict undetected genes. By employing a Transformer-based diffusion model, SpaDiT not only accurately predicts unknown genes but also effectively generates the spatial structure of ST genes. We demonstrate the effectiveness of SpaDiT through extensive experiments on both seq-based and image-based ST data. Compared with eight leading baseline methods, SpaDiT achieves state-of-the-art performance across multiple metrics.
