SpaDiT: Diffusion Transformer for Spatial Gene Expression Prediction using scRNA-seq

Xiaoyu Li; Fangfang Zhu; Wenwen Min

TL;DR

SpaDiT tackles the problem of recovering undetected genes in spatial transcriptomics by casting ST gene prediction as a conditional diffusion problem guided by scRNA-seq data. It introduces a Transformer-based diffusion backbone with latent and conditional embeddings, enabling precise generation of ST expressions and preservation of spatial patterns. Across ten paired ST/scRNA-seq datasets and five evaluation metrics, SpaDiT achieves state-of-the-art performance and robust spatial/gene-structure fidelity, highlighting its potential to enhance resolution and interpretability of spatial transcriptomics. The approach demonstrates the practical impact of diffusion-based generative modeling in genomics, offering a scalable framework for integrating multi-omics data to enrich spatial gene expression analyses.

Abstract

The rapid development of spatial transcriptomics (ST) technologies is revolutionizing our understanding of the spatial organization of biological tissues. Current ST methods, categorized into next-generation sequencing-based (seq-based) and fluorescence in situ hybridization-based (image-based) methods, offer innovative insights into the functional dynamics of biological tissues. However, these methods are limited by their cellular resolution and the quantity of genes they can detect. To address these limitations, we propose SpaDiT, a deep learning method that utilizes a diffusion generative model to integrate scRNA-seq and ST data for the prediction of undetected genes. By employing a Transformer-based diffusion model, SpaDiT not only accurately predicts unknown genes but also effectively generates the spatial structure of ST genes. We have demonstrated the effectiveness of SpaDiT through extensive experiments on both seq-based and image-based ST data. SpaDiT significantly contributes to ST gene prediction methods with its innovative approach. Compared to eight leading baseline methods, SpaDiT achieved state-of-the-art performance across multiple metrics, highlighting its substantial bioinformatics contribution.
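To make the conditional-diffusion framing concrete, here is a minimal sketch of one noise-prediction training step in the usual DDPM style. It is not the authors' implementation. The linear schedule, its constants, the function names, the toy data shapes, and the placeholder `toy_denoiser` (standing in for the Transformer backbone) are all assumptions made for illustration. Following the Figure 1 caption, each gene is treated as one sample: its expression across ST spots is the target, and the same gene's expression across scRNA-seq cells is the condition.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative alpha products (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Forward noising: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t][:, None]          # one noise level per gene (row)
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    return x_t, eps

def toy_denoiser(x_t, t, cond):
    """Hypothetical placeholder for the conditional Transformer backbone.
    A real model would use x_t, t and cond to predict the added noise."""
    return np.zeros_like(x_t)

def training_loss(x0, cond, alpha_bar, rng, denoiser=toy_denoiser):
    """Mean squared error between the true and the predicted noise."""
    t = rng.integers(0, len(alpha_bar), size=x0.shape[0])
    x_t, eps = q_sample(x0, t, alpha_bar, rng)
    eps_hat = denoiser(x_t, t, cond)
    return float(np.mean((eps - eps_hat) ** 2))

rng = np.random.default_rng(0)
betas, alpha_bar = make_schedule()
st_genes = rng.standard_normal((8, 50))    # toy data: 8 genes x 50 ST spots
sc_genes = rng.standard_normal((8, 120))   # toy data: same 8 genes x 120 cells
loss = training_loss(st_genes, sc_genes, alpha_bar, rng)
print(f"toy noise-prediction loss: {loss:.3f}")
```

At inference time a model of this kind would start from Gaussian noise and denoise step by step, conditioned on the scRNA-seq profile of a gene that was not measured in the ST data. The actual schedule, loss, and embeddings used by SpaDiT are described in the paper's method sections, which are not reproduced here.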

Paper Structure (19 sections, 10 equations, 6 figures, 4 tables, 2 algorithms)

Introduction
Materials and methods
Datasets and pre-processing
The architecture of SpaDiT
Latent Embedding in SpaDiT
Condition Embedding in SpaDiT
Diffusion with Transformer in SpaDiT
Training phase in SpaDiT
Inference phase in SpaDiT
Evaluation metrics
Baselines
Results
SpaDiT improves prediction accuracy of spatial gene expression
SpaDiT enhances the similarity of predicted gene expression in high-dimensional space
SpaDiT preserves the similarity between genes
...and 4 more sections

Figures (6)

Figure 1: The architecture of SpaDiT. There are three parts in total: latent embedding, conditional embedding and network backbone. (A) is the training process where each gene is considered as a sample, and (B) is the inference process.
Figure 2: Performance evaluation on ten real paired ST and scRNA-seq datasets, based on the Accuracy Score (AS), a comprehensive indicator of model performance defined in the Evaluation metrics section. The central line represents the median, the box depicts the interquartile range, whiskers extend to 1.5 times the interquartile range, and dots represent the AS of individual datasets.
Figure 3: UMAP plots illustrating genes predicted by SpaDiT, Tangram, scVI, SpaGE, stPlus, SpaOTsc, novoSpaRc, SpatialScope and stDiff. The closer the predicted and real scatter points are, the better the prediction. The scatter points predicted by SpaDiT almost overlap with the real ones, indicating that the genes predicted by SpaDiT are closer to the real genes.
Figure 4: Visualization of the prediction performance of various baseline methods. The first column of the figure shows the results after clustering the true labels. The closer the predicted results of each method are to the true labels, the better the effect. The clustering effect of SpaDiT is closest to the true labels.
Figure 5: Predicted expression abundance of genes with known spatial patterns in four datasets. Each column corresponds to a gene with a clear spatial pattern. The first column represents the spatial pattern genes with true labels. Subsequent columns show the corresponding predicted expression patterns obtained by using SpaDiT, Tangram, scVI, SpaGE, stPlus, SpaOTsc, novoSpaRc, SpatialScope, and stDiff.
...and 1 more figure
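The box-plot conventions named in the Figure 2 caption (median line, interquartile-range box, whiskers at 1.5 times the interquartile range) can be computed as in the sketch below. It assumes the common Tukey convention, in which each whisker ends at the most extreme data point still inside the 1.5 x IQR fence. The function name and the example scores are hypothetical and are not results from the paper.

```python
import numpy as np

def box_stats(scores):
    """Summary statistics behind a Tukey-style box plot."""
    s = np.asarray(scores, dtype=float)
    q1, median, q3 = np.percentile(s, [25, 50, 75])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points that lie inside the fences.
    inside = s[(s >= low_fence) & (s <= high_fence)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
    }

# Hypothetical per-dataset scores, for illustration only.
example_scores = [0.2, 0.4, 0.5, 0.6, 0.8]
stats = box_stats(example_scores)
print(stats)
```

Points falling outside the fences would normally be drawn as outliers. In Figure 2 the dots instead show the score of every individual dataset.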
