CausalGeD: Blending Causality and Diffusion for Spatial Gene Expression Generation
Rabeya Tus Sadia, Md Atik Ahamed, Qiang Cheng
TL;DR
CausalGeD tackles the problem of integrating spatial transcriptomics with scRNA-seq by explicitly modeling gene-gene causal relationships. It introduces a diffusion-based generator augmented by a Causality-Aware Transformer (CAT) that blends autoregression with diffusion to capture regulatory dependencies without predefined networks. Across ten tissue datasets, CausalGeD achieves state-of-the-art performance (5–32% improvements in key metrics like PCC and SSIM) and preserves both global structure and local regulatory signals, aided by a two-headed encoder and a causally masked attention mechanism. This work advances practical spatial gene expression prediction and offers deeper biological insights into gene regulation within spatial contexts, with potential implications for understanding tissue organization and disease progression.
Abstract
The integration of single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) data is crucial for understanding gene expression in spatial context. Existing methods for such integration have limited performance, with structural similarity often below 60%. We attribute this limitation to the failure to consider causal relationships between genes. We present CausalGeD, which combines diffusion and autoregressive processes to leverage these relationships. By generalizing the Causal Attention Transformer from image generation to gene expression data, our model captures regulatory mechanisms without predefined relationships. Across 10 tissue datasets, CausalGeD outperformed state-of-the-art baselines by 5–32% in key metrics, including Pearson's correlation and structural similarity, advancing both technical and biological insights.
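
To make the Pearson's correlation metric mentioned above concrete, the following is a minimal sketch of how a per-gene correlation between predicted and measured expression could be computed and averaged. It is not the authors' evaluation code: the function names (`pearson_correlation`, `mean_gene_pcc`), the gene labels, and the choice to skip genes with constant expression are all assumptions made for illustration.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        # Correlation is undefined when either vector is constant.
        return float("nan")
    return cov / math.sqrt(var_p * var_m)


def mean_gene_pcc(predicted_by_gene, measured_by_gene):
    """Average the per-gene correlation over spots (hypothetical helper).

    Both arguments map a gene name to its expression values across the
    same ordered set of spatial spots. Genes with undefined correlation
    are skipped, which is an assumption, not a documented convention.
    """
    scores = [
        pearson_correlation(predicted_by_gene[gene], measured_by_gene[gene])
        for gene in measured_by_gene
    ]
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else float("nan")


# Toy data with hypothetical gene names: three spots per gene.
measured = {"GeneA": [1.0, 2.0, 3.0], "GeneB": [1.0, 2.0, 3.0]}
predicted = {"GeneA": [2.0, 4.0, 6.0], "GeneB": [3.0, 2.0, 1.0]}
average_pcc = mean_gene_pcc(predicted, measured)
```

In this toy case the first gene is perfectly correlated and the second perfectly anti-correlated, so the average comes out at zero. A real evaluation would operate on held-out genes across all spots of a tissue section.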
