CASPER: Cross-modal Alignment of Spatial and single-cell Profiles for Expression Recovery

Amit Kumar, Maninder Kaur, Raghvendra Mall, Sukrit Gupta

TL;DR

CASPER tackles the limited gene coverage problem in Spatial Transcriptomics (ST) by leveraging scRNA-seq centroids and cross-modal attention to predict genome-wide expression at spatial locations. It uses a dual-encoder architecture to map ST and centroid representations into a shared latent space, and a cross-attention decoder to fuse context and impute unmeasured genes under a masked reconstruction loss. Across four ST datasets paired with scRNA-seq references, CASPER outperforms four state-of-the-art baselines on multiple metrics, with significant improvements in Pearson correlation coefficient (PCC), Spearman rank correlation coefficient (SRCC), and mean absolute percentage error (MAPE). Its attention patterns also provide interpretable mappings from spots to cell-type centroids. This cross-modal translation enables more accurate spatial gene expression recovery, supporting downstream tissue mapping and biomarker discovery.
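
The cross-attention step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the learned encoders are replaced by random latent vectors, centroid expression profiles are used directly as attention values, and the function names (`cross_attention_impute`, `masked_mse`) and the choice of which genes count as measured are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_impute(spot_latent, centroid_latent, centroid_expr):
    """Spots act as queries, centroids as keys, centroid expression as values.

    Hypothetical helper: the real model would use learned projections.
    """
    d = spot_latent.shape[1]
    scores = spot_latent @ centroid_latent.T / np.sqrt(d)  # (n_spots, n_centroids)
    attn = softmax(scores, axis=1)                         # each row sums to 1
    pred = attn @ centroid_expr                            # (n_spots, n_genes)
    return pred, attn

def masked_mse(pred, target, mask):
    """Mean squared error restricted to entries where mask is True."""
    diff = (pred - target)[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
n_spots, n_centroids, n_genes, d = 5, 3, 8, 4

# Stand-ins for encoder outputs in the shared latent space (assumption).
spot_latent = rng.normal(size=(n_spots, d))
centroid_latent = rng.normal(size=(n_centroids, d))

# Full-gene expression profile of each scRNA-seq centroid (synthetic data).
centroid_expr = rng.random(size=(n_centroids, n_genes))

# Synthetic ST targets; pretend only the first 4 genes are measured.
target = rng.random(size=(n_spots, n_genes))
mask = np.zeros((n_spots, n_genes), dtype=bool)
mask[:, :4] = True

pred, attn = cross_attention_impute(spot_latent, centroid_latent, centroid_expr)
loss = masked_mse(pred, target, mask)
print(attn.round(3))
print("masked reconstruction loss:", loss)
```

Each row of `attn` is a distribution over centroids for one spot, which is the kind of spot-to-centroid mapping the TL;DR describes as interpretable. In this sketch the prediction for every gene, measured or not, is a convex combination of centroid profiles, while the loss only sees the measured entries.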

Abstract

Spatial Transcriptomics enables mapping of gene expression within its native tissue context, but current platforms measure only a limited set of genes due to experimental constraints and high costs. To overcome this, computational models integrate Single-Cell RNA Sequencing data with Spatial Transcriptomics to predict unmeasured genes. We propose CASPER, a cross-attention-based framework that predicts unmeasured gene expression in Spatial Transcriptomics by leveraging centroid-level representations from Single-Cell RNA Sequencing. We performed rigorous testing on four state-of-the-art Spatial Transcriptomics/Single-Cell RNA Sequencing dataset pairs against four existing baseline models. CASPER shows significant improvement in nine of the twelve metrics in our experiments. This work paves the way for further research on Spatial Transcriptomics to Single-Cell RNA Sequencing modality translation. The code for CASPER is available at https://github.com/AI4Med-Lab/CASPER.
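
The abstract does not say how the centroid-level representations are built. One plausible reading, shown here purely as an assumption, is that each centroid is the mean expression profile of the cells sharing a cluster or cell-type label. The function name `cell_type_centroids` and the toy labels are hypothetical.

```python
import numpy as np

def cell_type_centroids(expr, labels):
    """Average the expression of all cells that share a label.

    expr:   (n_cells, n_genes) scRNA-seq expression matrix
    labels: length n_cells sequence of cluster or cell-type labels
    Returns the sorted unique labels and a (n_labels, n_genes) centroid matrix.
    """
    labels = np.asarray(labels)
    names = np.unique(labels)  # sorted unique labels
    centroids = np.vstack([expr[labels == name].mean(axis=0) for name in names])
    return names, centroids

# Toy data: three cells, three genes, two cell types.
expr = np.array([[1.0, 0.0, 2.0],
                 [3.0, 0.0, 4.0],
                 [0.0, 5.0, 1.0]])
labels = ["neuron", "neuron", "astrocyte"]

names, centroids = cell_type_centroids(expr, labels)
print(names)      # ['astrocyte' 'neuron']
print(centroids)  # astrocyte: [0, 5, 1]; neuron: [2, 0, 3]
```

Working with a small set of centroids instead of every individual cell keeps the number of attention keys small, which is one reason a centroid-level reference can be attractive. Whether CASPER derives its centroids this way should be checked against the paper and the linked repository.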

Paper Structure

This paper contains 17 sections, 10 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Overview of the proposed CASPER framework for spatial gene expression imputation.
  • Figure 2: Attention-based mapping between MERFISH ST spots ($\text{Spot}_x$) and scRNA-seq centroids (C$x$), where $x$ is the spot or centroid index.