Uncovering spatial tissue domains and cell types in spatial omics through cross-scale profiling of cellular and genomic interactions

Rui Yan, Xiaohan Xing, Xun Wang, Zixia Zhou, Md Tauhidul Islam, Lei Xing

TL;DR

This work tackles the challenge of noisy, high-dimensional spatial omics data by introducing CellScape, a dual-branch representation learning framework that jointly integrates spatial context and gene regulatory structure. It learns two embeddings, $Z_{\text{spatial}}$ and $Z_{\text{intrinsic}}$, via a spatial graph GAT encoder and an intrinsic CNN encoder, fused into a unified representation to enable accurate spatial domain segmentation and cross-sample integration. The model employs feature masking and a MIL-NCE–style contrastive objective to capture local tissue organization while preserving intracellular co-expression patterns, and it performs batch correction to support multi-sample analyses. Across diverse ST datasets, CellScape improves spatial domain delineation, reveals domain-specific cell-type compositions, and uncovers biologically meaningful patterns, including disease-associated microglial remodeling in Alzheimer's models, demonstrating broad applicability to spatial omics and potential generalization to other multi-scale biological data.
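The ingredients named above (a cell graph built from spatial proximity, feature masking, and a MIL-NCE-style contrastive objective) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption for illustration, not the authors' implementation: the function names, the k-nearest-neighbour graph rule, the zero-masking scheme, the cosine similarity, and the temperature `tau` are all hypothetical choices, and the GAT and CNN encoders are omitted entirely.

```python
import numpy as np

def knn_graph(coords, k):
    """Boolean adjacency: adj[i, j] is True if j is among i's k nearest cells.
    Assumption: proximity is plain Euclidean k-NN on the spatial coordinates."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                # a cell is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :k]         # indices of the k closest cells
    adj = np.zeros(d.shape, dtype=bool)
    adj[np.arange(len(coords))[:, None], idx] = True
    return adj

def mask_features(x, rate, rng):
    """Zero out roughly a fraction `rate` of the entries (hypothetical masking scheme)."""
    keep = rng.random(x.shape) >= rate
    return x * keep

def mil_nce_loss(z, pos, tau=0.5):
    """MIL-NCE-style loss: for each cell, the summed similarity to ALL of its
    positives (here, spatial neighbours) is contrasted against every other cell."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    sim = sim - sim.max(axis=1, keepdims=True)         # numerical stability
    e = np.exp(sim)
    num = (e * pos).sum(axis=1)                        # positives only
    den = e.sum(axis=1)                                # positives and negatives
    return float(-np.log(num / den).mean())

# Toy usage on random data; a random matrix stands in for encoder output.
rng = np.random.default_rng(0)
coords = rng.random((50, 2))            # spatial coordinates of 50 cells
expr = rng.random((50, 16))             # toy expression matrix
adj = knn_graph(coords, k=6)
masked = mask_features(expr, rate=0.2, rng=rng)
print(mil_nce_loss(masked, adj))
```

The point of the multiple-instance form is that a cell is not forced to match any single neighbour: the numerator sums over the whole neighbourhood, so the loss is small whenever a cell's embedding is close to at least some of its spatial neighbours.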

Abstract

A cell's identity and function are linked to both its intrinsic genomic makeup and its extrinsic spatial context within the tissue microenvironment. Spatial transcriptomics (ST) offers an unprecedented opportunity to study this link, providing in situ gene expression profiles at single-cell resolution and illuminating the spatial and functional organization of cells within tissues. However, a significant hurdle remains: ST data are inherently noisy, large, and structurally complex. This complexity makes it difficult for existing computational methods to capture the interplay between spatial interactions and intrinsic genomic relationships, which limits our ability to discern critical biological patterns. Here, we present CellScape, a deep learning framework designed to overcome these limitations for high-performance ST data analysis and pattern discovery. CellScape jointly models cellular interactions in tissue space and genomic relationships among cells, producing comprehensive representations that integrate spatial signals with underlying gene regulatory mechanisms. This technique uncovers biologically informative patterns that improve spatial domain segmentation and supports comprehensive spatial cellular analyses across diverse transcriptomics datasets, offering an accurate and versatile framework for deep analysis and interpretation of ST data.

Paper Structure (3 sections, 13 equations, 5 figures)

Figures (5)

  • Figure 1: CellScape characterizes cells via joint modeling of spatial and genomic interactions. a, Workflow of CellScape. Given a gene expression matrix and spatial coordinates for each cell in the tissue, CellScape constructs a cell graph based on spatial proximity and a 2D gene expression map reflecting gene co-expression patterns. Its dual-branch architecture employs two encoders: one captures spatial interactions and produces spatial embeddings $Z_{\text{spatial}}$, and the other captures intrinsic gene expression structure and produces intrinsic embeddings $Z_{\text{intrinsic}}$. Each representation provides a distinct view of the cellular landscape and can be leveraged for task-specific analyses. b, CellScape enables various downstream tasks for spatial omics data analysis.
  • Figure 2: CellScape maps spatial domains and cell types in the human cortex. a, Spatial transcriptomics of human cortex profiled using Slide-tags, integrated with matched snRNA-seq. b, Nissl-stained image of an adjacent tissue section. c, Spatial distribution of cells colored by cell type annotations from snRNA-seq. d, UMAP visualization of CellScape-learned intrinsic embeddings, colored by annotated cell types. e, Spatial domains (left) and UMAP visualization of spatial embeddings (right) derived by CellScape; L1-6, cortical layers 1-6; WM, white matter. f, Composition of major cell types across spatial domains identified by CellScape. g, Spatial domains (left) and UMAP visualization of spatial embeddings (right) for excitatory neurons, both derived by CellScape. h, Domain-specific marker gene expression across excitatory neuron subpopulations. Dot size indicates the fraction of expressing cells; color reflects log fold change. i, Spatial domains (left) and UMAP visualization of spatial embeddings (right) for astrocytes, both derived by CellScape; GM, grey matter; WM, white matter. j, Spatial distribution of representative domain-specific marker genes for excitatory neuron subpopulations identified in h.
  • Figure 3: CellScape reveals spatial and molecular alterations associated with Alzheimer’s disease in the mouse brain. a, STARmap PLUS dataset contains coronal sections from Alzheimer’s disease (AD) mice and age-matched controls at 8 and 13 months. A reference image from the Allen Mouse Brain Atlas marks the anatomical region analyzed. b, Spatial mapping of cells colored by cell types (top) and CellScape-defined spatial domains (bottom). CellScape consistently identifies anatomically coherent regions, including hippocampal subfields (CA1–CA3, CAslm), dentate gyrus (DG), cortical layers (L2–6), retrosplenial cortex regions (RSP-a, RSP-b), and white matter (WM). c, UMAP visualization (left) and PAGA graph (right) of CellScape spatial embeddings. d, UMAP of the four analyzed sections colored by sample. e, Composition of major cell types across spatial domains. f, Disease-associated shifts in cell-type composition at 13 months ($\Delta$ = AD – control). g, Volcano plot of genes differentially expressed in microglia (AD vs control, 13 months); red, up-regulated and blue, down-regulated in AD. h, Gene Ontology enrichment for AD-up-regulated microglial genes. i, Spatial maps of DAM markers Cst7 and Trem2 in a 13-month AD section; high expression colocalizes with $A\beta$ plaques (inset).
  • Figure 4: CellScape identifies spatial domains and spatially variable genes for the mouse olfactory bulb. a, Mouse olfactory bulb profiled using Slide-seqV2 and Stereo-seq, with anatomical layers annotated based on the Allen Reference Atlas. b, Spatial domain maps predicted by SEDR, STAGATE, and CellScape for Slide-seqV2 (top) and Stereo-seq (bottom). CellScape domains correspond to the olfactory nerve layer (ONL), glomerular layer (GL), external plexiform layer (EPL), mitral-cell layer (MCL), internal plexiform layer (IPL), granule-cell sublayers (GCL_a, GCL_b), rostral migratory stream (RMS), accessory olfactory bulb (AOB) and its granular layer (AOBgr). c, Enlarged view of CellScape-predicted domains in Stereo-seq. d, Individual display of each CellScape-identified domain in Stereo-seq. e, UMAP visualization of embeddings from SEDR, STAGATE, and CellScape; only CellScape preserves the ordered layer continuum. f, Domain-specific marker gene expression across CellScape domains (dot size, fraction of spots; color, log fold-change). g, Spatial distribution of representative marker genes associated with distinct olfactory bulb layers.
  • Figure 5: CellScape accurately identifies spatial domains and enables multi-sample integration. a, Manual annotations for the osmFISH mouse somatosensory cortex dataset. b, Predicted spatial domains for the osmFISH dataset by CellScape, CellScape_CCI, and eight baseline methods. Cluster labels were matched to the closest reference domains. c, Quantitative comparison of segmentation performance on the osmFISH dataset across ten runs with different random seeds. Box plots display the distribution of results (center line: mean; box limits: upper and lower quartiles; whiskers: 1.5$\times$ interquartile range). d, Manual annotations for the 10x Genomics Visium human DLPFC dataset (Slice 151673). e, Predicted spatial domains for Slice 151673 by each method. f, Aggregated performance across all twelve DLPFC slices. g, Manual annotations for the STARmap mouse medial prefrontal cortex dataset (three slices). h, Predicted spatial domains for STARmap Slice 1 by each method. i, Predicted spatial domains for STARmap Slices 2 and 3 by CellScape. j, Quantitative evaluation of segmentation on STARmap Slice 1, summarized as in panel c. Statistical significance was assessed by two-sided t-test; **, ***, **** indicate $p < 0.01$, $< 0.001$, and $< 0.0001$, respectively. k, UMAP visualization of cell embeddings learned by CellScape (left) and GraphST (right), colored by spatial domains and slice identity, respectively.
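
The Figure 5b caption notes that cluster labels were matched to the closest reference domains. The caption does not say how closeness was measured, so the sketch below assumes one simple rule: each predicted cluster takes the reference label it overlaps most. The function name `match_clusters` and the toy labels are hypothetical and are not taken from the paper.

```python
import numpy as np

def match_clusters(pred, ref):
    """Map each predicted cluster id to the reference label it overlaps most.
    Assumption: "closest" means largest overlap; the paper may use another rule."""
    pred = np.asarray(pred)
    ref = np.asarray(ref)
    mapping = {}
    for c in np.unique(pred):
        labels, counts = np.unique(ref[pred == c], return_counts=True)
        mapping[c.item()] = str(labels[np.argmax(counts)])
    return mapping

# Toy example: six cells, three predicted clusters, three annotated domains.
pred = [2, 2, 0, 0, 1, 0]
ref = ["L1", "L1", "L2", "L2", "WM", "WM"]
mapping = match_clusters(pred, ref)
relabeled = [mapping[c] for c in pred]
print(mapping, relabeled)
```

Majority matching lets two clusters share one reference label. If a one-to-one assignment is wanted instead, an optimal assignment over the cluster-by-domain contingency table would be the usual alternative.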