SPATIA: Multimodal Generation and Prediction of Spatial Cell Phenotypes

Zhenglun Kong, Mufan Qiu, John Boesen, Xiang Lin, Sukwon Yun, Tianlong Chen, Manolis Kellis, Marinka Zitnik

TL;DR

This work introduces SPATIA, a multi-level generative and predictive model that learns unified, spatially aware representations by fusing morphology, gene expression, and spatial context from the cell to the tissue level, and incorporates a novel spatially conditioned generative framework for predicting cell morphologies under perturbations.

Abstract

Understanding how cellular morphology, gene expression, and spatial context jointly shape tissue function is a central challenge in biology. Image-based spatial transcriptomics technologies now provide high-resolution measurements of cell images and gene expression profiles, but existing methods typically analyze these modalities in isolation or at limited resolution. We address the problem by introducing SPATIA, a multi-level generative and predictive model that learns unified, spatially aware representations by fusing morphology, gene expression, and spatial context from the cell to the tissue level. SPATIA also incorporates a novel spatially conditioned generative framework for predicting cell morphologies under perturbations. Specifically, we propose a confidence-aware flow matching objective that reweights weak optimal-transport pairs based on uncertainty. We further apply morphology-profile alignment to encourage biologically meaningful image generation, enabling the modeling of microenvironment-dependent phenotypic transitions. We assembled a multi-scale dataset consisting of 25.9 million cell-gene pairs across 17 tissues. We benchmark SPATIA against 18 models across 12 tasks, spanning categories such as phenotype generation, annotation, clustering, gene imputation, and cross-modal prediction. SPATIA achieves improved performance over state-of-the-art models, improving generative fidelity by 8% and predictive accuracy by up to 3%.
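The idea of a confidence-aware flow matching objective can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the helper names `pair_confidence` and `weighted_flow_matching_loss`, the straight-line interpolation path between control and target embeddings, and the choice to derive confidence from a softmax-style weighting of squared pair distances. It shows only the general pattern of down-weighting weakly matched pairs in a flow matching loss, not SPATIA's actual objective.

```python
import numpy as np

def pair_confidence(x0, x1, temperature=1.0):
    """Hypothetical confidence score per (control, target) pair.

    Assumption: pairs that are far apart are treated as weak matches
    and receive exponentially smaller weight. Weights sum to 1.
    """
    cost = np.sum((x1 - x0) ** 2, axis=1)
    w = np.exp(-cost / temperature)
    return w / w.sum()

def weighted_flow_matching_loss(v_pred, x0, x1, weights):
    """Weighted squared error against the straight-line velocity x1 - x0."""
    target = x1 - x0  # constant velocity of the linear path from x0 to x1
    per_pair = np.sum((v_pred - target) ** 2, axis=1)
    return float(np.sum(weights * per_pair))

# Toy data: four pairs in a 3-dimensional embedding space.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 3))
x1 = x0 + rng.normal(scale=0.1, size=(4, 3))
x1[3] += 5.0  # make the last pair a deliberately weak match

w = pair_confidence(x0, x1)
loss_perfect = weighted_flow_matching_loss(x1 - x0, x0, x1, w)
loss_zero = weighted_flow_matching_loss(np.zeros_like(x0), x0, x1, w)
```

In this toy setup the weak pair contributes almost nothing to the loss, so a velocity model is driven mainly by the well-matched pairs. A real implementation would train a neural velocity field on interpolated points and would use the paper's own uncertainty estimate in place of the distance-based weight assumed here.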

Paper Structure

This paper contains 27 sections, 16 equations, 7 figures, 9 tables, 1 algorithm.

Figures (7)

  • Figure 1: SPATIA is a multi-scale spatial model for predictive and generative tasks. Examples of real MIST data are shown in Fig. 2.
  • Figure 2: Example of the three levels of the MIST dataset.
  • Figure 3: A) Tissue distribution of our MIST dataset. B) A landscape showcasing the variety of disease states in MIST. C) MIST covers four platforms spanning different tissue and organ types. D) Overview of SPATIA. E) Processing control–target pairs with optimal transport. F) Our conditional contrastive flow matching approach for predicting cell morphology. G) Downstream task performance gain compared to existing models.
  • Figure 4: Qualitative comparison of generated cell morphology change images with the target image.
  • Figure 5: Batch-effect mitigation. Different colors represent different datasets or sample sources.
  • ...and 2 more figures