Spatial Transcriptomics Analysis of Zero-shot Gene Expression Prediction
Yan Yang, Md Zakir Hossain, Xuesong Li, Shafin Rahman, Eric Stone
TL;DR
The paper addresses the challenge of predicting gene expression for unseen gene types in spatial transcriptomics (ST) by introducing a semantic guided network (SGN). SGN combines (i) window feature extraction with graph-based refinement, (ii) LLM-generated descriptions of gene functionality and phenotype, embedded into a per-gene-type projection vector, and (iii) zero-shot prediction via a dot product in a shared feature space, trained with a loss that combines mean squared error (MSE) and the Pearson correlation coefficient (PCC). Experiments on STNet and 10xProteomic show that SGN achieves zero-shot performance competitive with state-of-the-art supervised methods, indicating that it generalizes to unseen genes and benefits from spatial context and LLM-informed descriptions. The work makes ST prediction more scalable and flexible: new gene types can be predicted without re-collecting training data, aided by prompt design and optional internet-sourced domain knowledge.
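To make the MSE-plus-PCC objective concrete, the following is a minimal sketch of one way such a loss could be written. The summary does not state how the two terms are fused, so the form `MSE + alpha * (1 - PCC)`, the weight `alpha`, and the function names are assumptions made for illustration, not the authors' formulation.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def pcc(pred, target, eps=1e-8):
    """Pearson correlation coefficient; eps guards against zero variance."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / (math.sqrt(var_p * var_t) + eps)

def combined_loss(pred, target, alpha=1.0):
    """Assumed fusion: penalize squared error and reward high correlation.

    `alpha` is a hypothetical weighting not given in the source.
    """
    return mse(pred, target) + alpha * (1.0 - pcc(pred, target))

# Perfect predictions give a loss close to zero; perfectly
# anti-correlated predictions are penalized by both terms.
loss_perfect = combined_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss_reversed = combined_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The MSE term pulls predictions toward the correct magnitude, while the `1 - PCC` term rewards predictions whose variation across windows tracks the ground truth even when the scale is off.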
Abstract
Spatial transcriptomics (ST) captures gene expression within distinct regions (i.e., windows) of a tissue slide. Traditional supervised learning frameworks for ST can only predict expression from slide image windows for gene types seen during training, and fail to generalize to unseen gene types. To overcome this limitation, we propose a semantic guided network (SGN), a pioneering zero-shot framework for predicting gene expression from slide image windows. Considering that a gene type can be described by its functionality and phenotype, we dynamically embed a gene type into a vector according to its functionality and phenotype, and employ this vector to project slide image windows to gene expression in feature space, enabling zero-shot expression prediction for unseen gene types. The functionality and phenotype of a gene type are obtained by querying a pre-trained large language model (LLM) with a carefully designed prompt. On standard benchmark datasets, we demonstrate competitive zero-shot performance compared to past state-of-the-art supervised learning approaches.
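The projection step described above can be sketched as a dot product between window features and gene-type vectors in a shared space. This is only an illustration under stated assumptions: the function name `predict_expression` is hypothetical, and the small hand-written vectors stand in for the outputs of the image encoder and the embedded LLM descriptions, neither of which is reproduced here.

```python
def predict_expression(window_features, gene_embeddings):
    """Return an N x G matrix of predicted expression values.

    Entry [i][j] is the dot product of window i's feature vector and
    gene type j's embedding. Both inputs are lists of equal-length
    vectors living in the same (assumed) shared feature space.
    """
    return [
        [sum(w * g for w, g in zip(window, gene)) for gene in gene_embeddings]
        for window in window_features
    ]

# Hypothetical 2-D features for two slide image windows.
windows = [[1.0, 0.0], [0.5, 0.5]]

# Hypothetical embeddings for three gene types. In a zero-shot setting
# these could belong to genes never seen in training, because each one
# is derived from a text description rather than from a learned
# per-gene output head.
unseen_genes = [[2.0, 1.0], [0.0, 4.0], [1.0, 1.0]]

pred = predict_expression(windows, unseen_genes)
# pred == [[2.0, 0.0, 1.0], [1.5, 2.0, 1.0]]
```

Because the gene type enters only through its embedding, adding a new gene type amounts to appending one more vector to `gene_embeddings`; the window features do not need to be recomputed.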
