Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Kazuya Nishimura, Haruka Hirose, Ryoma Bise, Kaito Shiku, Yasuhiro Kojima
TL;DR
The paper tackles the high cost of spatial transcriptomics by predicting gene expression from pathology images, but notes that absolute expression estimates are fragile to batch effects and noise. It proposes STRank, a probabilistic learning-to-rank loss that models relative expression patterns across patches using pairwise (Binomial) and listwise (Multinomial) distributions, with corrections for sparse counts. In synthetic experiments and on seven real spatial transcriptomics datasets (HEST-1k), STRank is more robust to batch effects and sparsity than traditional losses, though the gains vary by dataset and setting. Overall, the work provides a principled framework for learning relative gene expression trends to improve reliability and transferability in spatial transcriptomics estimation.
Abstract
Gene expression estimation from pathology images has the potential to reduce the cost of RNA sequencing. Point-wise loss functions have been widely used to minimize the discrepancy between predicted and absolute gene expression values. However, due to the complexity of sequencing techniques and intrinsic variability across cells, the observed gene expression contains stochastic noise and batch effects, so accurately estimating absolute expression values remains a significant challenge. To mitigate this, we propose a novel objective of learning relative expression patterns rather than absolute levels. We assume that the relative expression levels of genes exhibit consistent patterns across independent experiments, even when absolute expression values are affected by batch effects and stochastic noise in tissue samples. Based on this assumption, we model the relation and propose a novel loss function called STRank that is robust to noise and batch effects. Experiments on synthetic and real datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/naivete5656/STRank.
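To make the idea of a relative (rank-based) objective concrete, the following is a minimal NumPy sketch of a listwise Multinomial and a pairwise Binomial negative log-likelihood for a single gene across patches. It is an illustration written for this summary, not the authors' STRank implementation: the function names, the softmax and sigmoid parameterizations, and the toy values are all assumptions, and the sparse-count corrections mentioned above are omitted.

```python
import numpy as np


def log_softmax(scores):
    """Numerically stable log-softmax over a 1-D array of patch scores."""
    shifted = scores - np.max(scores)
    return shifted - np.log(np.sum(np.exp(shifted)))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)


def listwise_multinomial_nll(scores, counts):
    """Assumed listwise form: counts over patches follow a Multinomial
    whose probabilities are softmax(scores). The multinomial coefficient
    is dropped because it does not depend on the scores."""
    return -np.sum(counts * log_softmax(scores))


def pairwise_binomial_nll(scores, counts):
    """Assumed pairwise form: for each pair (i, j), counts[i] follows a
    Binomial with counts[i] + counts[j] trials and success probability
    sigmoid(scores[i] - scores[j]). Binomial coefficients are dropped."""
    diff = scores[:, None] - scores[None, :]
    # Summing counts[i] * log sigmoid(s_i - s_j) over ordered pairs i != j
    # covers both terms of every unordered pair.
    log_lik = counts[:, None] * log_sigmoid(diff)
    np.fill_diagonal(log_lik, 0.0)  # a patch is not compared with itself
    return -np.sum(log_lik)


# Toy example (hypothetical values): one gene observed in four patches.
counts = np.array([1.0, 4.0, 2.0, 8.0])
scores = np.array([0.1, 1.2, 0.5, 2.0])

listwise = listwise_multinomial_nll(scores, counts)
pairwise = pairwise_binomial_nll(scores, counts)
```

Under these assumed parameterizations, both losses depend only on differences between patch scores, so adding a constant to every score leaves them unchanged, and multiplying all counts by a constant (a simple stand-in for a multiplicative batch effect) only rescales the loss without moving its minimizer. A point-wise loss such as mean squared error has neither property.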
