Towards Spatial Transcriptomics-guided Pathological Image Recognition with Batch-Agnostic Encoder
Kazuya Nishimura, Ryoma Bise, Yasuhiro Kojima
TL;DR
This work addresses the challenge of batch effects in spatial transcriptomics (ST) when performing patch-level image subtype recognition by proposing a batch-agnostic contrastive learning framework. A two-stage approach first learns a batch-agnostic gene encoder via variational inference (scVI/scANVI) to produce consistent gene representations across patients, then couples this encoder with a fixed image encoder in a contrastive learning setup using symmetric InfoNCE losses. On a public HER2+ breast cancer ST dataset, the method improves over CLIP and BLEEP baselines, with best performance when using scANVI and a loss variant that leverages inter-modality similarities, though large-scale ImageNet-pretrained baselines still lead due to data scale. The study demonstrates the value of correcting ST batch effects for robust multi-modal pathology and highlights the need for larger, more diverse paired datasets to fully realize the potential of ST-guided image recognition.
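To make the contrastive step concrete, the following is a minimal NumPy sketch of a generic symmetric InfoNCE loss over paired image and gene embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value of 0.1, and the use of cosine similarity are all hypothetical choices, and the loss variant that leverages inter-modality similarities is not reproduced here.

```python
import numpy as np


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(image_emb, gene_emb, temperature=0.1):
    """Generic symmetric InfoNCE over a batch of paired embeddings.

    image_emb, gene_emb: arrays of shape (batch, dim); row i of each
    is assumed to come from the same spot. The temperature is an
    illustrative assumption, not a value taken from the paper.
    """
    # L2-normalise so the dot product is a cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    gene = gene_emb / np.linalg.norm(gene_emb, axis=1, keepdims=True)

    # logits[i, j] = similarity between image i and gene profile j.
    logits = img @ gene.T / temperature
    diag = np.arange(logits.shape[0])

    # Image-to-gene direction: each row should pick its own column.
    loss_i2g = -log_softmax(logits, axis=1)[diag, diag].mean()
    # Gene-to-image direction: each column should pick its own row.
    loss_g2i = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_i2g + loss_g2i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_emb = rng.normal(size=(8, 16))  # stand-in for fixed image encoder outputs
    gene_emb = rng.normal(size=(8, 16))   # stand-in for gene encoder latents
    print(symmetric_info_nce(image_emb, gene_emb))
```

In a setup like the one summarized above, `gene_emb` would come from the batch-agnostic gene encoder and `image_emb` from the image branch; here both are random placeholders so the sketch runs on its own.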
Abstract
Spatial transcriptomics (ST) is a novel technique that simultaneously captures pathological images and gene expression profiles with spatial coordinates. Since ST is closely related to pathological features such as disease subtypes, it may be valuable to augment image representations with this pathological information. However, there have been no attempts to leverage ST for image recognition (i.e., patch-level classification of pathological image subtypes). A major challenge is that significant batch effects in ST make it difficult to extract the pathological features of images from it. In this paper, we propose a batch-agnostic contrastive learning framework that can extract consistent signals from the gene expression of ST across multiple patients. To extract these consistent signals, we utilize a batch-agnostic gene encoder that is trained via variational inference. Experiments on a publicly available dataset demonstrate the effectiveness of our framework. Code is publicly available at https://github.com/naivete5656/TPIRBAE
