Towards Spatial Transcriptomics-guided Pathological Image Recognition with Batch-Agnostic Encoder

Kazuya Nishimura, Ryoma Bise, Yasuhiro Kojima

TL;DR

This work addresses batch effects in spatial transcriptomics (ST), which hinder patch-level subtype recognition in pathological images, by proposing a batch-agnostic contrastive learning framework. The approach has two stages: it first learns a batch-agnostic gene encoder via variational inference (scVI/scANVI) to produce consistent gene representations across patients, then couples this encoder with a fixed image encoder in a contrastive learning setup using symmetric InfoNCE losses. On a public HER2+ breast cancer ST dataset, the method improves over the CLIP and BLEEP baselines, performing best with scANVI and a loss variant that leverages inter-modality similarities, though large-scale ImageNet-pretrained baselines still lead due to data scale. The study demonstrates the value of correcting ST batch effects for robust multi-modal pathology and highlights the need for larger, more diverse paired datasets to fully realize the potential of ST-guided image recognition.
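To make the symmetric InfoNCE objective mentioned above concrete, here is a minimal NumPy sketch of a generic CLIP-style loss over paired image and gene embeddings. It is an illustration under assumptions, not the authors' implementation: the function names, the temperature value of 0.1, and the random toy embeddings are all hypothetical, and the loss variant that leverages inter-modality similarities is not covered.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(image_emb, gene_emb, temperature=0.1):
    # Row i of image_emb and row i of gene_emb are assumed to come from the same spot.
    # The temperature is an assumed value, not one taken from the paper.
    img = l2_normalize(image_emb)
    gene = l2_normalize(gene_emb)
    logits = img @ gene.T / temperature  # (N, N) pairwise similarities
    idx = np.arange(logits.shape[0])
    # Image-to-gene direction: each image should pick out its own gene vector.
    loss_i2g = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Gene-to-image direction: each gene vector should pick out its own image.
    loss_g2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2g + loss_g2i)

# Toy data (hypothetical): 8 spots with 16-dimensional embeddings.
rng = np.random.default_rng(0)
gene_emb = rng.normal(size=(8, 16))
aligned_image_emb = gene_emb + 0.01 * rng.normal(size=(8, 16))
random_image_emb = rng.normal(size=(8, 16))

aligned_loss = symmetric_info_nce(aligned_image_emb, gene_emb)
random_loss = symmetric_info_nce(random_image_emb, gene_emb)
print(aligned_loss < random_loss)
```

In this toy setting the loss should be lower when the image embeddings are aligned with their paired gene embeddings than when they are unrelated, which is the behavior the contrastive stage relies on.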

Abstract

Spatial transcriptomics (ST) is a novel technique that simultaneously captures pathological images and gene expression profiles with spatial coordinates. Since ST is closely related to pathological features such as disease subtypes, it may be valuable to augment image representations with this pathological information. However, there have been no attempts to leverage ST for image recognition (i.e., patch-level classification of pathological image subtypes). One major challenge is the significant batch effects in ST, which make it difficult to extract the pathological features of images from ST. In this paper, we propose a batch-agnostic contrastive learning framework that can extract consistent signals from ST gene expression across multiple patients. To extract these consistent signals, we utilize a batch-agnostic gene encoder trained via variational inference. Experiments demonstrated the effectiveness of our framework on a publicly available dataset. Code is publicly available at https://github.com/naivete5656/TPIRBAE

Paper Structure

This paper contains 11 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: (a) A paired whole slide image and spatial transcriptomics data. Gene expression is captured at spots, each of which records the expression levels of tens of thousands of genes together with its spatial coordinates. (b) UMAP projection of the data distribution on the her2st dataset [andersson2020spatial]. The left panel is colored by subtype class label and the right panel by patient label.
  • Figure 2: Overview of our method. scVI or scANVI is trained on the gene expression data, and its encoder is reused for contrastive learning.
  • Figure 3: Visualization of the feature space of our method with scVI and scANVI using $L_\mathrm{SI}$, colored by subtype label.