Multi-modal Spatial Clustering for Spatial Transcriptomics Utilizing High-resolution Histology Images

Bingjun Li, Mostafa Karami, Masum Shah Junayed, Sheida Nabavi

TL;DR

This paper proposes the spatial transcriptomics multi-modal clustering (stMMC) model, a novel contrastive learning-based deep learning approach that integrates gene expression data with histology image features through a multi-modal parallel graph autoencoder.

Abstract

Understanding the intricate cellular environment within biological tissues is crucial for uncovering insights into complex biological functions. While single-cell RNA sequencing has significantly enhanced our understanding of cellular states, it lacks the spatial context necessary to fully comprehend the cellular environment. Spatial transcriptomics (ST) addresses this limitation by enabling transcriptome-wide gene expression profiling while preserving spatial context. One of the principal challenges in ST data analysis is spatial clustering, which reveals spatial domains based on the spots within a tissue. Modern ST sequencing procedures typically include a high-resolution histology image, which has been shown in previous studies to be closely connected to gene expression profiles. However, current spatial clustering methods often fail to fully integrate high-resolution histology image features with gene expression data, limiting their ability to capture critical spatial and cellular interactions. In this study, we propose the spatial transcriptomics multi-modal clustering (stMMC) model, a novel contrastive learning-based deep learning approach that integrates gene expression data with histology image features through a multi-modal parallel graph autoencoder. We tested stMMC against four state-of-the-art baseline models (Leiden, GraphST, SpaGCN, and stLearn) on two public ST datasets with 13 sample slices in total. The experiments demonstrated that stMMC outperforms all the baseline models in terms of adjusted Rand index (ARI) and normalized mutual information (NMI). An ablation study further validated the contributions of contrastive learning and the incorporation of histology image features.
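
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how ARI and NMI can be computed from ground-truth and predicted spot labels. It uses only the Python standard library. The function names are hypothetical, and the arithmetic-mean normalization used for NMI is an assumption; the abstract does not state which normalization the authors used.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same spots.

    Assumes at least two spots, so the total number of pairs is nonzero.
    """
    n = len(truth)
    # Pairs of spots that share a cell of the contingency table.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_a = sum(comb(c, 2) for c in Counter(truth).values())
    sum_b = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def normalized_mutual_info(truth, pred):
    """NMI with arithmetic-mean normalization (an assumption)."""
    n = len(truth)
    ca, cb = Counter(truth), Counter(pred)
    cab = Counter(zip(truth, pred))
    mi = sum(c / n * log(c * n / (ca[a] * cb[b])) for (a, b), c in cab.items())
    h_a = -sum(c / n * log(c / n) for c in ca.values())
    h_b = -sum(c / n * log(c / n) for c in cb.values())
    if h_a == 0 and h_b == 0:  # both labelings are a single cluster
        return 1.0
    return mi / ((h_a + h_b) / 2)
```

Both scores are invariant to how cluster labels are named, which is why they suit spatial clustering, where predicted domain IDs carry no inherent meaning.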

Paper Structure

This paper contains 13 sections, 8 equations, 16 figures, 2 tables.

Figures (16)

  • Figure 1: The overall structure of the proposed model, stMMC. Trapezoids represent GCN layers, and rectangles represent extracted features. Dashed lines with double arrowheads indicate that the two GCNs share the same weights. stMMC takes two data modalities and passes them through the multi-modal parallel graph autoencoder (MPGA), where each modality is regulated by a contrastive learning mechanism. The detailed process of contrastive learning is shown in Figure 2. The MPGA reconstructs refined gene expression data, which are then used for spatial clustering.
  • Figure 2: The detailed process of the contrastive learning mechanism for a random spot on either modality. The top row is the corrupted graph, and the bottom row is the original graph. The mechanism has three steps: 1) obtaining the learned spot feature with a GCN; 2) computing the original local community representation and the corrupted one; 3) assigning a positive pair to the learned feature and the community representation from the same graph, and a negative pair to the learned feature and the community representation from different graphs. Positive pairs are shown in blue and negative pairs in red.
  • Figure 3: The ARI scores of stMMC and all four baseline models on DLPFC datasets, where the $Y$ axis is the ARI score and the $X$ axis is the data slice number.
  • Figure 4: The NMI scores of stMMC and all four baseline models on DLPFC datasets, where the $Y$ axis is the NMI score and the $X$ axis is the data slice number.
  • Figure 5: Ground Truth
  • ...and 11 more figures
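
The three contrastive learning steps described in the Figure 2 caption can be illustrated with a rough NumPy sketch for a single modality. This is not the authors' implementation. Several details are assumptions made for illustration: the corrupted graph is produced by shuffling spot features across nodes, the local community representation is a sigmoid of the mean embedding over each spot's neighbours, and pairs are scored with a bilinear discriminator trained by binary cross-entropy. All function and parameter names are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, x, w):
    """One GCN layer with ReLU activation."""
    return np.maximum(a_norm @ x @ w, 0.0)


def community_representation(adj, h):
    """Assumed readout: sigmoid of the mean embedding over each spot's
    neighbourhood (the spot itself included)."""
    a_hat = adj + np.eye(adj.shape[0])
    mean_h = (a_hat @ h) / a_hat.sum(axis=1, keepdims=True)
    return 1.0 / (1.0 + np.exp(-mean_h))


def contrastive_loss(adj, x, w, w_disc, rng):
    """Binary cross-entropy over positive and negative pairs."""
    a_norm = normalize_adjacency(adj)
    # Assumed corruption: permute feature rows, keep the graph structure.
    x_corrupt = x[rng.permutation(x.shape[0])]

    # Step 1: learned spot features; both graphs share the GCN weights.
    h = gcn_layer(a_norm, x, w)
    h_c = gcn_layer(a_norm, x_corrupt, w)

    # Step 2: original and corrupted local community representations.
    g = community_representation(adj, h)
    g_c = community_representation(adj, h_c)

    # Step 3: bilinear score for each (feature, community) pair.
    def score(feat, comm):
        return np.einsum("nd,de,ne->n", feat, w_disc, comm)

    pos = np.concatenate([score(h, g), score(h_c, g_c)])  # same graph
    neg = np.concatenate([score(h, g_c), score(h_c, g)])  # different graphs

    # Cross-entropy on logits: softplus(-pos) + softplus(neg).
    loss = np.mean(np.logaddexp(0.0, -pos)) + np.mean(np.logaddexp(0.0, neg))
    return float(loss)
```

In a full model the weights `w` and `w_disc` would be trained by gradient descent in an autodiff framework; the sketch only evaluates the loss for fixed weights to show how the pairs are formed.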