Detecting Spatial Dependence in Transcriptomics Data using Vectorised Persistence Diagrams

Katharina Limbeck, Bastian Rieck

TL;DR

This work addresses the detection of spatial dependence in spatial transcriptomics using persistent homology (PH) and functional persistence summaries. It introduces a non-parametric one-sample permutation test based on Betti curves and persistence landscapes to identify spatially variable genes (SVGs), and compares it against Moran's I, sepal, and SpatialDE on simulated and real data. The PH-based methods are robust to zero-inflation and noise, achieve higher specificity, provide information complementary to existing approaches, and often identify the top SVGs stably. Vectorised implementations make the approach scalable, and the framework offers a flexible way to integrate topology into spatial omics analyses, with potential applicability to other spatial graph data beyond transcriptomics.

Abstract

Evaluating spatial patterns in data is an integral task across various domains, including geostatistics, astronomy, and spatial tissue biology. The analysis of transcriptomics data in particular relies on methods for detecting spatially-dependent features that exhibit significant spatial patterns for both explanatory analysis and feature selection. However, given the complex and high-dimensional nature of these data, there is a need for robust, stable, and reliable descriptors of spatial dependence. We leverage the stability and multiscale properties of persistent homology to address this task. To this end, we introduce a novel framework using functional topological summaries, such as Betti curves and persistence landscapes, for identifying and describing non-random patterns in spatial data. In particular, we propose a non-parametric one-sample permutation test for spatial dependence and investigate its utility across both simulated and real spatial omics data. Our vectorised approach outperforms baseline methods at accurately detecting spatial dependence. Further, we find that our method is more robust to outliers than alternative tests using Moran's I.
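To make the idea of a one-sample permutation test on a functional topological summary concrete, the following is a minimal sketch under several assumptions that are not taken from the paper: the graph is a simple path graph (so that the Betti-0 curve can be computed by counting runs), the test statistic is the L1 distance between the observed curve and the mean curve under random permutations of the values, and the names `betti0_curve_path` and `permutation_test` are hypothetical. The statistic actually used by the authors may differ.

```python
import numpy as np

def betti0_curve_path(values, thresholds):
    """Betti-0 curve of the superlevel sets {v : f(v) >= t} on a path graph.

    On a path graph, each connected component is a maximal run of
    consecutive vertices whose value is at least t, so counting the
    starts of such runs gives the number of components.
    """
    values = np.asarray(values, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    mask = values[None, :] >= thresholds[:, None]   # shape (m, n)
    run_starts = mask[:, 1:] & ~mask[:, :-1]        # False -> True transitions
    return mask[:, 0].astype(int) + run_starts.sum(axis=1)

def permutation_test(values, summary_fn, n_perm=199, seed=0):
    """One-sample permutation test for spatial dependence (illustrative).

    Null hypothesis: the values are assigned to vertices at random.
    Statistic (an assumption): L1 distance of a curve from the mean
    curve obtained under permutations of the values.
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    observed = summary_fn(values)
    # Shuffling values over vertices destroys any spatial structure.
    null = np.stack([summary_fn(rng.permutation(values)) for _ in range(n_perm)])
    null_mean = null.mean(axis=0)
    obs_stat = np.abs(observed - null_mean).sum()
    null_stats = np.abs(null - null_mean).sum(axis=1)
    # The +1 terms count the observed sample itself, so p is never 0.
    return (1 + np.sum(null_stats >= obs_stat)) / (1 + n_perm)

# Toy usage: a smooth signal with mild noise along a path graph.
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 200)
signal = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)
grid = np.linspace(signal.min(), signal.max(), 32)
p_value = permutation_test(signal, lambda v: betti0_curve_path(v, grid))
print(f"p-value for the structured toy signal: {p_value:.3f}")
```

Because permutations preserve the set of values, the same threshold grid can be reused for every permuted sample. For real spatial omics data, the path graph would be replaced by a spatial neighbourhood graph and the run-counting step by a proper persistent homology computation.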

Paper Structure

This paper contains 39 sections, 12 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Overview of the proposed analysis pipeline (1) starting from a spatial omics dataset that is converted to a graph, (2) defining a superlevel-set filtration based on the expression values of one gene, (3) computing persistent homology by tracking the evolution of topological features via persistence diagrams, and (4) summarising persistence based on summary statistics or functional summaries.
  • Figure 2: Simulated spatial patterns with varying effect sizes for each spatial domain as indicated in the legend (row 1). Examples of simulated genes with these spatial signals (row 2). From left to right, the simulated patterns show a gradient, a cell ring, a set of clusters, and two lines.
  • Figure 3: Area under the precision-recall curve (AUPRC) across varying degrees of zero inflation for different spatial variability detection methods. Slim lines indicate the standard deviation across 1000 bootstrap re-samples.
  • Figure 4: Sensitivity and specificity using a fixed 0.05 cut-off after multiple testing correction. Slim lines indicate the standard deviation across 1000 bootstrap re-samples.
  • Figure 5: Spearman correlation between the rankings given by different spatial variability detection methods across the four spatial patterns from Figure 2. Generally, the correlation between methods is low to medium, with the highest agreement between total persistence and Betti curves. Persistence thus captures distinct characteristics of spatial variability that are not identified by other approaches.
  • ...and 4 more figures
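
The pipeline described in the caption of Figure 1 can be sketched for degree-0 homology as follows. This is a minimal illustration, not the authors' implementation: the function names `superlevel_persistence_0d` and `betti_curve` are hypothetical, the union-find approach is a standard way of computing 0-dimensional persistence, and pairing the essential component with the global minimum is a convention assumed here so that every bar is finite.

```python
import numpy as np

def superlevel_persistence_0d(values, edges):
    """0-dimensional persistence pairs (birth, death) of the superlevel-set
    filtration of a graph with one value per vertex (e.g. one gene).

    Vertices enter in order of decreasing value, so birth >= death.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    parent = list(range(n))  # union-find; each root is its component's maximum

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    added = np.zeros(n, dtype=bool)
    pairs = []
    for v in np.argsort(-values, kind="stable"):
        v = int(v)
        added[v] = True
        for w in neighbours[v]:
            if not added[w]:
                continue
            rv, rw = find(v), find(w)
            if rv == rw:
                continue
            # Elder rule: the component with the higher birth value survives.
            if values[rv] < values[rw]:
                rv, rw = rw, rv
            if values[rw] > values[v]:  # skip zero-persistence pairs
                pairs.append((values[rw], values[v]))
            parent[rw] = rv
    # Assumed convention: surviving components die at the global minimum.
    for r in {find(i) for i in range(n)}:
        pairs.append((values[r], values.min()))
    return np.array(pairs, dtype=float).reshape(-1, 2)

def betti_curve(pairs, thresholds):
    """Number of bars alive at each threshold t, with death < t <= birth."""
    t = np.asarray(thresholds, dtype=float)[:, None]
    births, deaths = pairs[:, 0][None, :], pairs[:, 1][None, :]
    return ((births >= t) & (deaths < t)).sum(axis=1)

# Toy usage: a path graph with five vertices and made-up expression values.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
expression = [3.0, 1.0, 4.0, 2.0, 5.0]
pairs = superlevel_persistence_0d(expression, edges)
total_persistence = float((pairs[:, 0] - pairs[:, 1]).sum())
curve = betti_curve(pairs, [4.5, 3.5, 2.5, 1.5])
print(pairs, total_persistence, curve)
```

Total persistence gives a single summary statistic per gene, while the Betti curve is a functional summary that can be compared across genes or against a permutation null on a shared threshold grid.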