Multi-Slice Spatial Transcriptomics Data Integration Analysis with STG3Net
Donghai Fang, Fangfang Zhu, Wenwen Min
TL;DR
STG3Net tackles batch effects in multi-slice spatial transcriptomics by integrating a masked graph autoencoder backbone with adversarial learning and a novel Global Nearest Neighbor (G2N) anchor-pair triplet mechanism. The framework jointly learns a robust latent space for cross-slice spatial domain identification and batch correction, aided by data augmentation, per-slice adjacency graphs, and a block-diagonal global graph. Key contributions include the plug-and-play G2N method, a masked self-supervised encoder, and comprehensive evaluations on three platform-diverse datasets (DLPFC, AMB, ME), along with thorough ablations that validate each component and objective function. STG3Net demonstrates superior accuracy, consistency, and batch correction (F1_LISI) while preserving biological variability and connectivity across slices, enabling more reliable cross-slice spatial analyses in SRT studies.
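As an illustration of the graph construction mentioned above, the following minimal sketch builds one adjacency graph per slice and stacks them into a block-diagonal global graph, so that no spatial edge crosses a slice boundary. The function names (`knn_adjacency`, `block_diagonal`), the k-nearest-neighbour rule, and the toy coordinates are assumptions made for illustration; they are not taken from the STG3Net source code.

```python
import numpy as np


def knn_adjacency(coords, k):
    """Symmetric binary k-nearest-neighbour adjacency from spot coordinates.

    Hypothetical helper: the neighbour rule used by STG3Net may differ.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise squared Euclidean distances between all spots in one slice.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a spot is never its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=np.int8)
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = 1
    # Symmetrise: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


def block_diagonal(adjs):
    """Place per-slice adjacency matrices on the diagonal of one global matrix."""
    sizes = [a.shape[0] for a in adjs]
    out = np.zeros((sum(sizes), sum(sizes)), dtype=np.int8)
    offset = 0
    for a, s in zip(adjs, sizes):
        out[offset:offset + s, offset:offset + s] = a
        offset += s
    return out


# Toy data: two slices with 5 and 4 spots (random 2-D coordinates).
rng = np.random.default_rng(0)
slice_a = rng.random((5, 2))
slice_b = rng.random((4, 2))
global_adj = block_diagonal([knn_adjacency(slice_a, 2), knn_adjacency(slice_b, 2)])
```

In this sketch the off-diagonal blocks of `global_adj` stay zero, which is the property a block-diagonal global graph encodes: spatial neighbourhoods are defined only within a slice, and any cross-slice alignment has to come from the learned latent space.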
Abstract
With the rapid development of Spatially Resolved Transcriptomics (SRT) technologies, which map gene expression within tissue sections, the integrative analysis of multiple SRT slices has become increasingly important. However, batch effects between slices pose significant challenges for such analyses. To address these challenges, we have developed a plug-and-play batch correction method called Global Nearest Neighbor (G2N) anchor pairs selection. G2N mitigates batch effects by selecting representative anchor pairs across slices. Building upon G2N, we propose STG3Net, which uses masked graph convolutional autoencoders as its backbone modules. These autoencoders, integrated with generative adversarial learning, enable STG3Net to achieve robust multi-slice spatial domain identification and batch correction. We comprehensively evaluate STG3Net on three multi-slice SRT datasets from different platforms, considering accuracy, consistency, and the F1_LISI metric (a measure of batch effect correction efficiency). Compared to existing methods, STG3Net achieves the best overall performance while preserving the biological variability and connectivity between slices. Source code and all public datasets used in this paper are available at https://github.com/wenwenmin/STG3Net and https://zenodo.org/records/12737170.
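To make the idea of cross-slice anchor pairs concrete, the sketch below selects mutual nearest neighbours between the latent embeddings of two slices. This is a generic mutual-nearest-neighbour rule used here as a stand-in; it is not the G2N procedure itself, whose details are not given in the abstract. The function name `mutual_nearest_pairs` and the toy embeddings are hypothetical.

```python
import numpy as np


def mutual_nearest_pairs(z_a, z_b):
    """Return (i, j) index pairs where spot i in slice A and spot j in
    slice B are each other's nearest neighbour in embedding space.

    Generic mutual-nearest-neighbour rule, assumed for illustration only.
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # Squared Euclidean distance between every A spot and every B spot.
    d = ((z_a[:, None, :] - z_b[None, :, :]) ** 2).sum(axis=-1)
    nn_ab = d.argmin(axis=1)  # nearest B spot for each A spot
    nn_ba = d.argmin(axis=0)  # nearest A spot for each B spot
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]


# Toy embeddings: the third spot of each slice has no mutual partner.
z_a = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
z_b = [[0.1, 0.0], [1.0, 1.1], [9.0, 9.0]]
pairs = mutual_nearest_pairs(z_a, z_b)
print(pairs)  # [(0, 0), (1, 1)]
```

Requiring the match to hold in both directions discards one-sided matches such as the third spot of slice A, which is one common way to keep only representative pairs when the slices do not share all spatial domains.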
