scASDC: Attention Enhanced Structural Deep Clustering for Single-cell RNA-seq Data
Wenwen Min, Zhen Wang, Fangfang Zhu, Taosheng Xu, Shunfang Wang
TL;DR
The paper tackles the challenge of clustering sparse and noisy scRNA-seq data by proposing scASDC, a deep clustering framework that jointly learns content information from a ZINB-based autoencoder and high-order cell relationships from a graph autoencoder. The two representations are fused by a layer-wise attention mechanism and reinforced by a self-supervised objective, enabling end-to-end clustering. Across six diverse datasets, scASDC outperforms seven baselines in normalized mutual information (NMI) and adjusted Rand index (ARI), with ablations confirming the contribution of each module. The approach yields robust cell-type delineation and supports downstream biological interpretation, advancing accurate analysis of cellular heterogeneity in scRNA-seq data.
Abstract
Single-cell RNA sequencing (scRNA-seq) data analysis is pivotal for understanding cellular heterogeneity. However, the high sparsity and complex noise patterns inherent in scRNA-seq data present significant challenges for traditional clustering methods. To address these issues, we propose a deep clustering method, Attention-Enhanced Structural Deep Embedding Graph Clustering (scASDC), which integrates multiple modules to improve clustering accuracy and robustness. Our approach employs a multi-layer graph convolutional network (GCN), which we term the graph autoencoder module, to capture high-order structural relationships between cells. To mitigate the oversmoothing issue in GCNs, we introduce a zero-inflated negative binomial (ZINB)-based autoencoder module that extracts content information from the data and learns latent representations of gene expression. The two modules are integrated through an attention fusion mechanism, which combines gene expression and structural information at each layer of the GCN. Additionally, a self-supervised learning module is incorporated to enhance the robustness of the learned embeddings. Extensive experiments demonstrate that scASDC outperforms existing state-of-the-art methods, providing a robust and effective solution for single-cell clustering tasks. Our method paves the way for more accurate and meaningful analysis of single-cell RNA sequencing data, contributing to a better understanding of cellular heterogeneity and biological processes. All code and public datasets used in this paper are available at \url{https://github.com/wenwenmin/scASDC} and \url{https://zenodo.org/records/12814320}.
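To make the ZINB-based reconstruction idea concrete, the following is a minimal sketch of the standard zero-inflated negative binomial negative log-likelihood for a single count. It is not taken from the scASDC code: the function names `nb_log_pmf` and `zinb_nll` and the parameter names `mu` (mean), `theta` (dispersion), and `pi` (dropout probability) are assumptions chosen for illustration, and the abstract does not specify how the authors parameterize or weight this loss.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log pmf of a negative binomial with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count x under a ZINB model.

    pi is the probability of a structural zero (hypothetical name);
    with probability 1 - pi the count is drawn from NB(mu, theta).
    """
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from dropout or from the NB component itself.
        lik = pi + (1.0 - pi) * nb
    else:
        # A nonzero count can only come from the NB component.
        lik = (1.0 - pi) * nb
    return -math.log(lik)
```

In an autoencoder setting, a decoder would typically output `mu`, `theta`, and `pi` for every gene in every cell, and the reconstruction loss would be this quantity averaged over all entries of the count matrix. Modeling zeros explicitly is what lets such a loss accommodate the sparsity of scRNA-seq counts.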
