Explainable AI model reveals disease-related mechanisms in single-cell RNA-seq data
Mohammad Usman, Olga Varea, Petia Radeva, Josep Canals, Jordi Abante, Daniel Ortiz
TL;DR
This study addresses the challenge of interpreting neurodegenerative disease mechanisms from single-cell data by integrating a neural-network classifier with SHAP-based explainable AI (XAI) to identify Huntington's disease (HD)-associated genes at single-cell resolution. The approach compares SHAP-informed gene importance with traditional differential gene expression (DGE) analysis using DESeq2, then applies Gene Set Enrichment Analysis (GSEA) to reveal affected pathways in direct- and indirect-pathway spiny projection neurons (SPNs). The two methods both overlap and diverge: SHAP uncovers additional HD-relevant genes and pathways that DGE alone does not capture, offering a broader mechanistic view. The framework demonstrates the value of XAI for extracting actionable, cell-type–specific insights from single-cell transcriptomics and can be extended to multi-omics data and other diseases.
Abstract
Neurodegenerative diseases (NDDs) are complex and lack effective treatments because their mechanisms remain poorly understood. Single-nucleus RNA sequencing (snRNA-seq) is increasingly used to explore transcriptomic events at single-cell resolution, yet interpreting the mechanisms underlying a disease from such data remains challenging. Neural network (NN) models can handle complex data and offer insights, but they are often regarded as black boxes with poor interpretability. In this context, explainable AI (XAI) emerges as a solution that, combined with efficient NN models, could help to understand disease-associated mechanisms. However, little research has explored XAI for single-cell data. In this work, we implement a method, based on an NN model combined with SHAP, for identifying disease-related genes and mechanistically explaining disease progression. We analyze available Huntington's disease (HD) data to identify both HD-altered genes and mechanisms, applying Gene Set Enrichment Analysis (GSEA) to the results of two methods: differential gene expression analysis (DGE) and the NN-with-SHAP approach. Our results show that the DGE and SHAP approaches yield both shared and distinct sets of altered genes and pathways, reinforcing the usefulness of XAI methods for gaining a broader perspective on the disease.
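
To illustrate how SHAP-style attribution can rank genes, the following minimal sketch computes exact Shapley values for a single cell by enumerating every gene coalition. It is not the authors' implementation: the gene names, expression values, baseline, and `toy_score` function are all hypothetical stand-ins for a trained NN classifier, and the SHAP library approximates these values rather than enumerating coalitions as done here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    A feature outside the coalition is replaced by its baseline value.
    Cost grows as 2**n, so this is only feasible for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Input with feature i absent, then with feature i present.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical gene names; a toy scoring function stands in for a trained classifier.
genes = ["GENE_A", "GENE_B", "GENE_C"]


def toy_score(v):
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[0] * v[2]


cell = [1.0, 2.0, 3.0]      # expression of one cell (made-up values)
baseline = [0.0, 0.0, 0.0]  # reference expression (made-up values)

phi = shapley_values(toy_score, cell, baseline)

# Rank genes by absolute contribution to the score for this cell.
ranking = sorted(genes, key=lambda g: abs(phi[genes.index(g)]), reverse=True)
print(dict(zip(genes, phi)))
print(ranking)
```

The attributions sum to the difference between the score for the cell and the score for the baseline, which is the property that makes per-gene contributions comparable across cells. A gene can rank highly here because of an interaction with another gene (as with the hypothetical `GENE_A` and `GENE_C` term), which is one reason an attribution-based ranking may differ from a per-gene differential expression test.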
