DiagnoLLM: A Hybrid Bayesian Neural Language Framework for Interpretable Disease Diagnosis
Bowen Xu, Xinyue Zeng, Jiazhen Hu, Tuo Wang, Adithya Kulkarni
TL;DR
DiagnoLLM addresses the need for interpretable, cell-type-aware disease diagnosis from bulk transcriptomics by integrating three components: GP-unmix, a deconvolution model for cell-type-specific (CTS) expression; an eQTL-guided neural classifier; and an LLM-based post-hoc interpretation module. GP-unmix provides uncertainty-aware CTS expression estimates, and eQTL priors orient the classifier toward regulatory biology; the resulting classifier reaches 88.0% accuracy in Alzheimer's Disease (AD) prediction. The LLM translates predictions and feature attributions into audience-specific narratives grounded in domain knowledge. Case studies and a simulated user study indicate that these explanations are actionable and trusted. Overall, the framework demonstrates that a neuro-symbolic pipeline can deliver reliable, interpretable clinical diagnostics without sacrificing predictive performance.
Abstract
Building trustworthy clinical AI systems requires not only accurate predictions but also transparent, biologically grounded explanations. We present \texttt{DiagnoLLM}, a hybrid framework that integrates Bayesian deconvolution, eQTL-guided deep learning, and LLM-based narrative generation for interpretable disease diagnosis. DiagnoLLM begins with GP-unmix, a Gaussian Process-based hierarchical model that infers cell-type-specific gene expression profiles from bulk and single-cell RNA-seq data while modeling biological uncertainty. These features, combined with regulatory priors from eQTL analysis, power a neural classifier that achieves high predictive performance in Alzheimer's Disease (AD) detection (88.0\% accuracy). To support human understanding and trust, we introduce an LLM-based reasoning module that translates model outputs into audience-specific diagnostic reports, grounded in clinical features, attribution signals, and domain knowledge. Human evaluations confirm that these reports are accurate, actionable, and appropriately tailored for both physicians and patients. Our findings show that LLMs, when deployed as post-hoc reasoners rather than end-to-end predictors, can serve as effective communicators within hybrid diagnostic pipelines.
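To make the post-hoc role of the LLM concrete, the following is a minimal sketch of how a classifier output and its feature attributions might be assembled into an audience-specific prompt. It is not the authors' implementation: the names `Prediction`, `AUDIENCE_STYLE`, and `build_report_prompt`, the prompt wording, and the placeholder feature names are all hypothetical, and no LLM is called. The sketch only illustrates that the prediction is passed to the language model as a fixed input to explain, not as something to compute.

```python
from dataclasses import dataclass

# Hypothetical audience styles; the wording is illustrative only.
AUDIENCE_STYLE = {
    "physician": "Use clinical terminology and refer to the listed features.",
    "patient": "Use plain language and avoid jargon.",
}


@dataclass
class Prediction:
    label: str          # e.g. "AD" or "control"
    probability: float  # classifier confidence in [0, 1]


def build_report_prompt(pred, attributions, audience, top_k=3):
    """Build a prompt asking an LLM to explain a fixed classifier output.

    `attributions` maps a feature name to a signed attribution score.
    Only the `top_k` features with the largest absolute score are included.
    """
    if audience not in AUDIENCE_STYLE:
        raise ValueError(f"unknown audience: {audience!r}")

    # Rank features by attribution magnitude, largest first.
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    feature_lines = [
        f"- {name}: attribution {score:+.2f}" for name, score in ranked[:top_k]
    ]

    return "\n".join([
        f"Classifier output (fixed, do not revise): {pred.label} "
        f"with probability {pred.probability:.2f}.",
        "Most influential cell-type-specific features:",
        *feature_lines,
        f"Audience: {audience}. {AUDIENCE_STYLE[audience]}",
        "Explain the prediction using only the information above.",
    ])
```

Calling `build_report_prompt` with the same prediction and attributions but a different `audience` changes only the style instruction, which mirrors the separation between prediction and communication described in the abstract.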
