DiagnoLLM: A Hybrid Bayesian Neural Language Framework for Interpretable Disease Diagnosis

Bowen Xu, Xinyue Zeng, Jiazhen Hu, Tuo Wang, Adithya Kulkarni

TL;DR

DiagnoLLM targets interpretable, cell-type-aware disease diagnosis from bulk transcriptomics by combining three components: GP-unmix deconvolution of cell-type-specific (CTS) expression, an eQTL-guided neural classifier, and an LLM-based post-hoc interpretation module. GP-unmix provides uncertainty-aware CTS expression estimates, while eQTL priors orient the classifier toward regulatory biology; the classifier reaches 88.0% accuracy on Alzheimer's Disease (AD) prediction. The LLM translates predictions and feature attributions into audience-specific narratives grounded in domain knowledge. Case studies and a simulated user study show that these explanations are actionable and trusted. Overall, the framework demonstrates that a neuro-symbolic pipeline can deliver reliable, interpretable clinical diagnostics without sacrificing predictive performance.

Abstract

Building trustworthy clinical AI systems requires not only accurate predictions but also transparent, biologically grounded explanations. We present DiagnoLLM, a hybrid framework that integrates Bayesian deconvolution, eQTL-guided deep learning, and LLM-based narrative generation for interpretable disease diagnosis. DiagnoLLM begins with GP-unmix, a Gaussian Process-based hierarchical model that infers cell-type-specific gene expression profiles from bulk and single-cell RNA-seq data while modeling biological uncertainty. These features, combined with regulatory priors from eQTL analysis, power a neural classifier that achieves high predictive performance in Alzheimer's Disease (AD) detection (88.0% accuracy). To support human understanding and trust, we introduce an LLM-based reasoning module that translates model outputs into audience-specific diagnostic reports, grounded in clinical features, attribution signals, and domain knowledge. Human evaluations confirm that these reports are accurate, actionable, and appropriately tailored for both physicians and patients. Our findings show that LLMs, when deployed as post-hoc reasoners rather than end-to-end predictors, can serve as effective communicators within hybrid diagnostic pipelines.
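
As a rough illustration of the deconvolution idea the abstract refers to, the sketch below uses the linear mixing assumption common to bulk deconvolution methods: each bulk sample is modeled as a proportion-weighted sum of cell-type-specific profiles. This is not GP-unmix. It recovers only a population-level mean profile per cell type by ordinary least squares, with no Gaussian Process prior, no per-sample CTS estimates, and no uncertainty modeling. All function names, dimensions, and simulated values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_bulk(proportions, cts_profiles, noise_sd=0.0, seed=0):
    """Mix cell-type-specific profiles into bulk expression.

    proportions:  (n_samples, n_cell_types), rows sum to 1
    cts_profiles: (n_cell_types, n_genes)
    Returns a (n_samples, n_genes) bulk expression matrix.
    """
    rng = np.random.default_rng(seed)
    bulk = proportions @ cts_profiles
    return bulk + rng.normal(0.0, noise_sd, size=bulk.shape)


def estimate_mean_cts(bulk, proportions):
    """Least-squares estimate of the mean profile of each cell type.

    Solves proportions @ S = bulk for S, gene by gene.
    """
    profiles, *_ = np.linalg.lstsq(proportions, bulk, rcond=None)
    return profiles


# Toy data (hypothetical sizes): 50 samples, 3 cell types, 5 genes.
rng = np.random.default_rng(42)
n_samples, n_cell_types, n_genes = 50, 3, 5
proportions = rng.dirichlet(np.ones(n_cell_types), size=n_samples)
true_profiles = rng.gamma(2.0, 2.0, size=(n_cell_types, n_genes))

# Noise-free mixing, so least squares should recover the profiles.
bulk = simulate_bulk(proportions, true_profiles, noise_sd=0.0)
estimated = estimate_mean_cts(bulk, proportions)
max_abs_error = float(np.max(np.abs(estimated - true_profiles)))
```

In the noise-free case the mean profiles are recovered up to numerical precision. With noise or correlated cell-type proportions the plain least-squares estimate degrades, which is one motivation for hierarchical Bayesian approaches of the kind the abstract describes.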

Paper Structure

This paper contains 41 sections, 2 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview of the DiagnoLLM framework. Stage 1 (GP-unmix) performs Bayesian deconvolution of bulk RNA-seq into cell-type-specific (CTS) expression using single-cell references. Stage 2 combines eQTL-informed deep learning (DL) predictions with LLM-based reasoning to produce human-readable diagnostic reports, linking model outputs with clinical interpretability.
  • Figure 2: GP-unmix improves CTS recovery over TCA and bMIND across neuron subtypes in the Tasic dataset (Tasic et al., 2018).
  • Figure 3: GP-unmix outperforms baselines on astrocytes, microglia, and inhibitory subtypes in the Yao dataset (Yao et al., 2021).