An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning
Andrew Zamai, Nathanael Fijalkow, Boris Mansencal, Laurent Simon, Eloi Navet, Pierrick Coupe
TL;DR
This paper tackles the difficulty of differentiating neurodegenerative dementias using MRI by converting high-resolution brain scans into textual radiology reports and leveraging reasoning-capable LLMs to perform differential diagnosis. A modular pipeline (MRI segmentation, volume ratios, normative atrophy modeling, and radiology report generation) provides semantically meaningful inputs to LLMs, while reinforcement learning with Group Relative Policy Optimization (GRPO) trains lightweight models to produce structured, anatomically grounded diagnostic rationales during inference. The study demonstrates that GRPO-finetuned 8B models can match or surpass larger models in diagnostic accuracy and generate coherent, hypothesis-driven rationales, outperforming traditional classification-only approaches. This work highlights the potential of inference-time reasoning with interpretable explanations to enhance clinical trust and utility in AI-driven neuroimaging diagnostics, offering a practical framework for transparent, data-efficient medical AI. The core mathematical construct, the Structural Deviation Score ($SDS$), anchors the qualitative report generation to normative neuroanatomical trajectories, enabling standardized severity mapping across brain regions.
Abstract
The differential diagnosis of neurodegenerative dementias is a challenging clinical task, mainly because of the overlap in symptom presentation and the similarity of patterns observed in structural neuroimaging. To improve diagnostic efficiency and accuracy, deep learning-based methods such as Convolutional Neural Networks and Vision Transformers have been proposed for the automatic classification of brain MRIs. However, despite their strong predictive performance, these models find limited clinical utility due to their opaque decision-making. In this work, we propose a framework that integrates two core components to enhance diagnostic transparency. First, we introduce a modular pipeline for converting 3D T1-weighted brain MRIs into textual radiology reports. Second, we explore the potential of modern Large Language Models (LLMs) to assist clinicians in the differential diagnosis between frontotemporal dementia subtypes, Alzheimer's disease, and normal aging based on the generated reports. To bridge the gap between predictive accuracy and explainability, we employ reinforcement learning to incentivize diagnostic reasoning in LLMs. Without requiring supervised reasoning traces or distillation from larger models, our approach enables the emergence of structured diagnostic rationales grounded in neuroimaging findings. Unlike post-hoc explainability methods that retrospectively justify model decisions, our framework generates diagnostic rationales as part of the inference process, producing causally grounded explanations that inform and guide the model's decision-making process. In doing so, our framework matches the diagnostic performance of existing deep learning methods while offering rationales that support its diagnostic conclusions.
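To make the reinforcement learning step more concrete, the following is a minimal sketch of the group-relative advantage computation that gives GRPO its name: several responses are sampled for the same report, each is scored, and each reward is standardized against the other rewards in its group. The function name `group_relative_advantages`, the binary reward values, and the use of the population standard deviation are illustrative assumptions, not details taken from this work.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the other samples in its group.

    `rewards` holds one scalar per sampled response to the same prompt.
    `eps` (an assumed small constant) avoids division by zero when every
    response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled diagnoses of one report:
# 1.0 if the predicted diagnosis is correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group with identical rewards carries no learning signal.
uniform = group_relative_advantages([1.0, 1.0, 1.0])
```

Because advantages are computed relative to the group, correct responses are reinforced and incorrect ones penalized without a separately learned value function; a group in which all responses score the same yields zero advantage for every sample.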
