NEURO-GUARD: Neuro-Symbolic Generalization and Unbiased Adaptive Routing for Diagnostics -- Explainable Medical AI

Midhat Urooj, Ayan Banerjee, Sandeep Gupta

TL;DR

NEURO-GUARD tackles the interpretability and cross-domain generalization gap in medical imaging by fusing Vision Transformers with a knowledge-grounded, LLM-driven reasoning pipeline. It uses retrieval-augmented generation to assemble clinical rules and an LLM-based code synthesis engine to create executable feature detectors, which are refined through an entropic reinforcement-learning self-verification loop. The approach yields state-of-the-art results on diabetic retinopathy datasets (boosting accuracy by ~6.2% over ViT baselines and improving cross-domain generalization by ~5%) and strong cross-modal performance in MRI seizure onset zone detection (83.27% AUC), while providing intrinsic, clinically grounded explanations. This work demonstrates that grounding subsymbolic vision in symbolic medical knowledge can produce interpretable, robust diagnostics with practical potential for clinical deployment.

Abstract

Accurate yet interpretable image-based diagnosis remains a central challenge in medical AI, particularly in settings characterized by limited data, subtle visual cues, and high-stakes clinical decision-making. Most existing vision models rely on purely data-driven learning and produce black-box predictions with limited interpretability and poor cross-domain generalization, hindering their real-world clinical adoption. We present NEURO-GUARD, a novel knowledge-guided vision framework that integrates Vision Transformers (ViTs) with language-driven reasoning to improve performance, transparency, and domain robustness. NEURO-GUARD employs a retrieval-augmented generation (RAG) mechanism for self-verification, in which a large language model (LLM) iteratively generates, evaluates, and refines feature-extraction code for medical images. By grounding this process in clinical guidelines and expert knowledge, the framework progressively enhances feature detection and classification beyond purely data-driven baselines. Extensive experiments on diabetic retinopathy classification across four benchmark datasets (APTOS, EyePACS, Messidor-1, and Messidor-2) demonstrate that NEURO-GUARD improves accuracy by 6.2% over a ViT-only baseline (84.69% vs. 78.4%) and achieves a 5% gain in domain generalization. Additional evaluations on MRI-based seizure detection further confirm its cross-domain robustness, with NEURO-GUARD consistently outperforming existing methods. Overall, NEURO-GUARD bridges symbolic medical reasoning with subsymbolic visual learning, enabling interpretable, knowledge-aware, and generalizable medical image diagnosis while achieving state-of-the-art performance across multiple datasets.
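
As a quick check on the headline number, the short calculation below uses only the two accuracies quoted in the abstract. It suggests that the reported gain of about 6.2% is best read as an absolute difference in percentage points and not as a relative improvement; the variable names are illustrative and do not come from the paper.

```python
# Accuracies quoted in the abstract, in percent.
vit_baseline = 78.4
neuro_guard = 84.69

# Absolute gain in percentage points.
absolute_gain = neuro_guard - vit_baseline

# Relative gain, expressed as a percentage of the baseline accuracy.
relative_gain = 100 * absolute_gain / vit_baseline

print(f"absolute gain: {absolute_gain:.2f} percentage points")
print(f"relative gain: {relative_gain:.2f}% of the baseline")
```

Under these figures the absolute difference is 6.29 percentage points and the relative improvement is roughly 8%, so the quoted gain lines up with the absolute difference.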

Paper Structure

This paper contains 24 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Performance comparison of existing models versus the NEURO-GUARD framework for 5-stage Diabetic Retinopathy classification.
  • Figure 2: Overview of the NEURO-GUARD framework. The system integrates medical knowledge with multimodal imaging to enhance disease classification and provide clinically aligned, interpretable explanations with spatial localization.
  • Figure 3: NEURO-GUARD Framework for Knowledge-Driven Medical AI. The NEURO-GUARD pipeline integrates RAG-based knowledge extraction, reinforcement learning-based self-verification, and multi-class classification.
  • Figure 4: Flow diagram of the NEURO-GUARD framework.
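
The generate, evaluate, and refine cycle described in the abstract can be pictured with a deliberately tiny sketch. This is not the authors' implementation: there is no LLM, retrieval step, or reinforcement learning here. A hypothetical `propose_detector` stands in for code synthesis by returning a simple threshold rule, the toy `SAMPLES` are invented for illustration, and "refinement" is reduced to keeping whichever candidate scores best on those samples.

```python
from typing import Callable, List, Tuple

# Invented toy data: one hypothetical lesion-area score per sample, plus a binary label.
SAMPLES: List[Tuple[float, int]] = [
    (0.05, 0), (0.10, 0), (0.30, 1), (0.45, 1), (0.20, 0), (0.60, 1),
]

Detector = Callable[[float], int]


def propose_detector(threshold: float) -> Detector:
    """Stand-in for LLM code synthesis: returns the rule 'score > threshold'."""
    return lambda score: int(score > threshold)


def evaluate(detector: Detector) -> float:
    """Fraction of toy samples the candidate detector labels correctly."""
    correct = sum(detector(score) == label for score, label in SAMPLES)
    return correct / len(SAMPLES)


def refine(thresholds: List[float]) -> Tuple[float, float]:
    """Try each candidate in turn and keep the one with the highest accuracy."""
    best_t = thresholds[0]
    best_acc = evaluate(propose_detector(best_t))
    for t in thresholds[1:]:
        acc = evaluate(propose_detector(t))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


best_threshold, best_accuracy = refine([0.05, 0.15, 0.25, 0.50])
print(best_threshold, best_accuracy)
```

In the framework as summarized above, the candidates would be generated code grounded in retrieved clinical rules, and the feedback signal would come from the self-verification loop; the sketch only shows the shape of the iteration.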