MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus

Zheng Li; Jiayi Xu; Zhikai Hu; Hechang Chen; Lele Cong; Yunyun Wang; Shuchao Pang

TL;DR

Experimental results on hepatic disease cases from MIMIC-IV show that MedCoRAG outperforms existing methods and closed-source models in both diagnostic performance and reasoning interpretability.

Abstract

Diagnosing hepatic diseases accurately and interpretably is critical, yet it remains challenging in real-world clinical settings. Existing AI approaches for clinical diagnosis often lack transparency, structured reasoning, and deployability. Recent efforts have leveraged large language models (LLMs), retrieval-augmented generation (RAG), and multi-agent collaboration. However, these approaches typically retrieve evidence from a single source and fail to support iterative, role-specialized deliberation grounded in structured clinical data. To address these limitations, we propose MedCoRAG (short for Medical Collaborative RAG), an end-to-end framework that generates diagnostic hypotheses from standardized abnormal findings and constructs a patient-specific evidence package by jointly retrieving and pruning UMLS knowledge graph paths and clinical guidelines. It then performs Multi-Agent Collaborative Reasoning. A Router Agent dynamically dispatches Specialist Agents based on case complexity. These agents iteratively reason over the evidence and trigger targeted re-retrievals when needed. Finally, a Generalist Agent synthesizes all deliberations into a traceable consensus diagnosis that emulates multidisciplinary consultation. Experimental results on hepatic disease cases from MIMIC-IV show that MedCoRAG outperforms existing methods and closed-source models in both diagnostic performance and reasoning interpretability.
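The control flow described in the abstract (retrieve evidence, route by complexity, let specialists deliberate with optional re-retrieval, then have a generalist synthesize) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every name (`Evidence`, `Opinion`, `diagnose`, the `toy_*` stubs), the `max_rounds` cap, and the toy clinical strings are hypothetical, and the stubs stand in for LLM calls and for UMLS and guideline retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """Patient-specific evidence package (hypothetical structure)."""
    guideline_excerpts: list = field(default_factory=list)
    kg_paths: list = field(default_factory=list)

    def merge(self, other):
        # Append newly retrieved items to the shared package.
        self.guideline_excerpts.extend(other.guideline_excerpts)
        self.kg_paths.extend(other.kg_paths)


@dataclass
class Opinion:
    """One specialist's output for a deliberation round (hypothetical)."""
    agent: str
    diagnosis: str
    needs_more_evidence: bool = False


def diagnose(findings, hypotheses, retrieve, route, specialists, generalist,
             max_rounds=3):
    """Assumed orchestration loop: route, deliberate, re-retrieve, synthesize."""
    evidence = retrieve(findings, hypotheses)
    selected = route(findings, hypotheses)  # empty list means a simple case
    opinions = []
    for _ in range(max_rounds):
        if not selected:
            break  # simple case: go straight to the generalist
        opinions = [specialists[name](findings, evidence) for name in selected]
        pending = [o.diagnosis for o in opinions if o.needs_more_evidence]
        if not pending:
            break  # no specialist asked for more evidence
        evidence.merge(retrieve(findings, pending))  # targeted re-retrieval
    return generalist(findings, evidence, opinions)


# Toy stand-ins for the retrieval and agent components (illustrative only).
def toy_retrieve(findings, hypotheses):
    return Evidence([f"guideline:{h}" for h in hypotheses],
                    [f"path:{h}" for h in hypotheses])


def toy_route(findings, hypotheses):
    # Treat any case with more than one finding as complex.
    return ["hepatology"] if len(findings) > 1 else []


def toy_hepatologist(findings, evidence):
    # Requests another retrieval round until two KG paths are available.
    return Opinion("hepatology", "cirrhosis",
                   needs_more_evidence=len(evidence.kg_paths) < 2)


def toy_generalist(findings, evidence, opinions):
    votes = [o.diagnosis for o in opinions]
    return {
        "diagnosis": votes[0] if votes else "undetermined",
        "evidence_items": len(evidence.guideline_excerpts) + len(evidence.kg_paths),
        "trace": [o.agent for o in opinions],
    }


result = diagnose(["ascites", "low albumin"], ["cirrhosis"],
                  toy_retrieve, toy_route,
                  {"hepatology": toy_hepatologist}, toy_generalist)
print(result)
```

In this sketch the generalist receives the accumulated evidence and the list of specialist opinions, so the returned record can carry a trace of which agents contributed. How MedCoRAG actually scores complexity, prunes evidence, or resolves disagreement is not specified in the abstract and is not modeled here.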

Paper Structure (32 sections, 14 equations, 5 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Medical Retrieval and Knowledge-Augmented Reasoning
Multi-Agent Systems for Clinical Collaboration
Methodology
Overall Architecture
Core Components
Abnormal Findings and Preliminary Diagnosis
Abnormal Entity Recognition and Standardization
Direct Generation of Candidate Diagnoses
Hybrid RAG
Clinical Guideline Retrieval and Relevance Filtering
Knowledge Graph Path Retrieval and Guideline-Informed Pruning
Multi-Agent Collaborative Reasoning
Complexity Assessment
...and 17 more sections

Figures (5)

Figure 1: Comparative Overview of Medical Diagnostic Reasoning Frameworks
Figure 2: Overall architecture of the MedCoRAG framework, comprising three core components. (1) Abnormal Findings and Preliminary Diagnosis: Abnormal clinical findings are extracted from the patient narrative and standardized via UMLS to generate a focused set of initial diagnostic hypotheses. (2) Hybrid RAG: For each hypothesis, the system retrieves clinical guideline excerpts and UMLS knowledge graph paths, then prunes them using the full clinical context to form a coherent, patient-specific evidence package. (3) Multi-Agent Collaborative Reasoning: A Router Agent assesses case complexity to either activate relevant specialist agents or delegate simple cases to the Generalist Agent; all agents iteratively reason over the shared evidence, trigger re-retrieval when needed, and converge on an interpretable consensus diagnosis through the Generalist Agent.
Figure 3: Confusion matrix of MedCoRAG on 13 hepatic disease classes.
Figure 4: Average number of abnormal entities per case across different hepatic diseases. Higher values indicate more complex clinical presentations.
Figure 5: Average number of hops in knowledge graph paths used during diagnosis. Higher values reflect greater reasoning complexity.
