Grounded by Experience: Generative Healthcare Prediction Augmented with Hierarchical Agentic Retrieval

Chuang Zhao; Hui Tang; Hongke Zhao; Xiaofang Zhou; Xiaomeng Li

TL;DR

GHAR addresses hallucination and the question of when to activate retrieval in healthcare prediction by introducing a hierarchical agentic RAG framework in which two agents iteratively decide when to retrieve and how to retrieve. The two agents are unified within a Markov Decision Process and optimized with multi-agent reinforcement learning, guided by a diverse reward structure that aligns reasoning efficiency, retrieval relevance, and final accuracy. Using meta-path partitions to constrain GraphRAG, GHAR reports superior performance on three healthcare prediction tasks (DEC Pred, READ Pred, LOS Pred) across the MIMIC-III, MIMIC-IV, and eICU datasets, supported by ablations, OOD evaluations, and semantic QA demonstrations. The work advances practical healthcare AI by enabling dynamic, explainable, and scalable augmentation of LLM predictions with targeted external knowledge, potentially reducing hallucinations and improving generalization in clinical decision support.

Abstract

Accurate healthcare prediction is critical for improving patient outcomes and reducing operational costs. Bolstered by growing reasoning capabilities, large language models (LLMs) offer a promising path to enhance healthcare predictions by drawing on their rich parametric knowledge. However, LLMs are prone to factual inaccuracies due to limitations in the reliability and coverage of their embedded knowledge. While retrieval-augmented generation (RAG) frameworks, such as GraphRAG and its variants, have been proposed to mitigate these issues by incorporating external knowledge, they face two key challenges in the healthcare scenario: (1) identifying the clinical necessity to activate the retrieval mechanism, and (2) achieving synergy between the retriever and the generator to craft contextually appropriate retrievals. To address these challenges, we propose GHAR, a generative hierarchical agentic RAG framework that simultaneously resolves when to retrieve and how to optimize the collaboration between submodules in healthcare. Specifically, for the first challenge, we design a dual-agent architecture comprising Agent-Top and Agent-Low. Agent-Top acts as the primary physician, iteratively deciding whether to rely on parametric knowledge or to initiate retrieval, while Agent-Low acts as the consulting service, summarizing all task-relevant knowledge once retrieval is triggered. To tackle the second challenge, we unify the optimization of both agents within a formal Markov Decision Process, designing diverse rewards to align their shared goal of accurate prediction while preserving their distinct roles. Extensive experiments on three benchmark datasets across three popular tasks demonstrate GHAR's superiority over state-of-the-art baselines, highlighting the potential of hierarchical agentic RAG in advancing healthcare systems.
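
To make the division of labor between the two agents concrete, here is a minimal Python sketch of the control loop the abstract describes: Agent-Top repeatedly chooses between answering from its own knowledge and triggering retrieval, and Agent-Low summarizes retrieved knowledge whenever retrieval is chosen. This is not the authors' implementation. The names `State`, `run_episode`, `shared_reward`, the step budget, and the reward weights are all hypothetical, and the linear reward form is only an assumption about how efficiency, relevance, and accuracy might be combined.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class State:
    """Hypothetical episode state: the clinical query plus summaries gathered so far."""
    query: str
    notes: List[str] = field(default_factory=list)
    step: int = 0


def run_episode(
    state: State,
    top_policy: Callable[[State], str],   # stands in for Agent-Top: "retrieve" or "answer"
    low_agent: Callable[[State], str],    # stands in for Agent-Low: returns a summary
    answer_fn: Callable[[State], str],    # produces the final prediction
    max_steps: int = 3,                   # assumed retrieval budget
) -> Tuple[str, List[str]]:
    """Run one decision loop and return (prediction, list of Agent-Top actions)."""
    trajectory: List[str] = []
    while state.step < max_steps:
        action = top_policy(state)
        trajectory.append(action)
        if action == "answer":
            break
        # Retrieval was triggered: Agent-Low condenses task-relevant knowledge.
        state.notes.append(low_agent(state))
        state.step += 1
    # If the budget runs out, the prediction is made from whatever was gathered.
    return answer_fn(state), trajectory


def shared_reward(
    pred: str,
    label: str,
    n_retrievals: int,
    relevance: float,
    step_cost: float = 0.1,    # assumed weight on reasoning efficiency
    rel_weight: float = 0.5,   # assumed weight on retrieval relevance
) -> float:
    """Assumed linear mix of final accuracy, retrieval cost, and retrieval relevance."""
    accuracy = 1.0 if pred == label else 0.0
    return accuracy - step_cost * n_retrievals + rel_weight * relevance


# Toy stand-ins for the two agents (illustrative only, no LLM involved).
toy_top = lambda s: "retrieve" if not s.notes else "answer"
toy_low = lambda s: f"summary for: {s.query}"
toy_answer = lambda s: "high-risk" if s.notes else "unknown"

prediction, actions = run_episode(State("readmission risk?"), toy_top, toy_low, toy_answer)
reward = shared_reward(prediction, "high-risk", n_retrievals=actions.count("retrieve"), relevance=0.8)
```

In this toy run Agent-Top retrieves once and then answers. In a trained system both policies would be learned, with the reward signal shaping when retrieval is worth its cost.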
