GRAIL: Geometry-Aware Retrieval-Augmented Inference with LLMs over Hyperbolic Representations of Patient Trajectories
Zhan Qu, Michael Färber
TL;DR
GRAIL tackles next-visit prediction from longitudinal, multi-modal EHRs by combining deterministic coding hierarchies with data-driven cross-modal temporal associations in a unified clinical graph and embedding that graph in hyperbolic space. It summarizes each sparse visit as a denoised Central Event and uses structure-aware retrieval to assemble a Typed Risk Horizon, which an LLM can optionally refine at inference time. Hyperbolic geometry captures hierarchical and directional relationships, while retrieval constrains the output space to clinically plausible candidates, which reduces hallucinations. Experiments on MIMIC-IV show consistent improvements in multi-modal next-visit prediction and more hierarchy-consistent forecasts, supporting the practical value of geometry-aware, retrieval-grounded reasoning over healthcare trajectories.
Abstract
Predicting future clinical events from longitudinal electronic health records (EHRs) is challenging due to sparse multi-type clinical events, hierarchical medical vocabularies, and the tendency of large language models (LLMs) to hallucinate when reasoning over long structured histories. We study next-visit event prediction, which aims to forecast a patient's upcoming clinical events based on prior visits. We propose GRAIL, a framework that models longitudinal EHRs using structured geometric representations and structure-aware retrieval. GRAIL constructs a unified clinical graph by combining deterministic coding-system hierarchies with data-driven temporal associations across event types, embeds this graph in hyperbolic space, and summarizes each visit as a probabilistic Central Event that denoises sparse observations. At inference time, GRAIL retrieves a structured set of clinically plausible future events aligned with hierarchical and temporal progression, and optionally refines their ranking using an LLM as a constrained inference-time reranker. Experiments on MIMIC-IV show that GRAIL consistently improves multi-type next-visit prediction and yields more hierarchy-consistent forecasts.
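To make the geometric retrieval idea concrete, the following is a minimal sketch of nearest-neighbor retrieval in the Poincaré ball, one common model of hyperbolic space. The abstract does not state which hyperbolic model, distance, or retrieval rule GRAIL uses, so everything here is an assumption for illustration: the function names `poincare_distance` and `retrieve_candidates`, the node labels, and the hand-picked 2-D coordinates are hypothetical and are not learned embeddings or the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple


def poincare_distance(u: Sequence[float], v: Sequence[float], eps: float = 1e-9) -> float:
    """Geodesic distance between two points inside the unit Poincare ball.

    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Clamp the denominator so points very close to the boundary do not divide by zero.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


def retrieve_candidates(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k nodes closest to the query point, nearest first."""
    scored = [(name, poincare_distance(query, vec)) for name, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical toy embeddings (assumption): hand-picked points, not model output.
TOY_EMBEDDINGS: Dict[str, Tuple[float, float]] = {
    "node_a": (0.55, 0.0),   # close to the query
    "node_b": (-0.5, 0.0),   # on the opposite side of the ball
    "node_c": (0.0, 0.1),    # near the origin
}

# Hypothetical query point standing in for a visit summary (assumption).
TOY_QUERY: Tuple[float, float] = (0.5, 0.0)

if __name__ == "__main__":
    for name, dist in retrieve_candidates(TOY_QUERY, TOY_EMBEDDINGS, k=2):
        print(f"{name}: {dist:.3f}")
```

In this sketch the retrieved list plays the role of a constrained candidate set: a downstream reranker, such as the optional LLM step described above, would only reorder these candidates rather than generate events freely. How GRAIL types, filters, or scores its candidates is not specified in the abstract and is not modeled here.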
