TRACE: Temporal Reasoning via Agentic Context Evolution for Streaming Electronic Health Records (EHRs)
Zhan Qu, Michael Färber
TL;DR
TRACE reframes longitudinal clinical reasoning as continuous context optimization over structured, bounded memory, rather than unbounded context expansion or continual fine-tuning. It introduces a dual-memory architecture comprising a Global Protocol of institutional rules, frozen during online inference, and a dynamic Individual Protocol that tracks patient state, coordinated by four agent roles (Router, Reasoner, Auditor, Steward). A separate offline Reflector induces generalizable rules to populate the Global Protocol, after which online inference remains cost-bounded and auditable. Empirical evaluation on MIMIC-IV longitudinal EHR streams shows that TRACE improves next-event prediction (Recall@5), protocol adherence, and clinical safety compared to long-context, RAG, and monolithic baselines, while producing interpretable reasoning traces. The work demonstrates a practical, interpretable path to robust test-time decision support with frozen LLMs in high-stakes, long-horizon settings, with potential for broader application beyond healthcare.
Abstract
Large Language Models (LLMs) encode extensive medical knowledge but struggle to apply it reliably to longitudinal patient trajectories, where evolving clinical states, irregular timing, and heterogeneous events degrade performance over time. Existing adaptation strategies rely on fine-tuning or retrieval-based augmentation, which introduce computational overhead, privacy constraints, or instability under long contexts. We introduce TRACE (Temporal Reasoning via Agentic Context Evolution), a framework that enables temporal clinical reasoning with frozen LLMs by explicitly structuring and maintaining context rather than extending context windows or updating parameters. TRACE operates over a dual-memory architecture consisting of a static Global Protocol encoding institutional clinical rules and a dynamic Individual Protocol tracking patient-specific state. Four agentic components (Router, Reasoner, Auditor, and Steward) coordinate over this structured memory to support temporal inference and state evolution. The framework keeps inference cost bounded through structured state compression and selectively audits safety-critical clinical decisions. Evaluated on longitudinal clinical event streams from MIMIC-IV, TRACE significantly improves next-event prediction accuracy, protocol adherence, and clinical safety over long-context and retrieval-augmented baselines, while producing interpretable and auditable reasoning traces.
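The control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (`GlobalProtocol`, `IndividualProtocol`, `route`, `step`), the keyword-overlap routing, and the keep-most-recent compression rule are all assumptions chosen for brevity, and the Reasoner, Auditor, and criticality check are passed in as plain callables standing in for frozen-LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class GlobalProtocol:
    """Static institutional rules; read-only during online inference."""
    rules: tuple[str, ...]


@dataclass
class IndividualProtocol:
    """Dynamic patient state, capped at max_entries to bound inference cost."""
    max_entries: int
    entries: list[str] = field(default_factory=list)

    def update(self, note: str) -> None:
        self.entries.append(note)
        # Assumed compression rule: keep only the most recent entries.
        # The paper's structured state compression may work differently.
        if len(self.entries) > self.max_entries:
            self.entries = self.entries[-self.max_entries:]


def route(event: str, gp: GlobalProtocol) -> list[str]:
    """Router (hypothetical): select rules sharing a token with the event."""
    tokens = event.lower().split()
    return [r for r in gp.rules if any(t in r.lower() for t in tokens)]


def step(
    event: str,
    gp: GlobalProtocol,
    ip: IndividualProtocol,
    reasoner: Callable[[str, list[str], list[str]], str],
    is_critical: Callable[[str], bool],
    audit: Callable[[str, list[str]], str],
) -> tuple[str, bool]:
    """Process one streamed event; return (prediction, was_audited)."""
    rules = route(event, gp)                              # Router
    prediction = reasoner(event, rules, list(ip.entries))  # Reasoner
    audited = is_critical(event)
    if audited:                                           # Auditor, selective
        prediction = audit(prediction, rules)
    ip.update(f"{event} -> {prediction}")                 # Steward
    return prediction, audited
```

In this sketch the Global Protocol is never written to inside `step`, and the Individual Protocol never holds more than `max_entries` items, so the context passed to the Reasoner stays bounded however long the event stream becomes.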
