
TRACE: Temporal Reasoning via Agentic Context Evolution for Streaming Electronic Health Records (EHRs)

Zhan Qu, Michael Färber

TL;DR

TRACE reframes longitudinal clinical reasoning as continuous context optimization over structured, bounded memory instead of unbounded context expansion or continual fine-tuning. It introduces a dual-memory architecture comprising a frozen Global Protocol of institutional rules and a dynamic Individual Protocol that tracks patient state, coordinated by four agent roles (Router, Reasoner, Auditor, Steward). A separate offline Reflector induces generalizable rules to populate the Global Protocol, after which online inference remains cost-bounded and auditable. Empirical evaluation on MIMIC-IV longitudinal EHR streams shows TRACE improves next-event prediction (Recall@5), enhances protocol adherence, and increases clinical safety compared to long-context, RAG, and monolithic baselines, while producing interpretable reasoning traces. The work demonstrates a practical, interpretable path to robust, test-time decision support with frozen LLMs in high-stakes, long-horizon settings, with potential for broader application beyond healthcare.
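The summary above reports next-event prediction with Recall@5 but does not spell out the computation. As a rough illustration only, the sketch below uses one common definition: the fraction of ground-truth next events that appear among the top-k ranked predictions. The function name `recall_at_k`, the event labels, and the handling of an empty ground-truth set are all assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], truth: Set[str], k: int = 5) -> float:
    """Fraction of ground-truth events found among the top-k ranked predictions.

    Assumption: an empty ground-truth set scores 0.0 (the paper may treat it differently).
    """
    if not truth:
        return 0.0
    top_k = set(ranked[:k])
    return len(top_k & truth) / len(truth)


# Hypothetical ranked predictions and observed next events for one time step.
ranked = ["lactate", "blood_culture", "antibiotics", "fluids", "cbc", "vasopressors"]
truth = {"antibiotics", "vasopressors"}

score = recall_at_k(ranked, truth, k=5)  # "vasopressors" is ranked 6th, so it is missed
```

Under this definition, a dataset-level score would typically be the mean of the per-step values.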

Abstract

Large Language Models (LLMs) encode extensive medical knowledge but struggle to apply it reliably to longitudinal patient trajectories, where evolving clinical states, irregular timing, and heterogeneous events degrade performance over time. Existing adaptation strategies rely on fine-tuning or retrieval-based augmentation, which introduce computational overhead, privacy constraints, or instability under long contexts. We introduce TRACE (Temporal Reasoning via Agentic Context Evolution), a framework that enables temporal clinical reasoning with frozen LLMs by explicitly structuring and maintaining context rather than extending context windows or updating parameters. TRACE operates over a dual-memory architecture consisting of a static Global Protocol encoding institutional clinical rules and a dynamic Individual Protocol tracking patient-specific state. Four agentic components, Router, Reasoner, Auditor, and Steward, coordinate over this structured memory to support temporal inference and state evolution. The framework maintains bounded inference cost via structured state compression and selectively audits safety-critical clinical decisions. Evaluated on longitudinal clinical event streams from MIMIC-IV, TRACE significantly improves next-event prediction accuracy, protocol adherence, and clinical safety over long-context and retrieval-augmented baselines, while producing interpretable and auditable reasoning traces.

Paper Structure (47 sections, 6 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: Overview of TRACE. Phase I (Offline): On historical event streams $E_{\le t}$, prediction errors between $\widehat{Y}_t$ and $Y^*_t$ are analyzed by a Reflector agent, which induces generalizable clinical rules that are added to the Global Protocol $\mathcal{P}_G$. Phase II (Online): During deployment, TRACE processes live event bundles $E_t$ using a bounded inference state $S_t = (\mathcal{P}_G, \mathcal{P}_{I,t}, E_{\text{I},t})$. A Router selects relevant rules, a Reasoner predicts the next action, an Auditor conditionally verifies safety-critical decisions, and a Steward updates the patient-specific Individual Protocol $\mathcal{P}_{I,t+1}$. The Global Protocol remains frozen during online inference. All agents use the same LLM backbone.
  • Figure 2: Phase II qualitative example of protocol-grounded bundle execution. With the sepsis protocol fixed, TRACE incrementally predicts the next required bundle as prior actions are observed and incorporated into state.