Traj-CoA: Patient Trajectory Modeling via Chain-of-Agents for Lung Cancer Risk Prediction

Sihang Zeng, Yujuan Fu, Sitong Zhou, Zixuan Yu, Lucas Jing Liu, Jun Wen, Matthew Thompson, Ruth Etzioni, Meliha Yetisgen

TL;DR

Traj-CoA introduces a memory-augmented chain-of-agents framework to enable robust temporal reasoning over ultra-long, noisy electronic health records for patient trajectory modeling. By splitting long histories into time-aware chunks processed by worker agents and synthesized by a manager agent, with an external memory (EHRMem) preserving a distilled clinical timeline, the approach mitigates the lost-in-the-middle problem and information forgetting. In a zero-shot lung cancer risk prediction task, Traj-CoA outperforms conventional ML, DL, BERT, and vanilla LLM baselines and rivals fine-tuned models, while analyses show clinically aligned reasoning and meaningful event themes. The framework offers a generalizable solution for longitudinal EHR tasks, with potential for broader clinical adoption pending multi-site validation and further enhancements such as knowledge integration and agent-specific fine-tuning.

Abstract

Large language models (LLMs) offer a generalizable approach for modeling patient trajectories, but their temporal reasoning suffers from the long and noisy nature of electronic health record (EHR) data. To address these challenges, we introduce Traj-CoA, a multi-agent system that uses a chain of agents for patient trajectory modeling. Traj-CoA employs a chain of worker agents to process EHR data sequentially in manageable chunks, distilling critical events into a shared long-term memory module, EHRMem, to reduce noise and preserve a comprehensive timeline. A final manager agent synthesizes the worker agents' summary and the timeline extracted into EHRMem to make predictions. In a zero-shot one-year lung cancer risk prediction task based on five-year EHR data, Traj-CoA outperforms four categories of baselines. Analysis reveals that Traj-CoA exhibits clinically aligned temporal reasoning, establishing it as a promising, robust, and generalizable approach for modeling complex patient trajectories.
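The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `EHRMem` is borrowed from the paper, but its fields, the count-based chunking rule, and the functions `chunk_by_time`, `run_chain`, `toy_worker`, and `toy_manager` are all assumptions made for illustration. In the actual system the worker and manager would be LLM calls; here they are replaced by keyword-matching stubs so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, str]  # (ISO date, free-text description); a hypothetical record format


@dataclass
class EHRMem:
    """Hypothetical stand-in for the shared long-term memory: a distilled timeline."""
    timeline: List[Event] = field(default_factory=list)

    def add(self, events: List[Event]) -> None:
        # Keep the timeline in chronological order (ISO dates sort lexicographically).
        self.timeline.extend(events)
        self.timeline.sort(key=lambda e: e[0])


def chunk_by_time(events: List[Event], max_events: int) -> List[List[Event]]:
    """Sort events chronologically and cut them into consecutive chunks.

    Assumption: chunk size is counted in events; a real system would more
    likely use a token budget.
    """
    ordered = sorted(events, key=lambda e: e[0])
    return [ordered[i:i + max_events] for i in range(0, len(ordered), max_events)]


Worker = Callable[[List[Event], str], Tuple[str, List[Event]]]
Manager = Callable[[str, List[Event]], Dict[str, object]]


def run_chain(events: List[Event], worker: Worker, manager: Manager,
              max_events: int = 2) -> Dict[str, object]:
    """Pass a running summary along the worker chain, then call the manager."""
    memory = EHRMem()
    summary = ""
    for chunk in chunk_by_time(events, max_events):
        # Each worker sees its chunk plus the previous worker's summary.
        summary, critical = worker(chunk, summary)
        memory.add(critical)  # distilled events persist outside the summary
    return manager(summary, memory.timeline)


def toy_worker(chunk: List[Event], prev_summary: str) -> Tuple[str, List[Event]]:
    """Stub for an LLM worker: keeps events matching two illustrative keywords."""
    critical = [e for e in chunk
                if any(k in e[1].lower() for k in ("nodule", "smok"))]
    return prev_summary + f"[{len(critical)}/{len(chunk)} kept]", critical


def toy_manager(summary: str, timeline: List[Event]) -> Dict[str, object]:
    """Stub for an LLM manager: flags the case if any event was retained."""
    return {"flag": bool(timeline), "summary": summary, "timeline": timeline}


if __name__ == "__main__":
    demo = [
        ("2021-03-01", "Pulmonary nodule on CT"),
        ("2019-05-10", "Current smoker"),
        ("2020-01-15", "Flu vaccine"),
        ("2020-08-02", "Ankle sprain"),
    ]
    print(run_chain(demo, toy_worker, toy_manager))
```

The point of the sketch is the separation of the two channels: the running summary is rewritten at each step and may lose detail, whereas events written to the memory stay available to the manager unchanged.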

Paper Structure

This paper contains 32 sections, 3 equations, 3 figures, 14 tables.

Figures (3)

  • Figure 1: Traj-CoA architecture consisting of a chain of worker agents, a manager agent, and EHRMem.
  • Figure 2: Sensitivity analysis on (A) chunk size and (B) number of chunks.
  • Figure 3: Analysis of Traj-CoA's behavior. (A) t-SNE plot visualizing the distribution of lung cancer related events in all cases' output $O$, with sample events (timestamps in sample events are omitted for de-identification purposes). The events were embedded with nomic-embed-text-v1.5 (Nussbaum et al., 2025); (B) Distribution of categories in the lung cancer related events; and (C) Normalized date distribution of the events.