Continuous Predictive Modeling of Clinical Notes and ICD Codes in Patient Health Records

Mireia Hernandez Caralt, Clarence Boon Liang Ng, Marek Rei

TL;DR

The paper addresses early prediction of final ICD codes from the full EHR note sequence by introducing the Label-Attentive Hierarchical Sequence Transformer (LAHST), which uses causal attention and label-wise hierarchical attention to predict codes at any point during a stay. To handle very long EHR sequences, it augments training with the Extended Context Algorithm (ECA), enabling exact inference over long histories by batching while preserving context. On the MIMIC-III dataset with top-50 ICD-9 codes, LAHST achieves strong early performance, reaching about $82.9\%$ AUC just 2 days after admission, and consistently outperforms baselines across time points. This approach advances predictive medicine by enabling earlier risk assessment, treatment suggestions, and resource planning, while maintaining competitive performance on discharge-summary coding. The work also provides interpretability insights through attention analysis over clinical documents and discusses practical deployment considerations and limitations.

Abstract

Electronic Health Records (EHR) serve as a valuable source of patient information, offering insights into medical histories, treatments, and outcomes. Previous research has developed systems for detecting applicable ICD codes that should be assigned while writing a given EHR document, mainly focusing on discharge summaries written at the end of a hospital stay. In this work, we investigate the potential of predicting these codes for the whole patient stay at different time points during the stay, even before they are officially assigned by clinicians. The development of methods to predict diagnoses and treatments earlier could open opportunities for predictive medicine, such as identifying disease risks sooner, suggesting treatments, and optimizing resource allocation. Our experiments show that predictions regarding final ICD codes can be made as early as two days after admission, and we propose a custom model that improves performance on this early prediction task.

Paper Structure

This paper contains 14 sections, 5 equations, 2 figures, 7 tables, 2 algorithms.

Figures (2)

  • Figure 1: LAHST (Label-Attentive Hierarchical Sequence Transformer) architecture. Clinical notes generated throughout the hospital stay are split into chunks. Each chunk is encoded using a pre-trained language model (PLM) to extract the CLS-token embedding. Next, a hierarchical transformer encoder is applied, using causal masking so that each segment combines information only from past segment embeddings. Finally, the network generates a distinct document representation for each combination of label and time point, and the output layer transforms these representations into probabilities.
  • Figure 2: Average attention weight per document type at different temporal cut-offs. The LAHST model processes the complete EHR sequence and focuses more on reports of diagnostic tests for early prediction, switching to the discharge summary when it is available.
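
As a rough illustration of the last two stages described in the Figure 1 caption (causal masking over segment embeddings, and one document representation per label and time point), the following is a minimal sketch in plain Python. It is not the authors' implementation: random vectors stand in for the PLM CLS embeddings, the hierarchical transformer encoder is omitted, and the names `causal_label_attention`, `label_queries`, `out_weights`, and `out_bias` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def causal_label_attention(segments, label_queries, out_weights, out_bias):
    """Return (probs, attn) where probs[t][l] uses only segments 0..t.

    segments:      T vectors of size d (stand-ins for chunk embeddings)
    label_queries: L vectors of size d (one assumed query vector per label)
    out_weights:   L vectors of size d (assumed per-label output layer)
    out_bias:      L floats
    """
    probs, attn = [], []
    for t in range(len(segments)):
        visible = segments[: t + 1]  # causal mask: future chunks are hidden
        row_probs, row_attn = [], []
        for q, w, b in zip(label_queries, out_weights, out_bias):
            weights = softmax([dot(q, h) for h in visible])
            # Label-specific document representation at time point t.
            doc = [sum(a * h[i] for a, h in zip(weights, visible))
                   for i in range(len(q))]
            row_probs.append(1.0 / (1.0 + math.exp(-(dot(w, doc) + b))))
            row_attn.append(weights)
        probs.append(row_probs)
        attn.append(row_attn)
    return probs, attn


def random_vectors(n, d, rng):
    """n random vectors of size d, used only as toy inputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
T, L, d = 4, 3, 5  # toy sizes: 4 chunks, 3 labels, 5-dimensional embeddings
segments = random_vectors(T, d, rng)
label_queries = random_vectors(L, d, rng)
out_weights = random_vectors(L, d, rng)
out_bias = [0.0] * L

probs, attn = causal_label_attention(segments, label_queries, out_weights, out_bias)
for t, row in enumerate(probs):
    print(f"after chunk {t}:", [round(p, 3) for p in row])
```

Because time point t only sees chunks 0 through t, editing a later note leaves all earlier predictions unchanged, which is the property that allows a prediction at any point during the stay. The returned attention weights are the kind of quantity that could be averaged per document type, in the spirit of Figure 2.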