Continuous Predictive Modeling of Clinical Notes and ICD Codes in Patient Health Records
Mireia Hernandez Caralt, Clarence Boon Liang Ng, Marek Rei
TL;DR
The paper addresses early prediction of final ICD codes from the full EHR note sequence by introducing the Label-Attentive Hierarchical Sequence Transformer (LAHST), which uses causal attention and label-wise hierarchical attention to predict codes at any point during a stay. To handle very long EHR sequences, it augments training with the Extended Context Algorithm (ECA), which processes long histories in batches while preserving context and so enables exact inference over them. On the MIMIC-III dataset with the top-50 ICD-9 codes, LAHST achieves strong early performance, reaching about $82.9\%$ AUC just two days after admission, and consistently outperforms baselines across time points. This approach advances predictive medicine by enabling earlier risk assessment, treatment suggestions, and resource planning, while maintaining competitive performance on discharge-summary coding. The work also provides interpretability insights through attention analysis over clinical documents and discusses practical deployment considerations and limitations.
Abstract
Electronic Health Records (EHR) serve as a valuable source of patient information, offering insights into medical histories, treatments, and outcomes. Previous research has developed systems for detecting the ICD codes that should be assigned while writing a given EHR document, mainly focusing on discharge summaries written at the end of a hospital stay. In this work, we investigate the potential of predicting the codes for the whole patient stay at different time points during that stay, even before they are officially assigned by clinicians. Methods that predict diagnoses and treatments earlier could open opportunities for predictive medicine, such as identifying disease risks sooner, suggesting treatments, and optimizing resource allocation. Our experiments show that final ICD codes can be predicted as early as two days after admission, and we propose a custom model that improves performance on this early prediction task.
