Foundation models for electronic health records: representation dynamics and transferability
Michael C. Burkhart, Bashar Ramadan, Zewei Liao, Kaveri Chhikara, Juan C. Rojas, William F. Parker, Brett K. Beaulieu-Jones
TL;DR
This study investigates the transferability of foundation models (FMs) trained on MIMIC-IV EHR data to a different health system, the University of Chicago Medical Center (UCMC), by examining representation dynamics, outlier detection, and outcome-specific fine-tuning. The authors train a 1B-parameter FM with a self-supervised next-token objective, extract 24-hour representations, and evaluate both representation-based classifiers and supervised fine-tuning for four clinically relevant outcomes. They find substantial cross-site degradation without adaptation, but show that fine-tuning, especially with some local target-domain data, markedly improves performance, particularly for predicting ICU admission and invasive mechanical ventilation (IMV). In both datasets, representation trajectory features correlate with adverse outcomes, suggesting that analyzing latent-space dynamics can inform early risk stratification and model deployment in diverse healthcare settings.
Abstract
Foundation models (FMs) trained on electronic health records (EHRs) have shown strong performance on a range of clinical prediction tasks. However, adapting these models to local health systems remains challenging due to limited data availability and resource constraints. In this study, we investigated what these models learn and evaluated the transferability of an FM trained on MIMIC-IV to an institutional EHR dataset at the University of Chicago Medical Center. We assessed the model's ability to identify outlier patients and examined patient trajectories in representation space in relation to future clinical outcomes. We also evaluated the performance of supervised fine-tuned classifiers on both the source and target datasets. Our findings offer insights into the adaptability of FMs across different healthcare systems, highlight considerations for their effective implementation, and provide an empirical analysis of the factors underlying their predictive performance.
