Towards Multimodal Representation Learning in Paediatric Kidney Disease
Ana Durica, John Booth, Ivana Drobnjak
TL;DR
This study addresses the need for near-term monitoring of renal function in paediatric patients by predicting abnormal serum creatinine within 30 days from longitudinal EHR data collected at Great Ormond Street Hospital (GOSH). It uses a lightweight temporal representation: a GRU encoder processes sequences of 100 time points, comprising 15 laboratory markers (each with presence and abnormality flags) plus demographics, and outputs an embedding that feeds a simple 30-day classifier. Evaluation on a held-out set, using a stratified 70/10/20 split and bootstrap-derived confidence intervals, shows good discrimination for the near-term outcome, and t-SNE visualisation of the embeddings suggests the model captures meaningful predictive structure. The work demonstrates the feasibility of temporal modelling in paediatric nephrology and lays the groundwork for future multimodal extensions that incorporate richer signals and clinically meaningful renal endpoints to enable proactive care.
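The evaluation protocol summarised above (a stratified 70/10/20 split with bootstrap confidence intervals) can be illustrated with a minimal, standard-library sketch. This is not the study's code: the function names are hypothetical, AUROC is assumed as the discrimination metric because the summary does not name one, the percentile bootstrap is assumed as the interval method, and the split here is over individual samples whereas the study may well split by patient.

```python
import random

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half a correct pair. Assumed metric, not named in the source."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return (train, val, test) index lists, preserving class ratios.
    The test set receives whatever remains after train and val are filled."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_val = round(fractions[1] * len(idx))
        train += idx[:n_train]
        val += idx[n_train:n_train + n_val]
        test += idx[n_train + n_val:]
    return train, val, test

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC on a held-out set.
    Resamples that contain only one class are redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue
        stats.append(auroc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

In use, the intervals would be computed only on the 20% test indices, with the 10% validation portion reserved for model selection.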
Abstract
Paediatric kidney disease varies widely in its presentation and progression, which calls for continuous monitoring of renal function. Using electronic health records collected between 2019 and 2025 at Great Ormond Street Hospital, a leading UK paediatric hospital, we explored a temporal modelling approach that integrates longitudinal laboratory sequences with demographic information. A recurrent neural model trained on these data was used to predict whether a child would record an abnormal serum creatinine value within the following thirty days. Framed as a pilot study, this work provides an initial demonstration that simple temporal representations can capture useful patterns in routine paediatric data and lays the groundwork for future multimodal extensions using additional clinical signals and more detailed renal outcomes.
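To make the prediction target concrete, the sketch below shows one way a binary label for "abnormal serum creatinine within the following thirty days" could be derived from timestamped results. It is illustrative only: the function name, the input layout, and the choice to exclude results at the index time itself are assumptions, and the abnormality flag is taken as given because the abstract does not describe how reference ranges are applied.

```python
from datetime import datetime, timedelta

def label_within_horizon(index_time, creatinine_results, horizon_days=30):
    """Return 1 if any abnormal creatinine result falls in
    (index_time, index_time + horizon_days], else 0.

    creatinine_results: iterable of (timestamp, is_abnormal) pairs, where
    is_abnormal is a precomputed boolean (hypothetical input layout).
    """
    horizon_end = index_time + timedelta(days=horizon_days)
    return int(any(
        is_abnormal and index_time < taken_at <= horizon_end
        for taken_at, is_abnormal in creatinine_results
    ))

# Example: one abnormal result 12 days after the index time.
index_time = datetime(2023, 1, 1)
results = [
    (datetime(2022, 12, 20), True),   # before the index time, ignored
    (datetime(2023, 1, 13), True),    # inside the 30-day window
    (datetime(2023, 3, 1), True),     # beyond the window, ignored
]
label = label_within_horizon(index_time, results)
```

Keeping the window strictly after the index time prevents a result that is already part of the input sequence from leaking into the label.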
