LUME-DBN: Full Bayesian Learning of DBNs from Incomplete Data in Intensive Care
Federico Pirola, Fabio Stella, Marco Grzegorczyk
TL;DR
The paper tackles learning temporal dependencies from ICU time series in which values are frequently missing. It proposes LUME-DBN, a full Bayesian DBN learning framework with a Gibbs sampling imputation step. Each variable is modeled with a Bayesian linear regression on the variables of the previous time slice (a one-slice lag), which yields tractable full conditional distributions for the missing values, so structure and parameters are learned jointly while the data are imputed. In synthetic experiments and a PhysioNet ICU case study, LUME-DBN achieves better network reconstruction (higher AUC-PR) than model-agnostic baselines such as MICE and Temporal MICE, and it provides explicit uncertainty quantification for both the missing data and the network structure. The approach supports clinical decision-making by yielding safer imputations and more reliable temporal inferences, with clear avenues for extension to missing-not-at-random (MNAR) data, non-homogeneous DBNs, and expert-informed DBNs.
Abstract
Dynamic Bayesian networks (DBNs) are increasingly used in healthcare due to their ability to model complex temporal relationships in patient data while maintaining interpretability, an essential feature for clinical decision-making. However, existing approaches to handling missing data in longitudinal clinical datasets are largely derived from the static Bayesian network literature and fail to properly account for the temporal nature of the data. This gap limits the ability to quantify uncertainty over time, which is particularly critical in settings such as intensive care, where understanding the temporal dynamics is fundamental for model trustworthiness and applicability across diverse patient groups. Despite the potential of DBNs, a full Bayesian framework that integrates missing data handling remains underdeveloped. In this work, we propose a novel Gibbs sampling-based method for learning DBNs from incomplete data. Our method treats each missing value as an unknown parameter following a Gaussian distribution. At each iteration, the unobserved values are sampled from their full conditional distributions, allowing for principled imputation and uncertainty estimation. We evaluate our method on both simulated datasets and real-world intensive care data from critically ill patients. Compared to standard model-agnostic techniques such as MICE, our Bayesian approach demonstrates superior reconstruction accuracy and convergence properties. These results highlight the clinical relevance of incorporating full Bayesian inference in temporal models, which provides more reliable imputations and deeper insight into model behavior. Our approach supports safer and more informed clinical decision-making, particularly in settings where missing data are frequent and potentially impactful.
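
The idea of treating each missing value as a Gaussian unknown and resampling it from its full conditional can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a single variable with first-order dynamics x[t] = a * x[t-1] + noise, a known noise variance, a zero-mean Gaussian prior on the coefficient a, and an observed first time point. It does no structure learning. The names gibbs_impute_ar1, noise_var, and prior_var are hypothetical and chosen only for this illustration.

```python
import numpy as np


def gibbs_impute_ar1(x, n_iter=500, burn_in=100, noise_var=1.0,
                     prior_var=10.0, seed=0):
    """Toy Gibbs sampler for x[t] = a * x[t-1] + noise with NaN = missing.

    Assumptions (illustrative only): known noise variance, a ~ N(0, prior_var),
    and x[0] observed.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).copy()
    if np.isnan(x[0]):
        raise ValueError("this sketch assumes the first time point is observed")
    missing = np.flatnonzero(np.isnan(x))
    # Initialise missing entries with the mean of the observed values.
    x[missing] = np.nanmean(x)
    n_steps = len(x)

    coef_draws, imputed_draws = [], []
    for it in range(n_iter):
        # Step 1: sample the regression coefficient given the completed series
        # (conjugate Gaussian update for Bayesian linear regression).
        prev, curr = x[:-1], x[1:]
        post_prec = prev @ prev / noise_var + 1.0 / prior_var
        post_mean = (prev @ curr / noise_var) / post_prec
        a = rng.normal(post_mean, np.sqrt(1.0 / post_prec))

        # Step 2: sample each missing value from its Gaussian full conditional.
        for t in missing:
            if t == n_steps - 1:
                # Last time point: only the transition into x[t] matters.
                mean, var = a * x[t - 1], noise_var
            else:
                # Interior point: it is both a response (given x[t-1])
                # and a predictor (of x[t+1]).
                var = noise_var / (1.0 + a ** 2)
                mean = a * (x[t - 1] + x[t + 1]) / (1.0 + a ** 2)
            x[t] = rng.normal(mean, np.sqrt(var))

        if it >= burn_in:
            coef_draws.append(a)
            imputed_draws.append(x[missing].copy())

    return np.array(coef_draws), np.array(imputed_draws), missing
```

Each row of the second returned array is one joint draw of the missing values, so the column-wise mean gives a point imputation and the column-wise standard deviation gives a per-value uncertainty estimate. A single-pass imputer would provide only the former.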
