Investigating potential causes of Sepsis with Bayesian network structure learning
Bruno Petrungaro, Neville K. Kitson, Anthony C. Constantinou
TL;DR
The paper investigates potential causal drivers of Sepsis by learning Bayesian network structures from NHS England data and combining clinical expertise with score-based, constraint-based, and hybrid structure-learning algorithms. It introduces a model-averaging approach and knowledge-based constraints to arrive at a consensus structure suitable for prediction, intervention simulation, and policy guidance. Hypothetical interventions suggest that the presence of COPD, Alcohol dependence, or Diabetes increases the likelihood of Sepsis, so policies that act on these risk factors could lower Sepsis risk. Predictive performance reaches an AUC of about 0.80, with accuracy, sensitivity, and specificity each around 70%. The work demonstrates a practical framework for policy-relevant causal inference in healthcare using routinely collected data, while acknowledging data limitations and the value of combining data-driven insights with expert knowledge.
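To make "intervention simulation" concrete, the following minimal sketch contrasts an interventional query, P(Sepsis | do(COPD)), with ordinary conditioning, P(Sepsis | COPD), in a toy three-node network. The network structure, the single hypothetical confounder, and every probability are assumptions invented for illustration; none of them come from the paper's learned model or its data.

```python
# Toy network (hypothetical): Confounder -> COPD, Confounder -> Sepsis, COPD -> Sepsis.
# All probabilities are invented for illustration; none come from the paper.
P_C = {True: 0.3, False: 0.7}                 # P(confounder present)
P_COPD_GIVEN_C = {True: 0.20, False: 0.05}    # P(COPD = 1 | confounder)
P_SEPSIS = {                                  # P(Sepsis = 1 | COPD, confounder)
    (True, True): 0.10, (True, False): 0.06,
    (False, True): 0.04, (False, False): 0.02,
}


def p_sepsis_do(copd: bool) -> float:
    """P(Sepsis = 1 | do(COPD = copd)): adjust over the confounder's prior."""
    return sum(P_C[c] * P_SEPSIS[(copd, c)] for c in (True, False))


def p_sepsis_given(copd: bool) -> float:
    """P(Sepsis = 1 | COPD = copd): ordinary (observational) conditioning."""
    def p_copd(c: bool) -> float:
        return P_COPD_GIVEN_C[c] if copd else 1.0 - P_COPD_GIVEN_C[c]

    # Posterior over the confounder after observing COPD (Bayes' rule).
    joint = {c: P_C[c] * p_copd(c) for c in (True, False)}
    norm = sum(joint.values())
    return sum(joint[c] / norm * P_SEPSIS[(copd, c)] for c in (True, False))


# Effect of the hypothetical intervention on the probability of Sepsis.
effect = p_sepsis_do(True) - p_sepsis_do(False)
print(f"P(Sepsis | do(COPD=1)) = {p_sepsis_do(True):.3f}")
print(f"P(Sepsis | do(COPD=0)) = {p_sepsis_do(False):.3f}")
print(f"P(Sepsis | COPD=1)     = {p_sepsis_given(True):.3f}")
```

In this toy example the observational probability exceeds the interventional one because the confounder raises both COPD and Sepsis, which is why a causal structure, rather than a purely predictive one, is needed to reason about policy effects.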
Abstract
Sepsis is a life-threatening condition and a serious global health issue. This study combines expert knowledge with available hospital data to investigate the potential causes of Sepsis that can be affected by policy decisions. We investigate the underlying causal structure of this problem by combining clinical expertise with score-based, constraint-based, and hybrid structure learning algorithms. A novel approach to model averaging and knowledge-based constraints was implemented to arrive at a consensus structure for causal inference. The structure learning process highlighted the importance of exploring data-driven approaches alongside clinical expertise, including the discovery of relationships that were unexpected yet reasonable from a clinical perspective. Hypothetical interventions on Chronic Obstructive Pulmonary Disease, Alcohol dependence, and Diabetes suggest that the presence of any of these risk factors in patients increases the likelihood of Sepsis. This finding, together with the measured effect of each risk factor on Sepsis, has potential policy implications. Recognising the importance of prediction in improving health outcomes related to Sepsis, the model is also assessed on its ability to predict Sepsis by evaluating accuracy, sensitivity, and specificity. These three indicators were all around 70%, and the AUC was 80%, which suggests that the causal structure of the model is reasonably accurate given that the models were trained on data available for commissioning purposes only.
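As a reminder of what the four reported indicators measure, here is a minimal sketch that computes accuracy, sensitivity, specificity, and AUC from binary labels and predicted probabilities. The labels, scores, and the 0.5 decision threshold are invented for illustration and are assumptions of this sketch; they are not the paper's data or its evaluation procedure.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a random negative case."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = Sepsis, 0 = no Sepsis; scores are predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two kinds of figures can differ, as they do in the abstract.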
