Investigating potential causes of Sepsis with Bayesian network structure learning

Bruno Petrungaro, Neville K. Kitson, Anthony C. Constantinou

TL;DR

The paper investigates potential causes of Sepsis by learning Bayesian network structures from hospital data and combining clinical expertise with score-based, constraint-based, and hybrid structure-learning algorithms. It introduces a model-averaging approach with knowledge-based constraints to arrive at a consensus structure suitable for causal inference, intervention simulation, and prediction. Hypothetical interventions on Chronic Obstructive Pulmonary Disease (COPD), Alcohol dependence, and Diabetes suggest that the presence of any of these risk factors increases the likelihood of Sepsis, a finding with potential policy implications. As a predictor of Sepsis, the model reaches accuracy, sensitivity, and specificity of around 70% each, with an AUC of about 0.80. The work shows how routinely collected data (here, data available for commissioning purposes only) can support policy-relevant causal inference when data-driven structure learning is combined with expert knowledge.

Abstract

Sepsis is a life-threatening and serious global health issue. This study combines knowledge with available hospital data to investigate the potential causes of Sepsis that can be affected by policy decisions. We investigate the underlying causal structure of this problem by combining clinical expertise with score-based, constraint-based, and hybrid structure learning algorithms. A novel approach to model averaging and knowledge-based constraints was implemented to arrive at a consensus structure for causal inference. The structure learning process highlighted the importance of exploring data-driven approaches alongside clinical expertise. This includes discovering unexpected, although reasonable, relationships from a clinical perspective. Hypothetical interventions on Chronic Obstructive Pulmonary Disease, Alcohol dependence, and Diabetes suggest that the presence of any of these risk factors in patients increases the likelihood of Sepsis. This finding, alongside measuring the effect of these risk factors on Sepsis, has potential policy implications. Recognising the importance of prediction in improving health outcomes related to Sepsis, the model is also assessed in its ability to predict Sepsis by evaluating accuracy, sensitivity, and specificity. These three indicators all had results around 70%, and the AUC was 80%, which means the causal structure of the model is reasonably accurate given that the models were trained on data available for commissioning purposes only.
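The four indicators named in the abstract (accuracy, sensitivity, specificity, and AUC) can be computed from binary outcomes and predicted probabilities as in the minimal sketch below. This is an illustration only, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions made for the example and do not come from the paper's data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_prob: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at a given threshold.

    The threshold of 0.5 is an assumption for this sketch.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_prob)
auc_value = auc(y_true, y_prob)
print(metrics, auc_value)
```

Unlike the other three indicators, the AUC does not depend on the chosen threshold, which is one reason it is commonly reported alongside them.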

Paper Structure (26 sections, 4 figures, 9 tables)

Figures (4)

  • Figure 1: Mechanisms and pathophysiology of Sepsis
  • Figure 2: Heatmap of the edges that appear two or more times in the graphs learnt by the six algorithms. The figure was produced using ggplot2 (bib38).
  • Figure 3: Distributions of the number of variables according to the number of direct effects (Number of edges in) and the number of direct causes (Number of edges out) across all the six structures learnt by the algorithms. The figure was produced using ggplot2 (bib38).
  • Figure 4: ROC curve of the classification problem solved with our CBN. The figure was produced using pROC (bib41) and ggplot2 (bib38).
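
The edge tally behind a plot like Figure 2 (edges that appear two or more times across the learnt graphs) can be sketched as follows. This is a simple frequency count under stated assumptions, not the paper's model-averaging procedure, which is not described in this summary. The function names, the representation of a graph as a set of directed `(parent, child)` pairs, and the toy graphs with nodes `A` to `D` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge: (parent, child)


def edge_counts(graphs: Iterable[Set[Edge]]) -> Counter:
    """Count in how many graphs each directed edge appears."""
    counts: Counter = Counter()
    for edges in graphs:
        counts.update(set(edges))  # each graph contributes at most once per edge
    return counts


def frequent_edges(graphs: Iterable[Set[Edge]],
                   min_count: int = 2) -> Dict[Edge, int]:
    """Keep edges that appear in at least `min_count` graphs."""
    return {e: c for e, c in edge_counts(graphs).items() if c >= min_count}


# Toy graphs (hypothetical, not learnt from any data).
graphs = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "B")},
    {("A", "B"), ("B", "C"), ("A", "D")},
]

kept = frequent_edges(graphs, min_count=2)
print(kept)
```

In this sketch `("B", "C")` and `("C", "B")` are counted separately, so disagreement between algorithms about edge direction lowers the count for each orientation.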