Bayesian Modelling in Practice: Using Uncertainty to Improve Trustworthiness in Medical Applications

David Ruhe, Giovanni Cinà, Michele Tonutti, Daan de Bruin, Paul Elbers

TL;DR

This work tackles the problem of unquantified risk in ICU mortality prediction by incorporating predictive uncertainty through a Bayesian Neural Network (BNN) trained with Bayes By Backprop (BBB). The authors derive analytic bounds that relate cross-entropy loss to predictive variance and demonstrate that uncertainty can both mitigate loss and flag out-of-domain patients on the MIMIC-III dataset. Key contributions include the use of BBB to estimate model uncertainty, empirical evidence that uncertainty improves reliability for high-stakes predictions, and robust signaling of domain shift (e.g., newborns, minority groups). The findings highlight the potential of uncertainty-aware decision support to enhance safety and trust in critical care settings.

Abstract

The Intensive Care Unit (ICU) is a hospital department where machine learning has the potential to provide valuable assistance in clinical decision making. Classical machine learning models usually provide only point estimates and no uncertainty of predictions. In practice, uncertain predictions should be presented to doctors with extra care in order to prevent potentially catastrophic treatment decisions. In this work we show how Bayesian modelling and the predictive uncertainty that it provides can be used to mitigate risk of misguided prediction and to detect out-of-domain examples in a medical setting. We derive analytically a bound on the prediction loss with respect to predictive uncertainty. The bound shows that uncertainty can mitigate loss. Furthermore, we apply a Bayesian Neural Network to the MIMIC-III dataset, predicting risk of mortality of ICU patients. Our empirical results show that uncertainty can indeed prevent potential errors and reliably identify out-of-domain patients. These results suggest that Bayesian predictive uncertainty can greatly improve trustworthiness of machine learning models in high-risk settings such as the ICU.

Paper Structure

This paper contains 11 sections, 12 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: The reachable loss region in a Bayesian binary classification problem.
  • Figure 2: Cumulative prediction loss ($y\text{-axis}$) related to the amount of uncertain testing data included in the analysis ($x\text{-axis}$). The uncertainty effectively mitigates prediction loss for both the BNN (a) and the Gradient Boosting model (b).
  • Figure 3: In the first plot we depict the prediction ($x\text{-axis}$) related to predictive uncertainty ($y\text{-axis}$). In the second plot we see how the BNN effectively identifies out-of-domain examples.
  • Figure 4: Illustration of how the data follows the bounds.
  • Figure 5: Predictive performance measured in AUROC ($y\text{-axis}$) related to the amount of uncertain data included in the analysis ($x\text{-axis}$).
  • ...and 1 more figure