Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation
Yidong Zhao, Joao Tourais, Iain Pierce, Christian Nitsche, Thomas A. Treibel, Sebastian Weingärtner, Artur M. Schweidtmann, Qian Tao
TL;DR
This work tackles unreliable uncertainty estimation in deep learning (DL)-based cardiac MRI segmentation by introducing HMC-CP, a scalable Bayesian framework that uses Hamiltonian Monte Carlo (HMC) with cold posterior (CP) tempering and cyclical annealing to sample diverse posterior solutions. By aggregating voxel-wise uncertainties across posterior samples, the method produces calibrated voxel-level estimates and an image-level failure score. It improves calibration and segmentation accuracy on in-domain cine data and remains robust under substantial domain shifts to quantitative MRI. The study shows that diversity in the functional space, captured via multi-modal HMC samples, correlates with better uncertainty estimates, and that image-level failure detection achieves high AUC (up to about 91%) across datasets. Overall, HMC-CP provides a principled, efficient path toward trustworthy DL in clinical cardiac imaging, addressing silent failures and enabling practical failure detection through an aggregated confidence score.
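The aggregation of voxel-wise uncertainty into an image-level score can be illustrated with a minimal sketch. It assumes that each posterior sample yields a softmax probability map, that predictive entropy of the averaged prediction serves as the voxel-wise uncertainty, and that the image-level score is the mean entropy over the image. These choices and the name `aggregate_uncertainty` are illustrative assumptions, not necessarily the measures used in the paper.

```python
import numpy as np

def aggregate_uncertainty(probs, eps=1e-12):
    """Hypothetical helper: combine S posterior samples into one prediction.

    probs: array of shape (S, C, H, W) with softmax outputs of S sampled networks.
    Returns the ensemble label map, a voxel-wise uncertainty map in [0, 1],
    and a scalar image-level score (higher means more likely to have failed).
    """
    mean_probs = probs.mean(axis=0)                 # (C, H, W) ensemble prediction
    segmentation = mean_probs.argmax(axis=0)        # (H, W) label map
    # Predictive entropy per voxel (assumed uncertainty measure).
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=0)
    entropy = entropy / np.log(probs.shape[1])      # normalise by log(C)
    image_score = float(entropy.mean())             # assumed aggregation: mean
    return segmentation, entropy, image_score
```

When all posterior samples agree on a confident label, the score is close to 0; when the averaged prediction is spread evenly across classes, it approaches 1.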
Abstract
Deep learning (DL)-based methods have achieved state-of-the-art performance for many medical image segmentation tasks. Nevertheless, recent studies show that deep neural networks (DNNs) can be miscalibrated and overconfident, leading to "silent failures" that are risky for clinical applications. Bayesian DL provides an intuitive approach to DL failure detection, based on posterior probability estimation. However, the posterior is intractable for large medical image segmentation DNNs. To tackle this challenge, we propose a Bayesian learning framework using Hamiltonian Monte Carlo (HMC), tempered by cold posterior (CP) to accommodate medical data augmentation, named HMC-CP. For HMC computation, we further propose a cyclical annealing strategy, capturing both local and global geometries of the posterior distribution, enabling highly efficient Bayesian DNN training with the same computational budget as training a single DNN. The resulting Bayesian DNN outputs an ensemble segmentation along with the segmentation uncertainty. We evaluate the proposed HMC-CP extensively on cardiac magnetic resonance imaging (MRI) segmentation, using in-domain steady-state free precession (SSFP) cine images as well as out-of-domain datasets of quantitative T1 and T2 mapping. Our results show that the proposed method improves both segmentation accuracy and uncertainty estimation for in- and out-of-domain data, compared with well-established baseline methods such as Monte Carlo Dropout and Deep Ensembles. Additionally, we establish a conceptual link between HMC and the commonly known stochastic gradient descent (SGD) and provide general insight into the uncertainty of DL. This uncertainty is implicitly encoded in the training dynamics but often overlooked. With reliable uncertainty estimation, our method provides a promising direction toward trustworthy DL in clinical applications.
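The combination of HMC, posterior tempering, and a cyclical schedule can be sketched on a toy problem. The example below is not the authors' implementation: it runs full-batch HMC with a Metropolis correction on a low-dimensional potential, whereas training a segmentation DNN would rely on mini-batch gradients. The cosine step-size schedule, the rule of keeping samples only from the later part of each cycle, and all function and parameter names are assumptions made for illustration. The temperature T divides the potential, so T < 1 gives a "cold" (sharpened) posterior.

```python
import numpy as np

def cyclical_step_size(k, cycle_len, eps_max, eps_min=1e-3):
    """Assumed cosine schedule, restarted each cycle: large steps first, small steps last."""
    pos = (k % cycle_len) / cycle_len
    return max(0.5 * eps_max * (np.cos(np.pi * pos) + 1.0), eps_min)

def hmc_cold_posterior(U, grad_U, theta0, n_iters=2000, n_cycles=4, eps_max=0.5,
                       n_leapfrog=10, temperature=1.0, sample_frac=0.5, seed=0):
    """Toy HMC targeting exp(-U(theta) / temperature) with a cyclical step size."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    cycle_len = n_iters // n_cycles
    samples = []
    for k in range(n_iters):
        eps = cyclical_step_size(k, cycle_len, eps_max)
        p = rng.standard_normal(theta.shape)           # resample momentum
        theta_new, p_new = theta.copy(), p.copy()
        # Leapfrog integration on the tempered potential U / T.
        p_new -= 0.5 * eps * grad_U(theta_new) / temperature
        for i in range(n_leapfrog):
            theta_new += eps * p_new
            if i < n_leapfrog - 1:
                p_new -= eps * grad_U(theta_new) / temperature
        p_new -= 0.5 * eps * grad_U(theta_new) / temperature
        # Metropolis accept/reject on the change in total energy.
        h_old = U(theta) / temperature + 0.5 * np.sum(p ** 2)
        h_new = U(theta_new) / temperature + 0.5 * np.sum(p_new ** 2)
        if np.log(rng.uniform()) < h_old - h_new:
            theta = theta_new
        # Keep samples only from the small-step part of each cycle (assumed rule).
        if (k % cycle_len) / cycle_len >= 1.0 - sample_frac:
            samples.append(theta.copy())
    return np.array(samples)

# Usage: standard normal target, U(theta) = 0.5 * ||theta||^2.
U = lambda t: 0.5 * np.sum(t ** 2)
grad_U = lambda t: t
warm = hmc_cold_posterior(U, grad_U, np.zeros(1), temperature=1.0)
cold = hmc_cold_posterior(U, grad_U, np.zeros(1), temperature=0.25)
```

In this toy setting, the samples drawn at T = 0.25 should be more concentrated around the mode than those drawn at T = 1, which is the sharpening effect that cold posterior tempering is meant to provide.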
