Principled Confidence Estimation for Deep Computed Tomography

Matteo Gätzner; Johannes Kirschner

TL;DR

This work tackles the lack of uncertainty quantification in deep CT reconstructions by introducing principled confidence estimation via sequential likelihood mixing, yielding anytime-valid confidence sets under a realistic Beer-Lambert Poisson forward model. It integrates deep priors, including U-Net ensembles and diffusion models, as mixing distributions to produce substantially tighter confidence regions without sacrificing coverage. The framework supports practical tools such as hallucination detection and interpretable pixel-wise uncertainty visualizations, demonstrated across medical, industrial, and materials CT datasets. By connecting sequential statistics with modern generative priors, the paper offers a pathway to trustworthy, uncertainty-aware deep CT reconstructions suitable for safety-critical applications.

Abstract

We present a principled framework for confidence estimation in computed tomography (CT) reconstruction. Building on the sequential likelihood mixing framework (Kirschner et al., 2025), we establish confidence regions with theoretical coverage guarantees for deep-learning-based CT reconstructions. We consider a realistic forward model following the Beer-Lambert law, i.e., a log-linear forward model with Poisson noise, which closely reflects clinical and scientific imaging conditions. The framework is general and applies to both classical algorithms and deep learning reconstruction methods, including U-Nets, U-Net ensembles, and generative diffusion models. Empirically, we demonstrate that deep reconstruction methods yield substantially tighter confidence regions than classical reconstructions, without sacrificing theoretical coverage guarantees. Our approach enables the detection of hallucinations in reconstructed images and provides interpretable visualizations of confidence regions. This establishes deep models not only as powerful estimators, but also as reliable tools for uncertainty-aware medical imaging.
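As a rough illustration of the forward model named in the abstract, the sketch below simulates Poisson photon counts under the Beer-Lambert law and evaluates the corresponding negative log-likelihood. The projection matrix `A`, the image size, the per-measurement intensity, and the function names are hypothetical choices for a toy example; they are not taken from the paper.

```python
import numpy as np

def simulate_counts(A, x, intensity, rng):
    """Draw counts y_i ~ Poisson(intensity * exp(-(A x)_i)) (Beer-Lambert law)."""
    rate = intensity * np.exp(-A @ x)
    return rng.poisson(rate)

def neg_log_likelihood(A, x, y, intensity):
    """Poisson negative log-likelihood, omitting the log(y!) term that is constant in x."""
    proj = A @ x                              # line integrals of the attenuation image
    rate = intensity * np.exp(-proj)          # expected photon counts
    log_rate = np.log(intensity) - proj       # log of the expected counts
    return float(np.sum(rate - y * log_rate))

rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(50, 8))       # toy projection matrix (assumption)
x_true = rng.uniform(0.0, 0.5, size=8)        # toy attenuation image (assumption)
y = simulate_counts(A, x_true, intensity=1e4, rng=rng)

nll_true = neg_log_likelihood(A, x_true, y, 1e4)
nll_zero = neg_log_likelihood(A, np.zeros(8), y, 1e4)
```

In this toy setup the true image attains a much lower negative log-likelihood than an all-zero image, which is the quantity that likelihood-based confidence sets compare against a threshold.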

Paper Structure (41 sections, 1 theorem, 16 equations, 17 figures)

Introduction
Contributions.
Related Work
Uncertainty Quantification in Tomography.
Confidence Sequences and Sequential Inference.
Setting
Physical Forward Model and Likelihood
Sequential Data Acquisition Protocol
Methodology
Sequential Likelihood Mixing
Uncertainty Estimates for Generative Models
Experimental Results
Datasets
Reconstruction Methods and Mixing Strategies
Evaluation Protocol
...and 26 more sections

Key Result

Proposition 3.2

Let $(\mu_s)_{s \in \mathbb{N}_0}$ be a sequence of distributions on $\mathcal{X}$, where for each $s \in \mathbb{N}$, $\mu_{s}$ depends only on data observed up to step $s$. For any error level $\delta \in (0,1)$, define the threshold $\beta_t$ [...]. Then $(C_t)_{t=1}^\infty$ with $C_t \coloneqq \left\{ \mathbf{x} \in \mathcal{X} \mid L_t(\mathbf{x}) \leq \beta_{t} + \log \frac{1}{\delta}\right\}$ for all [...]

Figures (17)

Figure 1: Ground truth samples, measurement data (sinogram), and final reconstructions at total intensity $I_{\text{total}} = 10^7$. While classical algorithms (FBP, MLE) exhibit significant noise and artifacts, the deep learning predictors (U-Net, U-Net Ensemble, Diffusion) successfully recover fine structural details and sharp edges.
Figure 2: PSNR of the final reconstruction vs. total intensity.
Figure 3: Difference between sequential negative log-likelihood and ground truth image negative log-likelihood $\beta_{t_\text{final}} - L_{t_\text{final}}({\mathbf{x}}^\ast)$. Shaded regions indicate mean $\pm$ SEM (100 test set images, 10 seeds).
Figure 4: Crossover and exclusion rates vs. total intensity for the Lamino dataset at error level $\delta=0.05$, corresponding to type-I and type-II error rates. Crossover rate: rate at which the ground truth image falls outside at least one confidence set of the confidence sequence. Exclusion rate: rate at which the rotated ground truth image is excluded from the last confidence set. Shaded regions indicate mean $\pm$ SEM (crossover rate: 100 test set images and 10 seeds, exclusion rate: 100 test set images and one seed).
Figure 5: Pixel-wise confidence interval widths and coverage rates of the ground truth pixels computed for $\delta=0.05$. The Worst-Case method is overly conservative ($\approx 99\%$), while Diffusion $C_t$-Boundary Sampling maintains high coverage with significantly tighter bounds. Shaded regions indicate mean $\pm$ standard deviation (100 test set images).
...and 12 more figures
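The crossover rate described in the Figure 4 caption can be computed from a table of set memberships. The sketch below assumes a hypothetical boolean array `inside[i, t]` that records whether the ground truth of run `i` lies in confidence set $C_t$; the array layout and function name are illustrative, not the authors' evaluation code.

```python
import numpy as np

def crossover_rate(inside):
    """Return (rate, SEM) of runs whose ground truth leaves at least one set.

    inside[i, t] is True when the ground truth of run i lies in C_t.
    """
    inside = np.asarray(inside, dtype=bool)
    crossed = ~inside.all(axis=1)                 # True if any C_t excludes the truth
    rate = crossed.mean()
    sem = crossed.std(ddof=1) / np.sqrt(len(crossed))
    return float(rate), float(sem)

# Four toy runs with three confidence sets each; only the second run crosses.
inside = [[True, True, True],
          [True, False, True],
          [True, True, True],
          [True, True, True]]
rate, sem = crossover_rate(inside)
```

For a valid confidence sequence at level $\delta$, this rate should stay at or below $\delta$ up to sampling error.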

Theorems & Definitions (2)

Definition 3.1: Confidence Sequence
Proposition 3.2: Sequential Likelihood Mixing for CT
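To make the construction in Proposition 3.2 concrete, here is a minimal sketch for a finite set of candidate images. The threshold formula is not given in the text above, so the code assumes the usual sequential-mixing form $\beta_t = -\sum_{s=1}^{t} \log \int p(y_s \mid \mathbf{x}) \, d\mu_{s-1}(\mathbf{x})$, which matches the Figure 3 caption's description of $\beta_t$ as a sequential negative log-likelihood. The Bayesian reweighting of candidates, the toy projection matrices, and all function names are likewise assumptions made for illustration.

```python
import numpy as np

def step_log_lik(A_s, x, y_s, intensity):
    """Poisson log-likelihood of one batch, omitting log(y!) (it cancels in the test)."""
    proj = A_s @ x
    return float(np.sum(y_s * (np.log(intensity) - proj) - intensity * np.exp(-proj)))

def confidence_sequence(A_steps, y_steps, candidates, x_query, intensity, delta=0.05):
    """Return, for each step t, whether x_query lies in C_t."""
    log_w = np.full(len(candidates), -np.log(len(candidates)))  # mu_0: uniform weights
    beta, L_query, inside = 0.0, 0.0, []
    for A_s, y_s in zip(A_steps, y_steps):
        ll = np.array([step_log_lik(A_s, c, y_s, intensity) for c in candidates])
        joint = log_w + ll
        log_pred = np.logaddexp.reduce(joint)   # log sum_k w_k p(y_s | x_k) under mu_{s-1}
        beta -= log_pred                        # assumed threshold update
        L_query -= step_log_lik(A_s, x_query, y_s, intensity)
        inside.append(bool(L_query <= beta + np.log(1.0 / delta)))
        log_w = joint - log_pred                # mu_s: uses data up to step s only
    return inside

rng = np.random.default_rng(0)
intensity, n_steps = 1e4, 5
x_true = rng.uniform(0.0, 0.5, size=8)
x_wrong = x_true + 0.2
# Toy candidates; in practice these would be samples from a deep prior.
candidates = [x_true, x_wrong, x_true + 0.4]
A_steps = [rng.uniform(0.0, 1.0, size=(20, 8)) for _ in range(n_steps)]
y_steps = [rng.poisson(intensity * np.exp(-A @ x_true)) for A in A_steps]

inside_true = confidence_sequence(A_steps, y_steps, candidates, x_true, intensity)
inside_wrong = confidence_sequence(A_steps, y_steps, candidates, x_wrong, intensity)
```

The ground truth is placed among the candidates only to keep the toy example well behaved; the coverage argument relies on each $\mu_{s-1}$ being fixed before $y_s$ is observed, not on the truth being in its support. In this setup the true image stays inside every set, while the shifted image is excluded.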
