CARE: Confidence-aware Ratio Estimation for Medical Biomarkers

Jiameng Li, Teodora Popordanoska, Aleksei Tiulpin, Sebastian G. Gruber, Frederik Maes, Matthew B. Blaschko

Abstract

Ratio-based biomarkers (RBBs), such as the proportion of necrotic tissue within a tumor, are widely used in clinical practice to support diagnosis, prognosis, and treatment planning. These biomarkers are typically estimated from segmentation outputs by computing region-wise ratios. Despite the high-stakes nature of clinical decision-making, existing methods provide only point estimates and offer no measure of uncertainty. In this work, we propose a unified confidence-aware framework for estimating ratio-based biomarkers. Our uncertainty analysis stems from two observations: (1) the probability ratio estimator inherently admits a statistical confidence interval that accounts for local randomness (bias and variance); (2) the segmentation network is not perfectly calibrated (calibration error). We perform a systematic analysis of error propagation in the segmentation-to-biomarker pipeline and identify model miscalibration as the dominant source of uncertainty. Extensive experiments show that our method produces statistically sound confidence intervals with tunable confidence levels, enabling more trustworthy application of segmentation-derived RBBs in clinical workflows.
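To make the region-wise ratio in the abstract concrete, the following is a minimal sketch of a point estimate computed from segmentation outputs. It is an illustration under stated assumptions, not the paper's Definition 1: the function names, the `eps` guard, and the choice of summing per-voxel probabilities (a "soft" volume) are all hypothetical.

```python
import numpy as np

def ratio_from_masks(m_sub, m_whole):
    """Hard ratio: voxel count of the sub-region over voxel count of the whole region."""
    m_sub = np.asarray(m_sub, dtype=bool)
    m_whole = np.asarray(m_whole, dtype=bool)
    denom = m_whole.sum()
    # An empty whole region leaves the ratio undefined.
    return float(m_sub.sum() / denom) if denom > 0 else float("nan")

def ratio_from_probs(p_sub, p_whole, eps=1e-8):
    """Soft ratio (assumed form): sum of sub-region probabilities over
    sum of whole-region probabilities, with eps guarding against division by zero."""
    p_sub = np.asarray(p_sub, dtype=float)
    p_whole = np.asarray(p_whole, dtype=float)
    return float(p_sub.sum() / (p_whole.sum() + eps))

# Toy example: 4 tumor voxels, 1 of them necrotic.
print(ratio_from_masks([1, 0, 0, 0], [1, 1, 1, 1]))
print(ratio_from_probs([0.5, 0.5, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]))
```

Both calls return a single number per scan, which is exactly the kind of point estimate that carries no uncertainty information on its own.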

Paper Structure

This paper contains 38 sections, 7 theorems, 39 equations, 7 figures, 5 tables.

Key Result

proposition 1

[romano2019conformalized] Given ground-truth $r_{\mathrm{gt}}$, prediction $\hat{r}$, and the absolute error residual $e_r \coloneqq |r_{\mathrm{gt}} - \hat{r}|$, let $q_{e_r, \delta}$ denote the $\frac{n+1}{n}(1-\delta)$ quantile of the instance-wise $e_r$ on a validation set $\mathcal{D}_\text{val}$ ...
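The quantile in this proposition can be sketched in a few lines. The sketch assumes the usual split-conformal reading, in which the $\frac{n+1}{n}(1-\delta)$ empirical quantile of $n$ residuals is the $\lceil (n+1)(1-\delta) \rceil$-th smallest residual, and in which the resulting interval is $\hat{r} \pm q_{e_r,\delta}$. The function names are hypothetical, and clipping to $[0, 1]$ is an added assumption that the biomarker is a proportion.

```python
import math
import numpy as np

def residual_quantile(r_gt_val, r_hat_val, delta):
    """Return the ceil((n+1)(1-delta))-th smallest absolute residual on the validation set."""
    e = np.sort(np.abs(np.asarray(r_gt_val, dtype=float) - np.asarray(r_hat_val, dtype=float)))
    n = e.size
    k = math.ceil((n + 1) * (1.0 - delta))
    if k > n:
        # Too few validation samples for this confidence level: no finite bound exists.
        return math.inf
    return float(e[max(k, 1) - 1])

def conformal_interval(r_hat, q, lo=0.0, hi=1.0):
    """Symmetric interval r_hat +/- q, clipped to [lo, hi] (assumed valid range of a ratio)."""
    return max(lo, r_hat - q), min(hi, r_hat + q)

# Toy validation set: nine residuals 0.01, 0.02, ..., 0.09.
r_gt_val = [0.01 * i for i in range(1, 10)]
r_hat_val = [0.0] * 9
q = residual_quantile(r_gt_val, r_hat_val, delta=0.5)
print(q, conformal_interval(0.3, q))
```

Note that the width $2q$ is the same for every test case, which is why a residual-based interval of this kind cannot adapt to how hard an individual sample is.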

Figures (7)

  • Figure 1: Examples of ratio-based biomarkers and their roles in clinical support. (a) Ratio-based biomarkers [baid2021rsna, myronenko2023automated] exist in many organs and modalities. (b) An illustrative example where a high-risk threshold is defined as $0.25$; CARE calls for a human check when the confidence interval crosses the threshold.
  • Figure 2: Overview. In automated medical imaging analysis, biomarkers are often computed from network predictions. To quantify the uncertainty of ratio-based biomarkers, we introduce CARE, a confidence-aware estimation method that provides reliable confidence intervals.
  • Figure 3: Comparison of adaptiveness on nnUNet$_\mathrm{3d}$ ($C=0.68$). (a) Frequency histogram of NTR intervals on the test set. ACQR's intervals frequently lie around the middle of the range, while CARE generally has tighter bounds. (b) Average interval width in three groups categorized by tumor size. Intuitively, interval width should follow the trend of MSE$_\mathrm{r}$. Compared with CQR, whose widths are indistinguishable across groups, and the overconservative ACQR, CARE is appropriately wider for small tumors (hard samples) and tighter for large ones (easy samples).
  • Figure 4: Further study on MSD-Task01 and nnUNet$_\mathrm{3d}$ ($C=0.68$). (a) CARE consistently attains the desired confidence levels. (b) As the temperature moves towards better calibration (ECE $\downarrow$), our interval becomes narrower (Interval $\downarrow$). (c) Miscalibration is the main contributor to the overall uncertainty, since the ECE-only interval $I_\mathrm{ECE}$ accounts for most of the overall interval $I_\mathrm{O}$.
  • Figure A: Our confidence interval considering estimation and miscalibration. (a) shows Markov bounds from the estimator. (b) illustrates the prediction offset $\epsilon_{l,u}$ due to miscalibration. (c) is the overall confidence interval $r \in [B_l,B_u]$.
  • ...and 2 more figures
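
The decision rule illustrated in Figure 1(b) can be sketched as a small triage function. This is a hedged illustration only: the function name, the returned labels, and the inclusive treatment of an interval endpoint that exactly touches the threshold are assumptions; only the threshold value $0.25$ comes from the caption.

```python
def triage(lower, upper, threshold=0.25):
    """Classify a confidence interval [lower, upper] against a risk threshold.

    Returns "review" when the interval contains the threshold (decision is uncertain),
    "above" when the whole interval lies above it, and "below" otherwise.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower <= threshold <= upper:
        return "review"
    return "above" if lower > threshold else "below"

# Three toy intervals around the 0.25 threshold.
for interval in [(0.20, 0.30), (0.30, 0.40), (0.05, 0.20)]:
    print(interval, triage(*interval))
```

A narrower interval sends fewer cases to review for the same threshold, which is the practical reason interval tightness matters alongside coverage.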

Theorems & Definitions (13)

  • definition 1: Ratio from Segmentation Networks
  • definition 2: Confidence Interval
  • definition 3: Empirical Coverage Rate
  • proposition 1: Conformalized Quantile Regression (CQR)
  • proposition 2: Adaptive Conformalized Quantile Regression (ACQR)
  • remark 1: Uncertainty Measure in Tumors
  • definition 4: Volume Bias (V-Bias)
  • definition 5: Calibration Error (CE)
  • proposition 3: The Relationship of V-Bias and CE
  • proposition 4: Estimation-based Confidence Interval
  • ...and 3 more
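
As a rough illustration of how the empirical coverage rate named in definition 3 might be evaluated, the sketch below takes it to be the fraction of test cases whose ground-truth ratio falls inside the predicted interval. That reading is an assumption, since the definition itself is not reproduced here, and the function names are hypothetical. Mean interval width is included because coverage alone can be satisfied by trivially wide intervals.

```python
import numpy as np

def empirical_coverage(r_gt, lower, upper):
    """Fraction of cases with lower <= r_gt <= upper (assumed reading of coverage rate)."""
    r_gt, lower, upper = (np.asarray(a, dtype=float) for a in (r_gt, lower, upper))
    return float(np.mean((lower <= r_gt) & (r_gt <= upper)))

def mean_interval_width(lower, upper):
    """Average width of the intervals; smaller is better at equal coverage."""
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    return float(np.mean(upper - lower))

# Toy test set: the second interval misses its ground truth.
r_gt = [0.10, 0.50, 0.90]
lower = [0.00, 0.60, 0.80]
upper = [0.20, 0.70, 1.00]
print(empirical_coverage(r_gt, lower, upper), mean_interval_width(lower, upper))
```

An interval method would be judged sound at confidence level $C$ if its empirical coverage on held-out data is at least about $C$, for example $0.68$ in the settings reported in Figures 3 and 4.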