Bias-Aware Conformal Prediction for Metric-Based Imaging Pipelines
Matt Y. Cheung, Tucker J. Netherton, Laurence E. Court, Ashok Veeraraghavan, Guha Balakrishnan
TL;DR
The paper tackles the problem that conventional conformal prediction (CP) intervals lose efficiency in imaging pipelines when mismatches between reconstruction objectives and downstream metrics introduce bias. It develops a bias-aware theoretical framework for Split CP with $L_1$ and CQR non-conformity scores, showing that the upper bound on symmetric interval length grows by $2|b|$ under an additive bias $b$, while asymmetric interval lengths remain unchanged. Empirical validation on sparse-view CT for radiotherapy planning confirms the theory and demonstrates that asymmetric formulations can yield shorter, yet still valid, prediction intervals when bias is present. The work provides actionable guidance for bias-aware uncertainty quantification in medical imaging pipelines, supporting more confident decisions in high-stakes clinical settings.
Abstract
Reliable confidence measures of metrics derived from medical imaging reconstruction pipelines would improve the standard of decision-making in many clinical workflows. Conformal Prediction (CP) provides a robust framework for producing calibrated prediction intervals, but standard CP formulations face a critical challenge in the imaging pipeline: common mismatches between image reconstruction objectives and downstream metrics can introduce systematic deviations of predictions from ground truth values, known as bias. These biases in turn compromise the efficiency of prediction intervals, a problem that remains unexplored in the CP literature. In this study, we formalize the behavior of symmetric (where bounds expand equally in both directions) and asymmetric (where bounds expand unequally) formulations for common non-conformity scores in CP in the presence of bias, and argue that this measurable bias must inform the choice of CP formulation. We theoretically and empirically demonstrate that symmetric intervals are inflated by twice the magnitude of the bias while asymmetric intervals remain unaffected by bias, and provide conditions under which each formulation produces tighter intervals. We empirically validate our theoretical analyses on sparse-view CT reconstruction for downstream radiotherapy planning. Our work enables users of medical imaging pipelines to proactively select optimal CP formulations, thereby improving interval length efficiency for critical downstream metrics.
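The contrast between the two formulations can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation: the Gaussian data, the noise scale, the bias value `b`, and the helper names (`conformal_quantile`, `symmetric_length`, `asymmetric_length`) are all assumptions chosen for illustration, and the asymmetric variant shown is one common construction that calibrates the lower and upper residuals separately at level $1 - \alpha/2$. Only the $L_1$ score is sketched; CQR is omitted.

```python
import math
import numpy as np


def conformal_quantile(scores, level):
    """Empirical quantile with the usual split-CP finite-sample correction.

    Simplification for this sketch: the rank is capped at n (strictly, the
    quantile is infinite when ceil((n + 1) * level) exceeds n).
    """
    n = len(scores)
    k = min(math.ceil((n + 1) * level), n)
    return np.sort(scores)[k - 1]


def symmetric_length(y, y_hat, alpha):
    """Length of the symmetric L1 interval [y_hat - q, y_hat + q]."""
    q = conformal_quantile(np.abs(y - y_hat), 1 - alpha)
    return 2 * q


def asymmetric_length(y, y_hat, alpha):
    """Length of the asymmetric interval [y_hat - q_lo, y_hat + q_hi]."""
    q_lo = conformal_quantile(y_hat - y, 1 - alpha / 2)
    q_hi = conformal_quantile(y - y_hat, 1 - alpha / 2)
    return q_lo + q_hi


# Hypothetical calibration set: true metric values and noisy predictions.
rng = np.random.default_rng(0)
n, alpha, b = 1000, 0.1, 3.0
y = rng.normal(size=n)
y_hat = y + rng.normal(scale=0.5, size=n)  # unbiased predictions
y_hat_biased = y_hat + b                   # same predictions plus additive bias b

sym_0 = symmetric_length(y, y_hat, alpha)
sym_b = symmetric_length(y, y_hat_biased, alpha)
asym_0 = asymmetric_length(y, y_hat, alpha)
asym_b = asymmetric_length(y, y_hat_biased, alpha)

print(f"symmetric:  no bias {sym_0:.3f}, bias {sym_b:.3f}, bound {sym_0 + 2 * abs(b):.3f}")
print(f"asymmetric: no bias {asym_0:.3f}, bias {asym_b:.3f}")
```

In this sketch the additive shift moves `q_lo` up by `b` and `q_hi` down by `b`, so the asymmetric length is unchanged, whereas the symmetric score satisfies $|r - b| \le |r| + |b|$ for each residual $r$, so the symmetric length can grow by at most $2|b|$.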
