Conformal Prediction Sets for Instance Segmentation
Kerri Lu, Dan M. Kluger, Stephen Bates, Sherrie Wang
TL;DR
This work addresses the lack of principled uncertainty quantification in instance segmentation by introducing a conformal prediction framework that, given a pixel query, outputs an adaptive confidence set of masks. The guarantee is stated in terms of IoU: with probability $1-\alpha$, at least one mask in the set has IoU with the true instance mask exceeding a threshold $\tau$. The method varies a tunable model parameter over a calibration set, selects a minimal cover of parameter values, and post-processes the result to remove near-duplicates, yielding sets of masks that are diverse yet informative and that adapt to query difficulty. The approach comes with both asymptotic and finite-sample guarantees and demonstrates improved coverage relative to Learn Then Test, Conformal Risk Control, and dilation baselines across agricultural field delineation, cell segmentation, and vehicle detection. It enables reliable, interpretable uncertainty in practical segmentation tasks and highlights the value of predictive diversity when image ambiguity is high, with potential impact on downstream decision-making in domains such as agriculture, biology, and autonomous driving.
Abstract
Current instance segmentation models achieve high performance on average predictions, but lack principled uncertainty quantification: their outputs are not calibrated, and there is no guarantee that a predicted mask is close to the ground truth. To address this limitation, we introduce a conformal prediction algorithm to generate adaptive confidence sets for instance segmentation. Given an image and a pixel coordinate query, our algorithm generates a confidence set of instance predictions for that pixel, with a provable guarantee on the probability that at least one of the predictions has high Intersection-over-Union (IoU) with the true object instance mask. We apply our algorithm to instance segmentation examples in agricultural field delineation, cell segmentation, and vehicle detection. Empirically, we find that our prediction sets vary in size based on query difficulty and attain the target coverage, outperforming existing baselines such as Learn Then Test, Conformal Risk Control, and morphological dilation-based methods. We provide versions of the algorithm with asymptotic and finite-sample guarantees.
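To make the coverage event concrete, the following is a minimal sketch of how one might check whether a confidence set "covers" a query and estimate empirical coverage over a test set. It is not the paper's implementation: the function names (`iou`, `is_covered`, `empirical_coverage`), the representation of masks as boolean NumPy arrays, and the choice of `>=` for the threshold comparison are assumptions made for illustration.

```python
import numpy as np


def iou(mask_a, mask_b):
    """Intersection-over-Union of two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumption: two empty masks are treated as having IoU 0.
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def is_covered(prediction_set, true_mask, tau):
    """True if at least one mask in the set has IoU >= tau with the true mask.

    Assumption: the threshold comparison is non-strict (>=).
    """
    return any(iou(m, true_mask) >= tau for m in prediction_set)


def empirical_coverage(prediction_sets, true_masks, tau):
    """Fraction of queries whose prediction set covers the true mask."""
    hits = [is_covered(s, t, tau) for s, t in zip(prediction_sets, true_masks)]
    return float(np.mean(hits))


# Toy example: a 4x4 image whose true instance occupies the top-left 2x2 block.
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[0:2, 0:2] = True

close_pred = np.zeros((4, 4), dtype=bool)
close_pred[0:2, 0:3] = True   # overlaps the true mask, one extra column

far_pred = np.zeros((4, 4), dtype=bool)
far_pred[2:4, 2:4] = True     # disjoint from the true mask

print(is_covered([far_pred, close_pred], true_mask, tau=0.5))
print(is_covered([far_pred], true_mask, tau=0.5))
```

Under this reading, a method attains the target coverage when `empirical_coverage` on held-out queries is close to or above $1-\alpha$. A set is counted as covering if any single member is close enough to the truth, which is why a diverse set can help on ambiguous queries.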
