Conformalized Credal Regions for Classification with Ambiguous Ground Truth
Michele Caprio, David Stutz, Shuo Li, Arnaud Doucet
TL;DR
This work extends conformal prediction to credal regions in probability space, enabling the empirical construction of closed, convex predictive regions for classification with ambiguous ground truth. By representing ground-truth uncertainty with plausibility regions $c(X_{n+1})$ and linking them to credal regions $\mathcal{P} = \{\text{Cat}(\lambda): \lambda \in c(X_{n+1})\}$, the authors obtain coverage guarantees at level $1-\alpha$ and introduce Imprecise Highest Density Sets (IHDS), which yield narrower predictive label sets with a $(1-\delta)(1-\alpha)$-type bound. The approach disentangles epistemic and aleatoric uncertainty and offers a practical, more efficient alternative to standard conformal prediction: it produces smaller prediction sets while maintaining calibration, as demonstrated on synthetic and real datasets with ambiguous ground truth. The results support robust uncertainty quantification in Imprecise Probabilistic Machine Learning (IPML) and offer a principled framework for handling annotation imprecision in supervised learning.
Abstract
An open question in \emph{Imprecise Probabilistic Machine Learning} is how to empirically derive a credal region (i.e., a closed and convex family of probabilities on the output space) from the available data, without any prior knowledge or assumption. In classification problems, credal regions can provide provable guarantees under realistic assumptions by characterizing the uncertainty about the distribution of the labels. Building on previous work, we show that credal regions can be directly constructed using conformal methods. This allows us to provide a novel extension of classical conformal prediction to problems with ambiguous ground truth, that is, when the exact labels for given inputs are not known. The resulting construction enjoys desirable practical and theoretical properties: (i) conformal coverage guarantees, (ii) smaller prediction sets (compared to classical conformal prediction regions), and (iii) disentanglement of uncertainty sources (epistemic, aleatoric). We empirically verify our findings on both synthetic and real datasets.
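For context, the classical baseline that the abstract compares against can be sketched as split conformal prediction for classification with unambiguous labels. This is a minimal illustration of that standard technique, not the authors' credal-region construction; the function names `conformal_threshold` and `prediction_set`, and the nonconformity score $1 - p(y)$, are assumptions chosen for the example.

```python
import math
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold from calibration scores s_i = 1 - p_i(y_i).

    cal_probs:  (n, K) array of predicted class probabilities (hypothetical model output).
    cal_labels: (n,) array of integer labels, assumed exactly known here.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return 1.0
    return float(np.sort(scores)[k - 1])

def prediction_set(probs, threshold):
    """Indices of labels whose score 1 - p(y) does not exceed the threshold."""
    return np.flatnonzero(1.0 - probs <= threshold)

# Toy calibration data (made up for illustration only).
cal_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.5, 0.4, 0.1],
])
cal_labels = np.array([0, 1, 2, 1])

tau = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
labels = prediction_set(np.array([0.5, 0.45, 0.05]), tau)
```

Under exchangeability of calibration and test points, sets built this way contain the true label with probability at least $1-\alpha$. The setting described in the abstract differs in that the calibration labels themselves are ambiguous, so the conformal step is applied to regions of label distributions rather than to single labels.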
