Credal Concept Bottleneck Models: Structural Separation of Epistemic and Aleatoric Uncertainty
Tanmoy Mukherjee, Marius Kloft, Pierre Marquis, Zied Bouraoui
TL;DR
This work addresses the challenge of disentangling epistemic uncertainty (EU) from aleatoric uncertainty (AU) in predictive models. It introduces Credal Concept Bottleneck Models (Credal CBMs), which represent uncertainty as ellipsoidal credal sets parameterized in logit space: EU is determined by the size of the credal set and AU by the noise within it. Structural separation is enforced through a frozen encoder and disjoint gradient signals across three heads. A Credal ELBO with a Hausdorff KL regularizer keeps the credal set well-behaved while preserving gradient isolation, and the paper gives theoretical guarantees of gradient separation and decorrelation. Empirically, the method achieves near-zero EU–AU correlation across multiple datasets (CeBaB, GoEmotions, MAQA*, AmbigQA*), improves the alignment of EU with prediction error and of AU with ground-truth ambiguity, and supports actionable quadrant-based routing for downstream tasks. The approach offers a design principle for trustworthy AI: decisions such as data collection, human review, or abstention can be targeted at the source of uncertainty. The work also outlines practical considerations and limitations for deployment.
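The quadrant-based routing mentioned above can be illustrated with a minimal sketch. The function name `route`, the thresholds, and the mapping from quadrants to actions are assumptions made for illustration only; the summary names the actions (data collection, human review, abstention) but does not specify which quadrant triggers which.

```python
from typing import Literal

Route = Literal["auto_predict", "collect_data", "human_review", "abstain"]


def route(eu: float, au: float,
          eu_thresh: float = 0.5, au_thresh: float = 0.5) -> Route:
    """Map an (EU, AU) pair to a hypothetical downstream action.

    Thresholds and the quadrant-to-action mapping are illustrative
    assumptions, not values taken from the paper.
    """
    high_eu = eu > eu_thresh
    high_au = au > au_thresh
    if high_eu and high_au:
        # Model is unsure and the input is ambiguous: decline to answer.
        return "abstain"
    if high_eu:
        # Model ignorance dominates: more training data may help.
        return "collect_data"
    if high_au:
        # Data ambiguity dominates: more data is unlikely to help, ask a human.
        return "human_review"
    # Both low: accept the model's prediction.
    return "auto_predict"
```

Routing of this kind is only meaningful when EU and AU are decorrelated: if the two scores move together, nearly all inputs fall into the low/low or high/high quadrants and the off-diagonal actions are rarely selected.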
Abstract
Decomposing predictive uncertainty into epistemic (model ignorance) and aleatoric (data ambiguity) components is central to reliable decision making, yet most methods estimate both from the same predictive distribution. Recent empirical and theoretical results show these estimates are typically strongly correlated, so changes in predictive spread simultaneously affect both components and blur their semantics. We propose a credal-set formulation in which uncertainty is represented as a set of predictive distributions, so that epistemic and aleatoric uncertainty correspond to distinct geometric properties: the size of the set versus the noise within its elements. We instantiate this idea in a Variational Credal Concept Bottleneck Model with two disjoint uncertainty heads trained by disjoint objectives and non-overlapping gradient paths, yielding separation by construction rather than post hoc decomposition. Across multi-annotator benchmarks, our approach reduces the correlation between epistemic and aleatoric uncertainty by over an order of magnitude compared to standard methods, while improving the alignment of epistemic uncertainty with prediction error and aleatoric uncertainty with ground-truth ambiguity.
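To make the geometric reading concrete, the following sketch treats a credal set as an axis-aligned ellipsoid in logit space and computes one quantity from its size and another from the noise of a representative element. The axis-aligned shape, the log-volume proxy for epistemic uncertainty, and the use of the entropy at the ellipsoid's center as the aleatoric proxy are all simplifying assumptions for illustration; they are not the paper's definitions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def epistemic_proxy(radii):
    """Size of the set: log-volume of an axis-aligned ellipsoid,
    up to an additive constant (sum of log semi-axis lengths)."""
    return sum(math.log(r) for r in radii)


def aleatoric_proxy(center):
    """Noise within an element: entropy of the distribution at the
    ellipsoid's center, used here as a representative element."""
    return entropy(softmax(center))


center = [2.0, 0.0, -1.0]
narrow, wide = [0.1, 0.1, 0.1], [1.0, 1.0, 1.0]

# Growing the set changes the epistemic proxy only.
print(epistemic_proxy(narrow), epistemic_proxy(wide))
# Flattening the center changes the aleatoric proxy only.
print(aleatoric_proxy(center), aleatoric_proxy([0.0, 0.0, 0.0]))
```

In this toy setup the two quantities depend on disjoint parameters (radii versus center), which mirrors, in a much simplified form, the idea of separation by construction rather than by post hoc decomposition of a single predictive distribution.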
