Measuring Orthogonality as the Blind-Spot of Uncertainty Disentanglement
Ivo Pascal de Jong, Andreea Ioana Sburlea, Matthia Sabatelli, Matias Valdenegro-Toro
TL;DR
This work tackles the blind spot in uncertainty disentanglement by insisting on orthogonal separation between aleatoric ($U_a$) and epistemic ($U_e$) uncertainties and introducing Uncertainty Disentanglement Error (UDE) to quantify adherence to both consistency and orthogonality. It analyzes Gaussian Logits and Information Theoretic disentangling, revealing that total uncertainty formulations can mask leakage between sources, and demonstrates that orthogonality is not guaranteed even for state-of-the-art methods. Through controlled experiments manipulating dataset size and label noise across multiple domains and Bayesian UQ approaches, the authors show that Information Theoretic disentangling often yields better consistency and partial orthogonality, particularly for $U_e$, but fails to achieve full orthogonality for $U_a$, especially on large-scale data like ImageNet-1k. The paper proposes UDE as a practical metric for evaluating disentanglement quality and highlights that training regime (from scratch vs pretrained) substantially affects results, emphasizing caution when applying disentangled uncertainties in high-stakes decisions.
Abstract
Aleatoric (data) and epistemic (knowledge) uncertainty are textbook components of Uncertainty Quantification. Jointly estimating these components has been shown to be problematic and non-trivial. As a result, there are multiple ways to disentangle these uncertainties, but current methods to evaluate them are insufficient. We propose that aleatoric and epistemic uncertainty estimates should be orthogonally disentangled, meaning that each uncertainty is not affected by the other. This is a necessary condition that is often not met. We prove that orthogonality and consistency are necessary and sufficient criteria for disentanglement, and construct Uncertainty Disentanglement Error (UDE) as a metric to measure these criteria. Further empirical evaluation shows that finetuned models give different orthogonality results than models trained from scratch, and that UDE can be optimized by tuning the dropout rate. We demonstrate that a Deep Ensemble trained from scratch on ImageNet-1k with Information Theoretic disentangling achieves consistent and orthogonal estimates of epistemic uncertainty, but its estimates of aleatoric uncertainty still fail on orthogonality.
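For orientation, Information Theoretic disentangling is commonly computed from the predictive distributions of several ensemble members (or stochastic forward passes): total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the individual predictions, and epistemic uncertainty is their difference (the mutual information). The snippet below is a minimal sketch of that standard decomposition, not the authors' implementation. The function name `information_theoretic_disentangle`, the array layout `(n_members, n_samples, n_classes)`, and the `eps` smoothing term are assumptions made for illustration, and the sketch does not compute UDE.

```python
import numpy as np


def information_theoretic_disentangle(probs, eps=1e-12):
    """Split predictive uncertainty into total, aleatoric and epistemic parts.

    probs: array of shape (n_members, n_samples, n_classes) holding the
           softmax outputs of each ensemble member (assumed layout).
    eps:   small constant to avoid log(0) (illustrative choice).
    """
    probs = np.asarray(probs, dtype=float)

    # Total uncertainty: entropy of the ensemble-averaged prediction.
    mean_probs = probs.mean(axis=0)
    total = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)

    # Aleatoric uncertainty: average entropy of each member's prediction.
    member_entropy = -(probs * np.log(probs + eps)).sum(axis=-1)
    aleatoric = member_entropy.mean(axis=0)

    # Epistemic uncertainty: mutual information = total - aleatoric.
    epistemic = total - aleatoric
    return total, aleatoric, epistemic


# Two members, two inputs, two classes.
# Input 0: both members predict 50/50 (pure aleatoric uncertainty).
# Input 1: members are confident but disagree (pure epistemic uncertainty).
example = np.array([
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
total, aleatoric, epistemic = information_theoretic_disentangle(example)
```

In this toy example the first input has high $U_a$ and near-zero $U_e$, while the second has near-zero $U_a$ and high $U_e$. Note that the additive form ties the two estimates together through the total, which is the kind of coupling that an orthogonality check is meant to expose.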
