Structured Credal Learning

Varun Venkatesh; Eyke Hüllermeier; Bernd Bischl; Mina Rezaei

Abstract

Real-world learning tasks often encounter uncertainty due to covariate shift and noisy or inconsistent labels. However, existing robust learning methods merge these effects into a single distributional uncertainty set. In this work, we introduce a novel structured credal learning framework that explicitly separates these two sources. Specifically, we derive geometric bounds on the total variation diameter of structured credal sets and demonstrate how this quantity decomposes into contributions from covariate shift and expected label disagreement. This decomposition reveals a gating effect: covariate shift modulates how much label disagreement contributes to the joint uncertainty, so that seemingly benign covariate shifts can substantially increase the effective uncertainty. We also establish finite-sample concentration bounds in a fixed-covariate regime and demonstrate that the diameter can be efficiently estimated. Lastly, we show that robust optimization over these structured credal sets reduces to a tractable discrete min-max problem, avoiding ad-hoc robustness parameters. Overall, our approach provides a principled and practical foundation for robust learning under combined covariate and label mechanism ambiguity.
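
As a reading aid, the decomposition described in the abstract can be illustrated with the standard triangle-inequality argument for product-form joints. This is a sketch under assumed notation, not necessarily the exact statement proved in the paper: the joint $P^{(i,j)} = P_X^{(i)} \otimes P_{Y|X}^{(j)}$ and the conditional $P_{Y|X=x}^{(j)}$ are symbols introduced here for illustration.

```latex
% Path 1: change the covariate distribution first, then the labeler.
\begin{align*}
d_{TV}\bigl(P^{(i,j)}, P^{(i',j')}\bigr)
  &\le d_{TV}\bigl(P^{(i,j)}, P^{(i',j)}\bigr)
     + d_{TV}\bigl(P^{(i',j)}, P^{(i',j')}\bigr) \\
  &= d_{TV}\bigl(P_X^{(i)}, P_X^{(i')}\bigr)
     + \mathbb{E}_{x \sim P_X^{(i')}}
       \Bigl[ d_{TV}\bigl(P_{Y|X=x}^{(j)}, P_{Y|X=x}^{(j')}\bigr) \Bigr].
\end{align*}
% Path 2 (labeler first, then covariates) gives the same bound with the
% expectation taken under P_X^{(i)}; the tighter of the two paths applies.
```

The second term weights label disagreement by covariate mass, which is one way to read the gating effect: disagreement contributes to the joint distance only where the covariate distribution places probability.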

Paper Structure (46 sections, 12 theorems, 94 equations, 3 figures, 10 tables)

Introduction
Related work
Structured Credal Learning Framework
Preliminary: Credal Learning Theory
Notation.
Covariate--Labeling Uncertainty
Structured Credal Set
Theoretical Analysis: Geometry of the Structured Credal Set
Total Variation Decomposition
Interpretation.
Interpretation.
Diameter Bounds for Product Credal Sets
The Pure Labeling Uncertainty Regime
Geometric Characterization: Diameter as Disagreement
The Observable Radius: From Hyperparameter to Statistical Estimation
...and 31 more sections

Key Result

Lemma 1

Let $\mathcal{P}$ be the structured credal set. Then:

Figures (3)

Figure 1: The Gating Effect. Illustration of the interaction between environment and labelers. We simulate two labelers with disjoint decision boundaries (sigmoid thresholds at $x=\pm 1$) and slide a Gaussian covariate window across the input space. While the covariate shift remains constant ($d_{TV}(P_X^{(i)}, P_X^{(i')}) \approx 0.38$, gray dashed line), the observable joint distributional distance (red solid line) spikes significantly when the environment concentrates probability mass in the region where labelers disagree. This demonstrates that the environment acts as a "gate", determining the visibility of labeler disagreement; seemingly benign covariate shifts can become catastrophic if they expose previously latent labeler disagreements. The blue line represents the theoretical upper bound derived in \ref{prp:general-bounds}.
Figure 2: Proposition \ref{prp:general-bounds} shows that joint distributional divergence results from the interaction between covariate shift and labeling disagreement, with bounds corresponding to two interpolation paths between feature and label changes. Label disagreement affects the joint distance only on regions with non-negligible covariate mass, yielding a gating effect in which covariate shift controls the visibility of labeling uncertainty. Consequently, training distributions may underestimate risk when deployment concentrates probability mass in high-disagreement regions, a coupling made explicit by the structured credal framework.
Figure 3: Empirical concentration behavior of estimation error under different data distributions and labeling mechanisms. Across settings, the observed error exhibits the expected theoretical scaling with sample size, and is compared against Hoeffding-type union bounds to highlight the gap between empirical performance and worst-case guarantees.
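
The simulation described in the Figure 1 caption can be approximated with a short numerical sketch. The caption does not state the window width, the shift between the two covariate distributions, or the sigmoid steepness, so the values below (unit variance, shift 1.0, steepness 10) and the function names are assumptions chosen for illustration. With unit variance and shift 1.0 the covariate distance is $2\Phi(0.5) - 1 \approx 0.383$, which is consistent with the $\approx 0.38$ quoted in the caption. The upper bound uses the generic two-path triangle-inequality form, which may differ from the exact bound in the paper.

```python
import numpy as np


def gaussian_pdf(x, mean, std=1.0):
    """Density of N(mean, std^2) evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gating_curve(centers, shift=1.0, sharpness=10.0):
    """Return rows (center, covariate TV, joint TV, upper bound).

    Assumed setup (not taken from the paper): binary labels, labeler j with
    P(Y=1|x) = sigmoid(sharpness * (x + 1)), labeler j' with
    P(Y=1|x) = sigmoid(sharpness * (x - 1)), and covariate windows
    N(c, 1) and N(c + shift, 1) for each center c.
    """
    x = np.linspace(-12.0, 12.0, 24001)  # integration grid
    dx = x[1] - x[0]
    a = sigmoid(sharpness * (x + 1.0))   # labeler j, threshold at x = -1
    b = sigmoid(sharpness * (x - 1.0))   # labeler j', threshold at x = +1
    label_tv = np.abs(a - b)             # TV between two Bernoulli laws at each x

    rows = []
    for c in centers:
        p = gaussian_pdf(x, c)           # covariate density of environment i
        q = gaussian_pdf(x, c + shift)   # covariate density of environment i'
        cov_tv = 0.5 * np.sum(np.abs(p - q)) * dx
        # Joint TV sums over both label values y in {1, 0}.
        joint_tv = 0.5 * np.sum(
            np.abs(p * a - q * b) + np.abs(p * (1 - a) - q * (1 - b))
        ) * dx
        # Two interpolation paths: expected disagreement under p or under q.
        gated = min(np.sum(p * label_tv) * dx, np.sum(q * label_tv) * dx)
        rows.append((c, cov_tv, joint_tv, cov_tv + gated))
    return np.array(rows)


if __name__ == "__main__":
    for c, cov, joint, upper in gating_curve(np.linspace(-6.0, 6.0, 13)):
        print(f"center={c:+.1f}  cov_tv={cov:.3f}  joint_tv={joint:.3f}  upper={upper:.3f}")
```

In this sketch the covariate distance stays flat across all window centers, while the joint distance rises when the windows sit between the two thresholds, where the labelers disagree.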

Theorems & Definitions (31)

Definition 1: Feature and Label Distributions
Definition 2: Joint Distribution
Definition 3: Structured Credal Set
Remark 1: Justification and Construction
Lemma 1: Extremal Supremum Property
Proposition 1: Label Disagreement Distance
Proposition 2: Environment Shift Distance
Theorem 4.1: General Distance Bounds
Definition 4: Component Diameters
Theorem 4.2: Diameter Decomposition
...and 21 more
