Generalising realisability in statistical learning theory under epistemic uncertainty

Fabio Cuzzolin

TL;DR

This paper investigates how core PAC guarantees and the realisability assumption in statistical learning theory extend when the train and test distributions are drawn from a convex credal set representing epistemic uncertainty. It revisits the classical bounds for the finite realisable case, extends them to infinite hypothesis spaces using uniform convergence, McDiarmid's inequality, and Rademacher complexity, and then sketches a credal generalisation based on the max-risk over the credal set and on a notion of uniform credal realisability. The paper derives a distribution-free bound in the finite realisable setting and outlines two plausible credal extensions, highlighting the challenges posed by distributional ambiguity and the need for new concentration tools. The discussion points to future work on random-set formalisms, concentration inequalities for credal models, and alternative learning frameworks that better capture epistemic uncertainty about the data-generating process.

Abstract

The purpose of this paper is to examine how central notions in statistical learning theory, such as realisability, generalise under the assumption that the train and test distributions are drawn from the same credal set, i.e., a convex set of probability distributions. This can be considered a first step towards a more general treatment of statistical learning under epistemic uncertainty.
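To make the notion of a credal set and of the max-risk over it more concrete, the following is a minimal sketch and not the paper's own construction. It assumes a finite sample space and a credal set given as the convex hull of finitely many extreme distributions; the names `expected_risk`, `max_risk`, `vertices` and `losses` are hypothetical. Because the expected risk is linear in the distribution, its maximum over a convex hull is attained at one of the extreme points, so it is enough to evaluate the vertices.

```python
def expected_risk(p, losses):
    """Expected loss of a fixed hypothesis under distribution p.

    p      -- probabilities over a finite sample space (assumed to sum to 1)
    losses -- loss of the hypothesis on each outcome, in the same order
    """
    return sum(pi * li for pi, li in zip(p, losses))


def max_risk(vertices, losses):
    """Worst-case expected risk over the convex hull of `vertices`.

    Expected risk is linear in p, so the maximum over the convex hull
    is reached at an extreme point; only the vertices are evaluated.
    """
    return max(expected_risk(p, losses) for p in vertices)


# Hypothetical toy example: three outcomes, two extreme distributions.
vertices = [
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
]
# Zero-one losses of one hypothesis: correct on the first outcome only.
losses = [0.0, 1.0, 1.0]

worst = max_risk(vertices, losses)

# Any mixture of the extreme distributions has risk no larger than `worst`.
lam = 0.4
mixture = [lam * a + (1 - lam) * b for a, b in zip(vertices[0], vertices[1])]
mixture_risk = expected_risk(mixture, losses)
```

Under these assumptions the worst case is driven by the second extreme distribution, which places most of its mass on outcomes the hypothesis misclassifies. A credal set with infinitely many extreme points would need a different treatment.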

Paper Structure (12 sections, 2 theorems, 38 equations)

Introduction
Probably Approximately Correct (PAC) learning
Results from statistical learning theory
Bounds for finite, realisable case
PAC bounds for finite model spaces
PAC bounds for infinite model spaces
Derivation
Symmetrisation and notion of "ghost" dataset
Rademacher's complexity
Bounds for realisable finite hypothesis classes
Bounds under credal generalisation: a sketch
Discussion and conclusions

Key Result

Theorem

Let $\mathcal{H}$ be a hypothesis class, where each hypothesis $h \in \mathcal{H}$ maps some $\mathcal{X}$ to $\mathcal{Y}$, $l$ be the zero-one loss: $l((x, y), h) = \mathbb{I}[y \neq h(x)]$, $p^*$ be any distribution over $\mathcal{X} \times \mathcal{Y}$ and $\hat{h}$ be the empirical risk minimis

Theorems & Definitions (5)

Definition
Theorem
Proof
Theorem
Proof
