From learnable objects to learnable random objects

Aaron Anderson, Michael Benedikt

TL;DR

This work establishes that learnability properties of a base hypothesis class are preserved when moving to associated statistical classes derived from distributions over either the range or the parameter space, under agnostic PAC and agnostic online learning. The authors develop a unified framework via measurable families and their expectation classes, drawing on model-theoretic randomization to connect base learnability with distribution-based hypotheses. They provide explicit sample-complexity bounds for both concept-valued and real-valued base classes, expressed in terms of base fat-shattering and GC dimensions, and show that preservation can fail in realizable settings. The results leverage mean-width techniques (Rademacher/Gaussian), stability notions, and sequential fat-shattering dimensions to translate base-class properties into bounds for the derived statistical classes. Overall, the paper bridges machine learning and model theory to quantify when and how learnability transfers across randomized constructions, while highlighting fundamental limitations in the realizable regime and suggesting robust online-learning approaches via alternative loss functions.

Abstract

We consider the relationship between learnability of a "base class" of functions on a set $X$ and learnability of a class of statistical functions derived from the base class. For example, we refine results showing that learnability of a family $h_p : p \in Y$ of functions implies learnability of the family of functions $h_\mu = \lambda p : Y.\, E_\mu(h_p)$, where $E_\mu$ is the expectation with respect to $\mu$, and $\mu$ ranges over probability distributions on $X$. We will look at both Probably Approximately Correct (PAC) learning, where example inputs and outputs are chosen at random, and online learning, where the examples are chosen adversarially. For agnostic learning, we establish improved bounds on the sample complexity of learning for statistical classes, stated in terms of combinatorial dimensions of the base class. We connect these problems to techniques introduced in model theory for "randomizing a structure". We also provide counterexamples for realizable learning, in both the PAC and online settings.
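To make the construction $h_\mu = \lambda p : Y.\, E_\mu(h_p)$ concrete, the following is a minimal Python sketch and is not taken from the paper. It assumes a finite set $X = \{0, 1, 2, 3\}$, a base class of threshold functions $h_p(x) = 1$ if $x \ge p$ and $0$ otherwise, and a finitely supported distribution $\mu$ on $X$. The names `threshold` and `expectation_hypothesis` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def threshold(p):
    """Base hypothesis h_p on X (illustrative assumption): h_p(x) = 1 if x >= p, else 0."""
    return lambda x: 1 if x >= p else 0


def expectation_hypothesis(mu):
    """Return h_mu, the function p -> E_{x ~ mu}[h_p(x)].

    mu is a dict mapping points of X to probabilities that sum to 1,
    i.e. a finitely supported probability distribution on X.
    """
    if sum(mu.values()) != 1:
        raise ValueError("mu must be a probability distribution")
    # The expectation is a finite weighted sum because mu has finite support.
    return lambda p: sum(weight * threshold(p)(x) for x, weight in mu.items())


# Uniform distribution on X = {0, 1, 2, 3}, using exact rational arithmetic.
mu = {x: Fraction(1, 4) for x in range(4)}
h_mu = expectation_hypothesis(mu)

# Evaluate the statistical hypothesis h_mu at parameters p = 0, ..., 4.
values = [h_mu(p) for p in range(5)]
print(values)
```

In this toy case each $h_\mu$ is a function on the parameter set $Y$ with values in $[0, 1]$, even though every base hypothesis $h_p$ takes only the values 0 and 1. This is one reason real-valued notions such as fat-shattering appear when bounding the complexity of the derived class.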

Paper Structure

This paper contains 28 sections, 44 theorems, 107 equations, 1 table.

Key Result

Proposition 3

A function class is agnostic online learnable exactly when its dual class is.

Theorems & Definitions (73)

  • Definition 1: Fat shattering
  • Definition 2: Sequential fat-shattering dimension and Littlestone dimension
  • Proposition 3
  • Definition 4: Distribution class and Dual Distribution class
  • Definition 5: Measurable family of hypothesis classes
  • Definition 6: Expectation class of a measurable family of hypothesis classes
  • Definition 7: $X$-valued random variables
  • Definition 8: Compatibility with a hypothesis class; Randomized version of a hypothesis class
  • Definition 9: Parameter randomized version of a hypothesis class
  • Definition 10: Parameter randomized expectation class
  • ...and 63 more