Which Algorithms Have Tight Generalization Bounds?

Michael Gastpar, Ido Nachum, Jonathan Shafer, Thomas Weinberger

TL;DR

The paper investigates when tight algorithm-dependent generalization bounds exist by formalizing estimability and studying overparameterized settings. It proves inestimability results for inductive biases toward VC classes and toward nearly-orthogonal function families, showing that distribution-free estimators can fail to approximate the population loss in these regimes. It then identifies sufficient conditions for estimability via algorithm stability and gives a simple variance-based condition that is both necessary and sufficient for estimability. The work clarifies why many classical generalization bounds are vacuous for modern models and points to principled ways of deriving tight, algorithm-dependent bounds grounded in stability and loss-variance properties.

Abstract

We study which machine learning algorithms have tight generalization bounds. First, we present conditions that preclude the existence of tight generalization bounds. Specifically, we show that algorithms that have certain inductive biases that cause them to be unstable do not admit tight generalization bounds. Next, we show that algorithms that are sufficiently stable do have tight generalization bounds. We conclude with a simple characterization that relates the existence of tight generalization bounds to the conditional variance of the algorithm's loss.
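To make the question concrete, the sketch below compares training loss and held-out loss for two toy learners in the spirit of the paper's "Constant algorithms are estimable" and "Memorization" examples. It is an illustration only, not the paper's construction: the domain size `N_BITS`, the parity-style `label` function, and all helper names are assumptions introduced here, and training loss is just one possible estimator, so its failure for the memorizer does not by itself establish inestimability.

```python
import random

N_BITS = 20  # hypothetical domain: integers in [0, 2**N_BITS)


def label(x):
    # Illustrative target: parity of the bits of x, encoded as +1 / -1.
    return 1 if bin(x).count("1") % 2 == 0 else -1


def draw_sample(m, rng):
    # Draw m labeled points with x uniform over the domain.
    xs = [rng.randrange(2 ** N_BITS) for _ in range(m)]
    return [(x, label(x)) for x in xs]


def zero_one_loss(h, sample):
    # Fraction of points in the sample that h misclassifies.
    return sum(h(x) != y for x, y in sample) / len(sample)


def constant_algorithm(train):
    # Ignores the training data and always predicts +1.
    return lambda x: 1


def memorizing_algorithm(train):
    # Stores the training labels and predicts +1 on unseen points.
    table = dict(train)
    return lambda x: table.get(x, 1)


def train_and_test_loss(algorithm, m=200, n_test=20000, seed=0):
    # Returns (training loss, Monte Carlo estimate of population loss).
    rng = random.Random(seed)
    train = draw_sample(m, rng)
    h = algorithm(train)
    return zero_one_loss(h, train), zero_one_loss(h, draw_sample(n_test, rng))


const_train, const_test = train_and_test_loss(constant_algorithm)
memo_train, memo_test = train_and_test_loss(memorizing_algorithm)
print(f"constant:  train={const_train:.3f}  test={const_test:.3f}")
print(f"memorizer: train={memo_train:.3f}  test={memo_test:.3f}")
```

For the constant learner the output hypothesis does not depend on the sample, so the training loss tracks the held-out loss. The memorizer fits the training set exactly, so its training loss carries no information about how it performs on fresh points.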

Paper Structure (21 sections, 10 theorems, 66 equations, 4 figures)

Key Result

Theorem 1

Let $\mathcal{H} \subseteq \{\pm 1\}^\mathcal{X}$ be a hypothesis class with VC dimension $d$ large enough, and let $m \leq \sqrt{d}/10$. Then there exists a subset $\mathcal{F} \subseteq \mathcal{H}$ and corresponding realizable distributions $\mathbb{D}$ such that any learning rule that has an ind

Figures (4)

  • Figure 1: MNIST
  • Figure 2: FashionMNIST
  • Figure 3: CIFAR10
  • Figure 4: CIFAR10 with random labels

Theorems & Definitions (40)

  • Definition 1.2: Estimability
  • Definition 1.4: Overparameterized setting
  • Example 1.5: Perfect learnability does not imply perfect estimability
  • Example 1.6: Constant algorithms are estimable
  • Example 1.7: Memorization
  • Example 1.8: Most algorithms are estimable
  • Example 1.9: Parity functions
  • Theorem: Informal version of \ref{theorem:vc-class-estimability-bound}
  • Theorem: Informal version of \ref{theorem:orthogonal-functions}
  • Theorem: Informal version of \ref{theorem:stablity-implies-estimability}
  • ...and 30 more