The Sample Complexity of Replicable Realizable PAC Learning

Kasper Green Larsen; Markus Engelund Mathiasen; Chirag Pabbaraju; Clement Svendsen

TL;DR

A particularly hard learning problem is constructed, and a sample complexity lower bound is shown whose dependence on the size of the hypothesis class $H$ is close to $(\log|H|)^{3/2}$.

Abstract

In this paper, we consider the problem of replicable realizable PAC learning. We construct a particularly hard learning problem and show a sample complexity lower bound whose dependence on the size of the hypothesis class $H$ is close to $(\log|H|)^{3/2}$. Our proof uses several novel techniques: it defines a particular Cayley graph associated with $H$ and analyzes a suitable random walk on this graph by examining the spectral properties of its adjacency matrix. Furthermore, we show an almost matching upper bound for the lower bound instance, so any stronger lower bound would have to come from a different instance of the problem.

Paper Structure (25 sections, 31 theorems, 125 equations, 1 figure)

Introduction
Main Results
Open Problems.
Further Related Work
Notation
Technical Overview
Lower Bound
Step 1.
Step 2.
Step 3.
Upper Bound
Proof of the Lower Bound
Random Step Approach
Part 1.
Part 2.
...and 10 more sections

Key Result

Theorem 1.2

For any integer $d \geq 10^{11}$ and positive reals $\varepsilon,\delta,\rho \leq 10^{-4}$, there exists a domain ${\mathcal{X}}$ and a hypothesis class ${\mathcal{H}} \subseteq\{0, 1\}^{\mathcal{X}}$ with VC-dimension $d$, such that for any algorithm ${\mathcal{A}}$ there is a distribution ${\mathcal{D}}$ [...] labeled samples from ${\mathcal{D}}$ in order to be a $\rho$-replicable PAC learner for ${\mathcal{H}}$ [...]

Figures (1)

Figure 1: Example with $k = 7$ for hypothesis $h_i$ for $i = (0, 5, 2)$. Each interval shows which values of $b$ will make $h_i((a, b)) = 1$. For instance, if $a = 1$ then $h_i((a, b)) = 1$ for $b \in \{0, 5, 6\}$.
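
The caption's example can be reproduced with a small sketch. The caption only confirms the row $a = 1$; the rule used here (a cyclic window of three consecutive values of $b$, starting at the $a$-th coordinate of $i$ and wrapping modulo $k$) is an assumption inferred from that single row, and the names `make_hypothesis` and `WIDTH` are hypothetical. The paper's actual definition of $h_i$ may differ.

```python
from typing import Callable, Tuple

K = 7      # modulus k from the figure caption
WIDTH = 3  # ASSUMPTION: window length, inferred from the one example {5, 6, 0}


def make_hypothesis(i: Tuple[int, ...], k: int = K,
                    width: int = WIDTH) -> Callable[[Tuple[int, int]], int]:
    """Return a hypothesis h_i that maps a point (a, b) to 0 or 1.

    ASSUMPTION: h_i((a, b)) = 1 exactly when b lies in the cyclic window
    {i[a], i[a] + 1, ..., i[a] + width - 1} taken modulo k.
    """
    def h(point: Tuple[int, int]) -> int:
        a, b = point
        # Offset of b from the window start, wrapped around modulo k.
        return 1 if (b - i[a]) % k < width else 0
    return h


h = make_hypothesis((0, 5, 2))

# For each a, list the values of b that the hypothesis labels 1.
accepted = {a: [b for b in range(K) if h((a, b)) == 1] for a in range(3)}
print(accepted[1])  # the row stated in the caption
```

Under this assumed rule, the row $a = 1$ matches the caption's set $\{0, 5, 6\}$; the rows for $a = 0$ and $a = 2$ follow from the assumption and are not stated in the caption.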

Theorems & Definitions (55)

Definition 1.1: $\rho$-replicability
Theorem 1.2: Replicable Learning Lower Bound
Theorem 1.3: Replicable Learning Upper Bound
Theorem 3.1: Replicable Learning Lower Bound
Definition 3.1: Mode
Lemma 3.1
Theorem 3.2
Lemma 3.2
Lemma 3.3: Random step
Lemma 3.3
...and 45 more
