On the Computability of Multiclass PAC Learning

Pascale Gourdeau, Tosca Lechner, Ruth Urner

TL;DR

This paper studies computable multiclass PAC (CPAC) learnability for finite label spaces. It defines the computable Natarajan dimension $c\text{-}\mathsf{N}(\mathcal{H})$ and the computable graph dimension $c\text{-}\mathsf{G}(\mathcal{H})$, and proves that finiteness of these dimensions is necessary (and, when $|\mathcal Y|<\infty$, sufficient) for CPAC learnability. A more general framework of computable $\Psi$-dimensions unifies several multiclass dimensions and yields a meta-characterization: for a distinguisher $\Psi$, a class is CPAC learnable if and only if the corresponding $c\text{-}\Psi\text{-dim}$ is finite. The DS dimension, which characterizes PAC learnability for infinite label spaces, cannot be expressed as a distinguisher even when the label space is finite, so it falls outside this framework. Together, these results relate computable dimensions to learnability and mark the boundary between what is learnable by a computable learner and what is learnable only in the information-theoretic sense.

Abstract

We study the problem of computable multiclass learnability within the Probably Approximately Correct (PAC) learning framework of Valiant (1984). In the recently introduced computable PAC (CPAC) learning framework of Agarwal et al. (2020), both learners and the functions they output are required to be computable. We focus on the case of finite label space and start by proposing a computable version of the Natarajan dimension and showing that it characterizes CPAC learnability in this setting. We further generalize this result by establishing a meta-characterization of CPAC learnability for a certain family of dimensions: computable distinguishers. Distinguishers were defined by Ben-David et al. (1992) as a certain family of embeddings of the label space, with each embedding giving rise to a dimension. It was shown that the finiteness of each such dimension characterizes multiclass PAC learnability for finite label space in the non-computable setting. We show that the corresponding computable dimensions for distinguishers characterize CPAC learning. We conclude our analysis by proving that the DS dimension, which characterizes PAC learnability for infinite label space, cannot be expressed as a distinguisher (even in the case of finite label space).

Paper Structure

This paper contains 22 sections, 16 theorems, 10 equations.

Key Result

Proposition 10

Let $\mathcal{H}$ be recursively enumerably representable (RER). If $\mathsf{G}(\mathcal{H})<\infty$ or $\mathsf{N}(\mathcal{H})\log(|\mathcal{Y}|)<\infty$, then $\mathcal{H}$ is properly CPAC learnable in the realizable setting.
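
A common way to obtain a proper learner for an enumerable class in the realizable setting is to enumerate hypotheses until one is consistent with the sample. The sketch below illustrates that idea only; it is not taken from the paper's proof, and it does not model the sample-size bound that the finite dimension would supply. The three-label class `enumerate_class` and the helper `first_consistent` are hypothetical names introduced for illustration.

```python
from itertools import count
from typing import Callable, Iterable, Iterator, Sequence, Tuple

Hypothesis = Callable[[int], int]
Params = Tuple[int, int]


def enumerate_class() -> Iterator[Tuple[Params, Hypothesis]]:
    """Enumerate a toy class over the naturals with labels {0, 1, 2}.

    Assumption for illustration: h_{a,b}(x) = 0 if x < a, 1 if a <= x < b,
    and 2 otherwise, for all pairs a <= b. The enumeration lists every pair.
    """
    for b in count():
        for a in range(b + 1):
            yield (a, b), (lambda x, a=a, b=b: 0 if x < a else (1 if x < b else 2))


def first_consistent(
    sample: Sequence[Tuple[int, int]],
    hypotheses: Iterable[Tuple[Params, Hypothesis]],
) -> Tuple[Params, Hypothesis]:
    """Return the first enumerated hypothesis with zero error on the sample.

    The loop halts whenever the sample is realizable by the class, because a
    consistent hypothesis then appears at some finite position in the
    enumeration. On a non-realizable sample it may run forever.
    """
    for params, h in hypotheses:
        if all(h(x) == y for x, y in sample):
            return params, h
    raise ValueError("enumeration ended without a consistent hypothesis")


# A realizable sample of (point, label) pairs.
sample = [(0, 0), (3, 1), (7, 2), (5, 1)]
params, h = first_consistent(sample, enumerate_class())
```

The output is always a member of the enumerated class, which is what makes the learner proper. The search relies on realizability for termination, which is why this sketch is restricted to the realizable setting.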

Theorems & Definitions (43)

  • Definition 1: Agnostic PAC learnability
  • Definition 2: Proper vs improper learning
  • Definition 3: Computable Representation (Agarwal et al., 2020)
  • Definition 4: CPAC Learnability (Agarwal et al., 2020)
  • Definition 5: VC dimension (Vapnik and Chervonenkis, 1971)
  • Definition 6: Natarajan dimension (Natarajan, 1989)
  • Definition 7: Graph dimension (Natarajan, 1989)
  • Definition 8: Pseudo-cube
  • Definition 9: DS dimension (Daniely and Shalev-Shwartz, 2014)
  • Proposition 10
  • ...and 33 more
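
For a finite class over a finite domain, the classical Natarajan and graph dimensions listed above can be computed by brute force, which may help make the two notions of shattering concrete. The sketch below uses the standard textbook definitions, not the computable variants studied in the paper. Hypotheses are represented as tuples of labels indexed by domain point, and the function names are hypothetical.

```python
from itertools import combinations, product


def n_shatters(H, S):
    """Natarajan shattering: there exist f0, f1 that differ at every point of S
    such that every mix of f0 and f1 over S is realized by some h in H."""
    patterns = {tuple(h[x] for x in S) for h in H}
    labels_at = [sorted({p[i] for p in patterns}) for i in range(len(S))]
    # Candidate (f0(x), f1(x)) pairs at each point, restricted to realized labels.
    pair_choices = [[(a, b) for a in ls for b in ls if a != b] for ls in labels_at]
    for pairs in product(*pair_choices):
        if all(
            tuple(pairs[i][bit] for i, bit in enumerate(bits)) in patterns
            for bits in product((0, 1), repeat=len(S))
        ):
            return True
    return False


def g_shatters(H, S):
    """Graph shattering: there exists f such that every agree/disagree pattern
    with f over S is realized by some h in H."""
    patterns = {tuple(h[x] for x in S) for h in H}
    # f must itself be realized (the all-agree pattern), so search patterns only.
    for f in patterns:
        agreement = {tuple(p[i] == f[i] for i in range(len(S))) for p in patterns}
        if len(agreement) == 2 ** len(S):
            return True
    return False


def dimension(H, n_points, shatters):
    """Largest size of a shattered subset of {0, ..., n_points - 1}."""
    best = 0
    for k in range(1, n_points + 1):
        if any(shatters(H, S) for S in combinations(range(n_points), k)):
            best = k
        else:
            break  # subsets of shattered sets are shattered, so stop early
    return best


# All functions from two points to labels {0, 1, 2}.
full = list(product(range(3), repeat=2))
# A small class whose graph dimension exceeds its Natarajan dimension.
gap = [(0, 0), (0, 1), (1, 0), (1, 2)]
```

On `gap`, the labeling `f = (0, 0)` witnesses graph shattering of both points, while no pair `f0`, `f1` witnesses Natarajan shattering. This is an instance of the general inequality $\mathsf{N}(\mathcal{H}) \le \mathsf{G}(\mathcal{H})$.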