
The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem

Guy Blanc, Alexandre Hayderi, Caleb Koch, Li-Yang Tan

TL;DR

The sample complexity of smooth boosting is studied: a class is exhibited that can be weak learned to $\gamma$-advantage over smooth distributions with $m$ samples, but for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^{2})\cdot m$ samples.

Abstract

Smooth boosters generate distributions that do not place too much weight on any given example. Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory. We study and settle the sample complexity of smooth boosting: we exhibit a class that can be weak learned to $\gamma$-advantage over smooth distributions with $m$ samples, for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^2)\cdot m$ samples. This matches the overhead of existing smooth boosters and provides the first separation from the setting of distribution-independent boosting, for which the corresponding overhead is $O(1/\gamma)$. Our work also sheds new light on Impagliazzo's hardcore theorem from complexity theory, all known proofs of which can be cast in the framework of smooth boosting. For a function $f$ that is mildly hard against size-$s$ circuits, the hardcore theorem provides a set of inputs on which $f$ is extremely hard against size-$s'$ circuits. A downside of this important result is the loss in circuit size, i.e. that $s' \ll s$. Answering a question of Trevisan, we show that this size loss is necessary and in fact, the parameters achieved by known proofs are the best possible.
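
To make the two quantities in the abstract concrete, here is a minimal Python sketch. It assumes one common convention for smoothness, namely that a distribution $D$ over $n$ examples is $\kappa$-smooth when $D(x) \le \kappa/n$ for every example $x$; the paper's exact definition may differ. The function names (`smoothness`, `is_smooth`, `sample_budget`) are hypothetical and introduced only for illustration, and the budget calculation drops all constants and logarithmic factors hidden by $\tilde{\Omega}$ and $O$.

```python
def smoothness(weights):
    """Ratio of the largest normalized weight to the uniform weight 1/n.

    Assumed convention: a distribution is kappa-smooth if this ratio is
    at most kappa. The uniform distribution has smoothness 1.
    """
    total = sum(weights)
    n = len(weights)
    return max(weights) * n / total


def is_smooth(weights, kappa):
    """Check the assumed kappa-smoothness condition D(x) <= kappa / n."""
    return smoothness(weights) <= kappa


def sample_budget(m, gamma, smooth=True):
    """Leading-order sample count for strong learning.

    Uses the overheads quoted in the abstract: about (1/gamma^2) * m for
    smooth boosting and about (1/gamma) * m for distribution-independent
    boosting. Constants and log factors are ignored.
    """
    overhead = 1 / gamma**2 if smooth else 1 / gamma
    return overhead * m


if __name__ == "__main__":
    print(smoothness([1, 1, 1, 1]))       # uniform weights
    print(is_smooth([3, 1, 0, 0], 2))     # one example carries 3/4 of the mass
    print(sample_budget(100, 0.25, smooth=True))
    print(sample_budget(100, 0.25, smooth=False))
```

Under these assumptions, a weak learner that uses $m = 100$ samples at advantage $\gamma = 0.25$ gives a leading-order budget of 1600 samples in the smooth setting versus 400 in the distribution-independent setting, which is the extra $1/\gamma$ factor the separation refers to.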

Paper Structure (44 sections, 27 theorems, 141 equations, 1 figure)

Key Result

Theorem 1

For any sample size $m$ and parameter $\gamma$, there exists a concept class $\mathcal{C}$ such that:

Figures (1)

  • Figure 1: Our algorithm for weak learning $\mathcal{C}$.

Theorems & Definitions (83)

  • Theorem 1
  • Remark 2.1: A computational-statistical gap for distribution-independent boosting?
  • Theorem 2
  • Definition 1: Junta complexity
  • Claim 3.1: Tightness of the hardcore theorem for juntas
  • Definition 2: Lifted class
  • Theorem 3: Lifting junta complexity to circuit complexity
  • Remark 3.1: Contrast with Uhlig's mass production theorem
  • Definition 3: $\alpha$-correlated-distance and error
  • Definition 4: Soft junta complexity
  • ...and 73 more