The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem

Guy Blanc; Alexandre Hayderi; Caleb Koch; Li-Yang Tan

TL;DR

We study the sample complexity of smooth boosting and exhibit a class that can be weak learned to $\gamma$-advantage over smooth distributions with $m$ samples, for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^{2}) \cdot m$ samples.

Abstract

Smooth boosters generate distributions that do not place too much weight on any given example. Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory. We study and settle the sample complexity of smooth boosting: we exhibit a class that can be weak learned to $\gamma$-advantage over smooth distributions with $m$ samples, for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^2) \cdot m$ samples. This matches the overhead of existing smooth boosters and provides the first separation from the setting of distribution-independent boosting, for which the corresponding overhead is $O(1/\gamma)$. Our work also sheds new light on Impagliazzo's hardcore theorem from complexity theory, all known proofs of which can be cast in the framework of smooth boosting. For a function $f$ that is mildly hard against size-$s$ circuits, the hardcore theorem provides a set of inputs on which $f$ is extremely hard against size-$s'$ circuits. A downside of this important result is the loss in circuit size, i.e., that $s' \ll s$. Answering a question of Trevisan, we show that this size loss is necessary and that, in fact, the parameters achieved by known proofs are the best possible.
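To make the two notions in the abstract concrete, the following is a minimal illustrative sketch, not the paper's construction. It assumes one common convention for smoothness (a distribution over $n$ examples is $\kappa$-smooth if no example receives probability more than $\kappa/n$); the paper's exact definition may differ. The function names `is_smooth` and `sample_overhead` are hypothetical, and the overhead function ignores constants and logarithmic factors.

```python
from fractions import Fraction


def is_smooth(weights, kappa):
    """Check kappa-smoothness of the distribution obtained by normalizing `weights`.

    Assumed convention (not taken from the paper): D is kappa-smooth over
    n examples if D(x) <= kappa / n for every example x.
    """
    total = sum(weights)
    n = len(weights)
    bound = Fraction(kappa) / n
    return all(Fraction(w) / total <= bound for w in weights)


def sample_overhead(gamma, smooth=True):
    """Multiplicative sample overhead of boosting a gamma-advantage weak learner.

    Per the abstract: roughly 1/gamma^2 for smooth boosting and 1/gamma for
    distribution-independent boosting (constants and log factors dropped).
    """
    return 1 / gamma**2 if smooth else 1 / gamma


if __name__ == "__main__":
    gamma = Fraction(1, 10)
    print(is_smooth([3, 1, 1, 1], kappa=2))       # heaviest example has mass 1/2 = 2/4
    print(sample_overhead(gamma, smooth=True))    # smooth boosting overhead
    print(sample_overhead(gamma, smooth=False))   # distribution-independent overhead
```

With $\gamma = 1/10$, the sketch illustrates the gap the abstract describes: an overhead on the order of $100$ for smooth boosting against $10$ for distribution-independent boosting.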

Paper Structure (44 sections, 27 theorems, 141 equations, 1 figure)

Introduction
Smooth boosting.
This work
First result: The sample complexity of smooth boosting
Separating the sample complexities of smooth and distribution-independent boosting.
Second result: Tightness of Impagliazzo's hardcore theorem
Size loss and smooth boosting.
Relationship between Theorems 1 and 2.
Other related work
Proof overview for Theorem 2: Tightness of the hardcore theorem
Tightness of the hardcore theorem for junta complexity
Lifting junta complexity to circuit complexity
Soft junta complexity
Lower bound in terms of soft junta complexity
Relating soft and standard junta complexity
...and 29 more sections

Key Result

Theorem 1

For any sample size $m$ and parameter $\gamma$, there exists a concept class $\mathcal{C}$ such that:

Figures (1)

Figure 1: Our algorithm for weak learning $\mathcal{C}$.

Theorems & Definitions (83)

Theorem 1
Remark 2.1: A computational-statistical gap for distribution-independent boosting?
Theorem 2
Definition 1: Junta complexity
Claim 3.1: Tightness of the hardcore theorem for juntas
Definition 2: Lifted class
Theorem 3: Lifting junta complexity to circuit complexity
Remark 3.1: Contrast with Uhlig's mass production theorem
Definition 3: $\alpha$-correlated-distance and error
Definition 4: Soft junta complexity
...and 73 more
