Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness

Moïse Blanchard, Abhishek Shetty, Alexander Rakhlin

TL;DR

Generalized smoothness characterizes both online and private learnability under distributional constraints, giving a nearly complete understanding of learnability under distributional adversaries.

Abstract

Understanding the minimal assumptions that enable learning and generalization is perhaps the central question of learning theory. Several celebrated results in statistical learning theory, such as the VC theorem and Littlestone's characterization of online learnability, establish conditions on the hypothesis class that allow for learning under independent data and adversarial data, respectively. Building upon recent work bridging these extremes, we study sequential decision making under distributional adversaries that can adaptively choose data-generating distributions from a fixed family $\mathcal{U}$, and ask when such problems are learnable with sample complexity that behaves like the favorable independent case. We provide a nearly complete characterization of families $\mathcal{U}$ that admit learnability in terms of a notion known as generalized smoothness, i.e., a distribution family admits VC-dimension-dependent regret bounds for every finite-VC hypothesis class if and only if it is generalized smooth. Further, we give universal algorithms that achieve low regret under any generalized smooth adversary without explicit knowledge of $\mathcal{U}$. Finally, when $\mathcal{U}$ is known, we provide refined bounds in terms of a combinatorial parameter, the fragmentation number, that captures how many disjoint regions can carry nontrivial mass under $\mathcal{U}$. These results provide a nearly complete understanding of learnability under distributional adversaries. In addition, building upon the surprising connection between online learning and differential privacy, we show that generalized smoothness also characterizes private learnability under distributional constraints.

Paper Structure (23 sections, 21 theorems, 74 equations)

Key Result

Lemma 1

Let $\mathcal{U}$ be a weakly threshold learnable distribution class on $\mathcal{X}$. Then, $\mu_\mathcal{U}$ is a continuous submeasure.
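
This excerpt does not define $\mu_\mathcal{U}$ or reproduce Definition 6, so the following is only a sketch of one natural reading. It assumes that $\mu_\mathcal{U}$ is the upper envelope of the class (the largest mass any distribution in $\mathcal{U}$ assigns to a set) and uses the textbook notion of a continuous submeasure; the paper's own definitions may differ in detail.

```latex
% Assumption: \mu_{\mathcal{U}} is the upper envelope of the class (not defined in this excerpt).
\mu_{\mathcal{U}}(A) \;=\; \sup_{\mu \in \mathcal{U}} \mu(A),
\qquad A \subseteq \mathcal{X} \text{ measurable}.

% Textbook submeasure axioms (Definition 6 in the paper may be stated differently):
\mu_{\mathcal{U}}(\emptyset) = 0, \qquad
A \subseteq B \;\Rightarrow\; \mu_{\mathcal{U}}(A) \le \mu_{\mathcal{U}}(B), \qquad
\mu_{\mathcal{U}}(A \cup B) \le \mu_{\mathcal{U}}(A) + \mu_{\mathcal{U}}(B).

% Continuity (from above at the empty set):
A_1 \supseteq A_2 \supseteq \cdots, \quad \bigcap_{n} A_n = \emptyset
\;\Longrightarrow\; \lim_{n \to \infty} \mu_{\mathcal{U}}(A_n) = 0.
```

Under this reading the three axioms hold for any class, since a supremum of probability measures is monotone and subadditive, so the continuity condition is the part that carries content. For instance, the class of all point masses on $\mathbb{N}$ fails it: the tails $A_n = \{n, n+1, \dots\}$ decrease to the empty set, yet each has envelope mass 1.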

Theorems & Definitions (31)

  • Definition 1: VC dimension
  • Definition 2: Weak VC Learnability
  • Definition 3: Strong VC Learnability
  • Definition 4: Generalized smoothed distribution class
  • Definition 5: Generalized thresholds
  • Definition 6: Continuous Submeasure
  • Lemma 1
  • Lemma 2
  • Theorem 3
  • Definition 7: Uniform Cover
  • ...and 21 more