
Agnostic learning in (almost) optimal time via Gaussian surface area

Lucas Pesenti, Lucas Slot, Manuel Wiedmer

TL;DR

The proof relies on a direct analogue of a construction of Feldman et al. (2020), who considered $L_1$-approximation on the Boolean hypercube, and yields (near) optimal bounds on the complexity of agnostically learning polynomial threshold functions in the statistical query model.

Abstract

The complexity of learning a concept class under Gaussian marginals in the difficult agnostic model is closely related to its $L_1$-approximability by low-degree polynomials. For any concept class with Gaussian surface area at most $\Gamma$, Klivans et al. (2008) show that degree $d = O(\Gamma^2 / \varepsilon^4)$ suffices to achieve an $\varepsilon$-approximation. This leads to the best-known bounds on the complexity of learning a variety of concept classes. In this note, we improve their analysis by showing that degree $d = \tilde{O}(\Gamma^2 / \varepsilon^2)$ is enough. In light of lower bounds due to Diakonikolas et al. (2021), this yields (near) optimal bounds on the complexity of agnostically learning polynomial threshold functions in the statistical query model. Our proof relies on a direct analogue of a construction of Feldman et al. (2020), who considered $L_1$-approximation on the Boolean hypercube.
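
To get a feel for the gap between the two degree bounds, the following minimal sketch evaluates both expressions numerically. It assumes that all hidden constants equal 1 and that the logarithm is natural, neither of which is stated in the abstract; the function names `degree_klivans` and `degree_improved` are hypothetical and chosen only for illustration.

```python
import math


def degree_klivans(gamma: float, eps: float) -> float:
    """Degree bound Gamma^2 / eps^4 (hidden constant assumed to be 1)."""
    return gamma**2 / eps**4


def degree_improved(gamma: float, eps: float) -> float:
    """Degree bound log(1/eps) * Gamma^2 / eps^2 (hidden constant assumed to be 1)."""
    return math.log(1.0 / eps) * gamma**2 / eps**2


if __name__ == "__main__":
    gamma = 1.0  # illustrative value of the Gaussian surface area bound
    for eps in (0.2, 0.1, 0.05, 0.01):
        old = degree_klivans(gamma, eps)
        new = degree_improved(gamma, eps)
        # Under these assumptions the ratio new/old equals eps^2 * log(1/eps).
        print(f"eps={eps:<5} old={old:12.1f} new={new:12.1f} ratio={new / old:.4f}")
```

Under these assumptions the ratio of the two bounds is $\varepsilon^2 \log(1/\varepsilon)$, which does not depend on $\Gamma$ and shrinks as $\varepsilon \to 0$.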

Paper Structure (18 sections, 18 theorems, 92 equations, 1 table)

Key Result

Theorem 1.1

Let $f: \mathbb{R}^n \to \{\pm 1\}$ be a (measurable) function. For every $\varepsilon > 0$, there exists a polynomial $p$ of degree $d \leq O(\log(1/\varepsilon) \cdot \mathrm{GSA}(f)^2/\varepsilon^2)$ with $\mathbb{E}_{x \sim \mathcal{N}_n}\left[ {|f(x) - p(x)|} \right]\leq \varepsilon$.
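
The quantity $\mathbb{E}_{x \sim \mathcal{N}_n}[|f(x) - p(x)|]$ can be estimated numerically in simple cases. The sketch below does so in one dimension for $f(x) = \mathrm{sign}(x)$, taking $p$ to be the degree-$d$ truncation of the Hermite expansion of $f$. This truncation is an assumption made purely for illustration: it is the best $L_2$ approximation, not the polynomial constructed in the paper, and it need not attain the degree bound of the theorem. The grid parameters and the helper names `gauss_mean`, `hermite_truncation` and `l1_error` are likewise hypothetical.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He

# Dense grid for numerical integration against the standard Gaussian density.
x = np.linspace(-12.0, 12.0, 200001)
dx = x[1] - x[0]
phi = np.exp(-x**2 / 2.0) / math.sqrt(2.0 * math.pi)
f = np.sign(x)  # one-dimensional halfspace indicator with values in {-1, +1}


def gauss_mean(values: np.ndarray) -> float:
    """Riemann-sum estimate of E[values(x)] for x ~ N(0, 1)."""
    return float(np.sum(values * phi) * dx)


def hermite_truncation(d: int) -> np.ndarray:
    """Coefficients a_0..a_d of f in the probabilists' Hermite basis He_k."""
    coeffs = np.zeros(d + 1)
    for k in range(d + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0  # selects He_k
        # E[He_k^2] = k!, hence a_k = E[f * He_k] / k!
        coeffs[k] = gauss_mean(f * He.hermeval(x, basis)) / math.factorial(k)
    return coeffs


def l1_error(d: int) -> float:
    """Estimate of E|f - p_d|, where p_d is the degree-d Hermite truncation."""
    p = He.hermeval(x, hermite_truncation(d))
    return gauss_mean(np.abs(f - p))


if __name__ == "__main__":
    for d in (0, 1, 3, 5, 9):
        print(d, l1_error(d))
```

By the Cauchy-Schwarz inequality, the $L_1$ error of the truncation is at most its $L_2$ error, so this gives an upper bound on the best achievable $L_1$ error at each degree, though generally not a tight one.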

Theorems & Definitions (36)

  • Theorem 1.1
  • Proposition 1.1
  • Corollary 1.2
  • Theorem 1.3 (Diakonikolas et al., 2021)
  • Lemma 2.1
  • proof
  • Definition 2.2 (cf. Klivans et al., 2008)
  • Lemma 2.3
  • Definition 2.4 (cf. Klivans et al., 2008)
  • Lemma 2.5 (Klivans et al., 2008)
  • ...and 26 more