Reliable Learning of Halfspaces under Gaussian Marginals

Ilias Diakonikolas, Lisheng Ren, Nikos Zarifis

TL;DR

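A new algorithm for reliable learning of Gaussian halfspaces on $\mathbb{R}^d$ with sample and computational complexity $d^{O(\log (\min\{1/\alpha, 1/\epsilon\}))}\min (2^{\log(1/\epsilon)^{O(\log (1/\alpha))}},2^{\mathrm{poly}(1/\epsilon)})$, together with a Statistical Query lower bound suggesting that the $d^{\Omega(\log (1/\alpha))}$ dependence is best possible.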

Abstract

We study the problem of PAC learning halfspaces in the reliable agnostic model of Kalai et al. (2012). The reliable PAC model captures learning scenarios where one type of error is costlier than the others. Our main positive result is a new algorithm for reliable learning of Gaussian halfspaces on $\mathbb{R}^d$ with sample and computational complexity $$d^{O(\log (\min\{1/\alpha, 1/\epsilon\}))}\min (2^{\log(1/\epsilon)^{O(\log (1/\alpha))}},2^{\mathrm{poly}(1/\epsilon)})\;,$$ where $\epsilon$ is the excess error and $\alpha$ is the bias of the optimal halfspace. We complement our upper bound with a Statistical Query lower bound suggesting that the $d^{\Omega(\log (1/\alpha))}$ dependence is best possible. Conceptually, our results imply a strong computational separation between reliable agnostic learning and standard agnostic learning of halfspaces in the Gaussian setting.
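The quantities named in the abstract can be made concrete with a small sketch. It is not the paper's algorithm. It only computes the bias of a halfspace under the standard Gaussian, and the two one-sided error rates that a positive-reliable learner treats asymmetrically. The definitions used here are assumptions, since the text above names them without stating them: bias is taken as $\min(\Pr[h=+1], \Pr[h=-1])$, and the false-positive and false-negative rates are taken as joint probabilities, following the usual convention of the reliable agnostic model. All function names are hypothetical.

```python
import math
import random


def halfspace_bias(threshold):
    """Bias of h(x) = sign(<w, x> - threshold) for unit w and x ~ N(0, I_d).

    Assumed definition: min(Pr[h = +1], Pr[h = -1]), which for a Gaussian
    halfspace equals Phi(-|threshold|), with Phi the standard normal CDF.
    """
    return 0.5 * math.erfc(abs(threshold) / math.sqrt(2.0))


def predict(w, threshold, x):
    """Label assigned by the halfspace: +1 if <w, x> >= threshold, else -1."""
    inner = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if inner >= threshold else -1


def error_rates(w, threshold, samples):
    """Empirical (false-positive, false-negative) rates on labeled samples.

    Assumed convention: both are joint frequencies, i.e.
    FP = Pr[h(x) = +1 and y = -1], FN = Pr[h(x) = -1 and y = +1].
    """
    n = len(samples)
    fp = sum(1 for x, y in samples if predict(w, threshold, x) == 1 and y == -1)
    fn = sum(1 for x, y in samples if predict(w, threshold, x) == -1 and y == 1)
    return fp / n, fn / n


def sample_gaussian_examples(w, threshold, n, flip_prob, rng):
    """Draw x ~ N(0, I_d), label it with the halfspace, flip with flip_prob.

    The random label flips are a toy noise model chosen for illustration only.
    """
    d = len(w)
    out = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = predict(w, threshold, x)
        if rng.random() < flip_prob:
            y = -y
        out.append((x, y))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    w = [1.0, 0.0, 0.0]
    data = sample_gaussian_examples(w, 1.0, 5000, 0.05, rng)
    print("bias:", halfspace_bias(1.0))
    print("(FP, FN):", error_rates(w, 1.0, data))
```

A larger threshold gives a smaller bias $\alpha$, which is the regime where the $d^{\Omega(\log (1/\alpha))}$ lower bound becomes significant.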

Paper Structure

This paper contains 32 sections, 14 theorems, 75 equations, 3 algorithms.

Key Result

Theorem 1.3

Let $D$ be a joint distribution of $(\mathbf{x},y)$ supported on $\mathbb{R}^d\times \{ \pm 1\}$ with marginal $D_{\mathbf{x}}=\mathcal{N}_d$ and let $\alpha$ be the bias of the optimal halfspace on distribution $D$ (with respect to Definition 1.1). There is an algorithm that uses $N=d^{O(\lo

Theorems & Definitions (48)

  • Definition 1.1: (Positive) Reliable Learning of Gaussian Halfspaces
  • Definition 1.2: Bias of Boolean Function
  • Theorem 1.3: Main Result
  • Definition 1.4: Reliability Condition
  • Proposition 2.1
  • Lemma 2.2: Correlation with an Orthonormal Polynomial
  • Proof Sketch of Lemma 2.2
  • Definition 2.3: Hermite Tensor
  • Lemma 2.4
  • Proof Sketch of Lemma 2.4
  • ...and 38 more