A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise

Ilias Diakonikolas, Nikos Zarifis

TL;DR

This work tackles PAC learning of $γ$-margin halfspaces under $η$-Massart noise, giving a computationally efficient learner with near-optimal sample complexity. The authors introduce a sequence of convex surrogate losses and an online SGD scheme with clipping, achieving $\mathrm{err}_D(\hat{w})\le η+ε$ using $n=\tilde{O}\left(1/(ε^2 γ^2)\right)$ samples and running in $\tilde{O}(d n/ε)$ time. This nearly matches the lower bound that recent evidence of an information-computation tradeoff suggests for computationally efficient algorithms, and it improves upon previous efficient algorithms that required $\tilde{O}\left(1/(γ^4 ε^3)\right)$ samples. The approach is simple and practical, and it provides insight into information-computation tradeoffs in Massart-noise settings, with potential extensions to general halfspaces and dimension-efficient implementations.
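To get a feel for the size of the improvement, the two sample bounds can be compared numerically. The snippet below is only an illustrative sketch: it evaluates the leading terms $1/(ε^2 γ^2)$ and $1/(γ^4 ε^3)$ and ignores the constants and logarithmic factors hidden by the $\tilde{O}$ notation. The function name `sample_bounds` and the example values of $γ$ and $ε$ are assumptions chosen for illustration, not taken from the paper.

```python
def sample_bounds(gamma: float, eps: float) -> dict:
    """Leading terms of the two sample bounds quoted in the TL;DR.

    Constants and logarithmic factors (hidden by the tilde-O notation)
    are ignored, so these are orders of magnitude, not sample counts.
    """
    return {
        # Bound of the efficient learner described in this summary.
        "this_work": 1.0 / (eps**2 * gamma**2),
        # Bound of the earlier efficient algorithms.
        "prior_efficient": 1.0 / (gamma**4 * eps**3),
    }


if __name__ == "__main__":
    # Illustrative (assumed) parameter values.
    bounds = sample_bounds(gamma=0.1, eps=0.01)
    ratio = bounds["prior_efficient"] / bounds["this_work"]
    print(f"this work:       ~{bounds['this_work']:.3g}")
    print(f"prior efficient: ~{bounds['prior_efficient']:.3g}")
    print(f"ratio:           ~{ratio:.3g}")
```

The ratio of the two leading terms is $1/(γ^2 ε)$, so the gap widens as either the margin or the target excess error shrinks. For $γ=0.1$ and $ε=0.01$ the leading terms are $10^6$ and $10^{10}$ respectively.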

Abstract

We study the problem of PAC learning $γ$-margin halfspaces in the presence of Massart noise. Without computational considerations, the sample complexity of this learning problem is known to be $\widetilde{Θ}(1/(γ^2 ε))$. Prior computationally efficient algorithms for the problem incur sample complexity $\tilde{O}(1/(γ^4 ε^3))$ and achieve 0-1 error of $η+ε$, where $η<1/2$ is the upper bound on the noise rate. Recent work gave evidence of an information-computation tradeoff, suggesting that a quadratic dependence on $1/ε$ is required for computationally efficient algorithms. Our main result is a computationally efficient learner with sample complexity $\widetilde{Θ}(1/(γ^2 ε^2))$, nearly matching this lower bound. In addition, our algorithm is simple and practical, relying on online SGD on a carefully selected sequence of convex losses.
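The learning setup in the abstract can be made concrete with a small simulation. The sketch below is not the paper's algorithm. It only generates data from a $γ$-margin halfspace on the unit sphere whose labels are flipped with a point-dependent probability $η(x) \le η$ (the Massart condition), and then measures 0-1 error. The specific noise function, the rejection-sampling step, and all names (`sample_massart`, `zero_one_error`, `dot`) are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def sample_massart(n, d, w_star, gamma, eta, rng):
    """Draw n labeled points (x, y) with x on the unit sphere in R^d.

    Every x satisfies |w_star . x| >= gamma (margin), and the clean label
    sign(w_star . x) is flipped with probability eta_x <= eta.
    """
    data = []
    while len(data) < n:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(dot(g, g))
        x = [v / norm for v in g]  # uniform direction on the sphere
        m = dot(w_star, x)
        if abs(m) < gamma:
            continue  # rejection step: keep only points with margin >= gamma
        clean = 1 if m > 0 else -1
        # Hypothetical noise function: varies with x, never exceeds eta
        # because |x[-1]| <= 1 on the unit sphere.
        eta_x = eta * (0.5 + 0.5 * abs(x[-1]))
        y = -clean if rng.random() < eta_x else clean
        data.append((x, y))
    return data


def zero_one_error(w, data):
    """Fraction of points whose label disagrees with sign(w . x)."""
    mistakes = sum(1 for x, y in data if (1 if dot(w, x) > 0 else -1) != y)
    return mistakes / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    w_star = [1.0, 0.0, 0.0, 0.0, 0.0]  # unit-norm target halfspace
    data = sample_massart(n=2000, d=5, w_star=w_star, gamma=0.1, eta=0.2, rng=rng)
    print("0-1 error of the target halfspace:", zero_one_error(w_star, data))
```

Because the target halfspace errs only on flipped labels, its 0-1 error equals the average flip probability, which is at most $η$. This is why the guarantee is stated as $η+ε$: the benchmark is the noise bound $η$ rather than zero error.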

Paper Structure (16 sections, 7 theorems, 35 equations, 1 algorithm)


Key Result

Theorem 1.3

Let $D$ be a distribution on $\mathbb{S}^{d-1} \times \{\pm 1\}$ that satisfies the $\eta$-Massart noise condition with respect to an unknown $\gamma$-margin halfspace $f(\mathbf{x}) = \mathrm{sign}(\mathbf{w}^{\ast}\cdot\mathbf{x})$. There is an algorithm that draws $n = \tilde{O}(1/(\epsilon^2 \gamma^2))$ samples ...

Theorems & Definitions (26)

  • Definition 1.1: PAC Learning with Massart Noise
  • Definition 1.2: $\gamma$-Margin Halfspaces
  • Theorem 1.3: Main Result, Informal
  • Theorem 2.1: Main Result
  • Lemma 2.2: Structural Lemma
  • proof
  • Claim 2.3
  • Claim 2.4
  • proof: Proof of Theorem 2.1
  • Claim 2.5
  • ...and 16 more