Efficient Testable Learning of General Halfspaces with Adversarial Label Noise

Ilias Diakonikolas; Daniel M. Kane; Sihan Liu; Nikos Zarifis

TL;DR

This work gives the first polynomial-time tester-learner for general halfspaces with adversarial label noise under the Gaussian distribution that achieves dimension-independent misclassification error. If the data passes the tester, the output of the robust learner on that data can be trusted.

Abstract

We study the task of testable learning of general -- not necessarily homogeneous -- halfspaces with adversarial label noise with respect to the Gaussian distribution. In the testable learning framework, the goal is to develop a tester-learner such that if the data passes the tester, then one can trust the output of the robust learner on the data. Our main result is the first polynomial-time tester-learner for general halfspaces that achieves dimension-independent misclassification error. At the heart of our approach is a new methodology to reduce testable learning of general halfspaces to testable learning of nearly homogeneous halfspaces that may be of broader interest.

Paper Structure

This paper contains 20 sections, 11 theorems, 48 equations, 3 figures, 3 algorithms.

Introduction
Some natural attempts and why they fail
Overview of Techniques
Preliminaries
Testable Learning of General Halfspaces to Error Lg
Good Localization Center and its Properties
Finding a Good Localization Center
Putting things together
Additional Remarks Regarding Lg
Testable Learner for Nearly Homogeneous Halfspaces
Omitted Proofs for General to Near-Homogeneous Reduction
Proof of Properties of Good Localization Center (Lg)
Proof of Transformation Error Bound (Lg)
Proof of General Wedge Bound (Lg)
Omitted Proofs for Good Localization Center Search
...and 5 more sections

Key Result

Theorem 1.2

Let $\epsilon, \tau \in (0,1)$ and $\mathcal{C}$ be the class of general halfspaces on $\mathbb{R}^d$. There exists a tester-learner for $\mathcal{C}$ with respect to $\mathcal{N}(\mathbf{0}, \mathbf{I})$ up to $0\text{-}1$ error $\widetilde{O} \left( \sqrt{\mathrm{opt}} \right) + \epsilon$, where $

Figures (3)

Figure 1: When there is no noise, the tail points are at distance at least $t^\ast$ from the origin, since they all lie on the side of the hyperplane not containing the origin. Consequently, the line from the origin through their mean $\boldsymbol \mu$ must first intersect the separating hyperplane of the halfspace (at the point $\mathbf{w}$) and only then reach the mean vector $\boldsymbol \mu$ of the tail points, regardless of the underlying marginal distribution. If we re-center the distribution at $\mathbf{w}$, the halfspace will then become exactly homogeneous.
Figure 2: The figure illustrates a good localization center $\mathbf{w}$ with respect to the halfspace $h$. $\mathbf{w}$ is $\alpha$-far from the halfspace, and $\Phi(\left\| \mathbf{w} \right\|_{2})$ is still non-trivial (bounded from below by the mass on one side of the halfspace).
Figure 3: Bound $\left\| \mathbf{v} \right\|_{2} - \left\| \boldsymbol{\mu} \right\|_{2}$ via $\mathbf{v}^\ast \cdot (\mathbf{v} - \boldsymbol{\mu})$.
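The re-centering claim in the caption of Figure 1 can be checked in one line. The notation here is an assumption introduced for illustration and does not appear in the source: write the halfspace as $h(\mathbf{x}) = \mathrm{sign}(\mathbf{u} \cdot \mathbf{x} - t)$ with unit normal $\mathbf{u}$ and threshold $t$, and let $\mathbf{w}$ be any point on its separating hyperplane.

```latex
% Assumed notation: h(x) = sign(u . x - t), with w on the hyperplane, i.e. u . w = t.
\begin{align*}
\mathbf{u} \cdot \mathbf{w} = t
\quad\Longrightarrow\quad
h(\mathbf{w} + \mathbf{z})
  &= \mathrm{sign}\bigl(\mathbf{u} \cdot \mathbf{z} + \mathbf{u} \cdot \mathbf{w} - t\bigr) \\
  &= \mathrm{sign}(\mathbf{u} \cdot \mathbf{z}).
\end{align*}
```

In the shifted coordinates $\mathbf{z} = \mathbf{x} - \mathbf{w}$ the threshold vanishes, which is what "exactly homogeneous" means in the caption. The shift does change the marginal distribution of $\mathbf{z}$, which is no longer centered at the origin.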

Theorems & Definitions (31)

Definition 1.1: Testable Learning with Adversarial Label Noise [RV22a]
Theorem 1.2: Testable Learning General Halfspaces under Gaussian Marginals
Definition 2.1: Good Localization Center
Definition 2.2
Lemma 2.3: Localization With A Good Center
Proposition 2.4: Testable Learning of Nearly Homogeneous Halfspaces
Lemma 2.5: Transformation Error
Lemma 2.6: Wedge Bound for General Halfspaces
Proposition 2.7
Definition 2.8: Tail point
...and 21 more
