Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach

Yinan Li; Chicheng Zhang


TL;DR

This work addresses efficient active learning of $d$-dimensional homogeneous halfspaces under the $(A,\alpha)$-Tsybakov noise condition with well-behaved unlabeled distributions. It introduces a nonconvex objective $L_\sigma$ whose approximate first-order stationary points suffice to recover a near-optimal halfspace, and pairs this with a label-efficient active oracle to achieve a label complexity of $\tilde{O}\left(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}}\right)$ for $\alpha\in(\tfrac{1}{3},1]$. The method combines (i) efficient nonconvex optimization (Active-PSGD), (ii) a label-efficient iterate selection that uses gradient estimates, and (iii) a final label-efficient validation to select between $\pm\hat w$, yielding a provable $(\epsilon,\delta)$-PAC guarantee. This advances the state of the art by expanding the eligible noise range and improving label efficiency, narrowing the gap to information-theoretic bounds and outperforming prior efficient active algorithms in the same regime.

Abstract

We study the problem of computationally and label-efficient PAC active learning of $d$-dimensional halfspaces with Tsybakov noise~\citep{tsybakov2004optimal} under structured unlabeled data distributions. Inspired by~\cite{diakonikolas2020learning}, we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of this structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}\left(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}}\right)$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac{1}{3}, 1]$. This narrows the gap between the label complexities of the previously known efficient passive or active algorithms~\citep{diakonikolas2020polynomial,zhang2021improved} and the information-theoretic lower bound in this setting.
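To make the dependence on the noise parameter concrete, the exponent $\frac{8-6\alpha}{3\alpha-1}$ of $\frac{1}{\epsilon}$ can be evaluated for a few values of $\alpha$. The following is a minimal sketch; the function name `label_complexity_exponent` is hypothetical, and it computes only the exponent, ignoring the factor $d$ and the logarithmic factors hidden by $\tilde{O}$.

```python
from fractions import Fraction


def label_complexity_exponent(alpha):
    """Exponent of 1/epsilon in the stated bound: (8 - 6*alpha) / (3*alpha - 1)."""
    alpha = Fraction(alpha)
    # The stated guarantee only covers alpha in (1/3, 1].
    if not (Fraction(1, 3) < alpha <= 1):
        raise ValueError("the stated bound assumes alpha in (1/3, 1]")
    return (8 - 6 * alpha) / (3 * alpha - 1)


# Print the exponent for a few noise parameters in the admissible range.
for a in (Fraction(1, 2), Fraction(2, 3), Fraction(1)):
    print(f"alpha = {a}: exponent = {label_complexity_exponent(a)}")
```

The exponent is 10 at $\alpha = \frac12$, 4 at $\alpha = \frac23$, and 1 at $\alpha = 1$, and it grows without bound as $\alpha$ approaches $\frac13$ from above.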


Paper Structure (23 sections, 22 theorems, 76 equations, 1 table, 3 algorithms)


Introduction
Our contributions.
Key idea 1: Computationally efficient non-convex optimization for noise tolerance.
Key idea 2: Label efficient first-order oracle for the non-convex objective.
Key idea 3: Label-efficient classifier selection.
Related Work
Statistical complexity for active learning halfspace under Tsybakov noise condition.
Efficient passive learning halfspaces under Tsybakov noise condition.
Efficient active learning halfspaces under Tsybakov noise condition.
Preliminaries
Algorithm
Efficient non-convex optimization with active label queries (Algorithm \ref{alg:active-PSGD-finding-stationary-point})
Label-efficient iterate selection to boost the success probability (lines \ref{line:rep-start} to \ref{line:selection-1-end})
Label-efficient final iterate selection (lines \ref{line:selection-2-start} to \ref{line:selection-2-end})
Performance Guarantees
...and 8 more sections

Key Result

Lemma 4

Let $D_X$ be a well-behaved distribution, and let $D$ satisfy $(A, \alpha)$-TNC. Denote by $L_\sigma(w) = \mathbb{E}_D \left[\phi_\sigma \left(y \frac{\left\langle w,x \right\rangle }{\|w\|_2}\right)\right]$, where $\phi_\sigma$ is the softmax loss defined above. Let $w$ be such that $\theta(w, w^*) \in (\
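The objective $L_\sigma(w)$ is an expectation of a surrogate loss applied to the normalized margin $y \langle w, x \rangle / \|w\|_2$, so it can be estimated by a sample average. The sketch below illustrates this under a loud assumption: the definition of $\phi_\sigma$ is not reproduced in this summary, so the code uses the logistic-type stand-in $\phi_\sigma(t) = 1/(1 + e^{t/\sigma})$, which may differ from the paper's definition. The names `phi_sigma` and `empirical_L_sigma` are hypothetical.

```python
import math


def phi_sigma(t, sigma):
    """ASSUMED surrogate 1 / (1 + exp(t / sigma)); a stand-in, not the paper's definition."""
    z = t / sigma
    # Branch on the sign of z so that exp never overflows.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def empirical_L_sigma(w, xs, ys, sigma):
    """Sample average of phi_sigma(y * <w, x> / ||w||_2) over labeled examples."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("w must be nonzero")
    total = 0.0
    for x, y in zip(xs, ys):
        margin = y * sum(wi * xi for wi, xi in zip(w, x)) / norm
        total += phi_sigma(margin, sigma)
    return total / len(xs)


# Toy data in R^2, labeled by the sign of the first coordinate.
xs = [(1.0, 0.0), (-1.0, 0.0), (0.5, 2.0), (-0.5, -2.0)]
ys = [1, -1, 1, -1]
print(empirical_L_sigma([1.0, 0.0], xs, ys, sigma=0.5))
print(empirical_L_sigma([-1.0, 0.0], xs, ys, sigma=0.5))
```

Because the margin is divided by $\|w\|_2$, the estimate depends only on the direction of $w$, which matches the homogeneous-halfspace setting. Under the assumed surrogate, a direction that classifies the sample correctly scores below $\frac12$ and its negation scores above $\frac12$.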

Theorems & Definitions (31)

Definition 1: Tsybakov noise condition
Definition 2: Well-behaved distributions \citep{diakonikolas2020polynomial}
Definition 3
Lemma 4
Lemma 5
Remark 6
Remark 7
Remark 8
Lemma 9
Lemma 10
...and 21 more
