Active Learning of General Halfspaces: Label Queries vs Membership Queries

Ilias Diakonikolas; Daniel M. Kane; Mingchen Ma

TL;DR

This work analyzes learning general halfspaces under Gaussian marginals, contrasting pool-based active learning with label queries against membership-query learning. It proves a tight lower bound showing that label-query active learning offers no nontrivial gains without exponentially large unlabeled pools, and presents a computationally efficient agnostic learner using membership queries that achieves $\mathrm{err}(\hat{h}) \le O(\mathrm{opt})+\epsilon$ with query complexity $M=\tilde{O}_\delta(\min\{1/p,1/\epsilon\} + d\cdot\mathrm{polylog}(1/\epsilon))$ and polynomial runtime. The algorithm proceeds via a warm-start phase followed by localization-based refinement, employing gradient-like updates and a robust treatment of label noise through rejection sampling and smoothed labels. A corollary of these results is a strong separation between active learning with label queries and learning with membership queries in this Gaussian halfspace setting. Collectively, the findings map out the complexity landscape for general-halfspace learning under Gaussian marginals across these two interactive data-access models and establish near-optimal bounds in the agnostic regime.
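As rough intuition for what membership-query access allows, the toy sketch below locates a boundary point of a noiseless general halfspace by binary search along a segment, using queries at points of the learner's choosing. This is not the paper's algorithm: it ignores label noise, the Gaussian marginal, and the warm-start and localization phases, and the names `make_oracle`, `boundary_point`, and the example halfspace are hypothetical.

```python
def make_oracle(w, t):
    """Hypothetical noiseless membership oracle for the halfspace sign(<w, x> - t)."""
    def oracle(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - t >= 0 else -1
    return oracle


def boundary_point(oracle, x_neg, x_pos, tol=1e-6):
    """Binary search on the segment from x_neg (label -1) to x_pos (label +1).

    Returns a point within `tol` (in segment parameter) of the decision
    boundary, together with the number of membership queries used.
    """
    def point(s):
        # Point at parameter s in [0, 1] along the segment.
        return [a + s * (b - a) for a, b in zip(x_neg, x_pos)]

    lo, hi = 0.0, 1.0  # point(lo) is negative, point(hi) is positive
    queries = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        queries += 1
        if oracle(point(mid)) > 0:
            hi = mid
        else:
            lo = mid
    return point(hi), queries


# Example halfspace (assumed for illustration): <w, x> >= t with a nonzero threshold.
w, t = [0.6, 0.8], 1.0
oracle = make_oracle(w, t)
x_star, n_queries = boundary_point(oracle, x_neg=[0.0, 0.0], x_pos=[3.0, 3.0])
print(x_star, n_queries)
```

The number of queries grows only logarithmically in `1/tol`, which is possible because the learner may query arbitrary points; a label-query learner is restricted to points already in its sampled pool.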

Abstract

We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces under the Gaussian distribution on $\mathbb{R}^d$ in the presence of some form of query access. In the classical pool-based active learning model, where the algorithm is allowed to make adaptive label queries to previously sampled points, we establish a strong information-theoretic lower bound ruling out non-trivial improvements over the passive setting. Specifically, we show that any active learner requires label complexity of $\tilde{\Omega}(d/(\log(m)\epsilon))$, where $m$ is the number of unlabeled examples. In particular, to beat the passive label complexity of $\tilde{O}(d/\epsilon)$, an active learner requires a pool of $2^{\mathrm{poly}(d)}$ unlabeled samples. On the positive side, we show that this lower bound can be circumvented with membership query access, even in the agnostic model. Specifically, we give a computationally efficient learner with query complexity of $\tilde{O}(\min\{1/p, 1/\epsilon\} + d\cdot \mathrm{polylog}(1/\epsilon))$ achieving error guarantee of $O(\mathrm{opt})+\epsilon$. Here $p \in [0, 1/2]$ is the bias and $\mathrm{opt}$ is the 0-1 loss of the optimal halfspace. As a corollary, we obtain a strong separation between the active and membership query models. Taken together, our results characterize the complexity of learning general halfspaces under Gaussian marginals in these models.
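To get a feel for the scale of the separation, the following sketch evaluates the three bounds from the abstract with all constants and hidden logarithmic factors dropped. The function names, the sample parameter values, and the exponent `polylog_power` are assumptions made for illustration; the abstract does not specify the polylog exponent.

```python
import math


def passive_labels(d, eps):
    """Passive label complexity, ~ d / eps (constants and log factors dropped)."""
    return d / eps


def active_lower_bound(d, eps, m):
    """Label-query lower bound, ~ d / (log(m) * eps), for a pool of m unlabeled points."""
    return d / (math.log(m) * eps)


def membership_queries(d, eps, p, polylog_power=2):
    """Membership-query upper bound, ~ min(1/p, 1/eps) + d * polylog(1/eps).

    polylog_power is an assumed exponent, not a value taken from the paper.
    """
    inv_p = math.inf if p == 0 else 1.0 / p  # the bias p may be 0
    return min(inv_p, 1.0 / eps) + d * math.log(1.0 / eps) ** polylog_power


# Hypothetical parameter values chosen only for illustration.
d, eps, p, m = 100, 1e-4, 1e-3, 10**6
print("passive            ~", round(passive_labels(d, eps)))
print("active lower bound ~", round(active_lower_bound(d, eps, m)))
print("membership queries ~", round(membership_queries(d, eps, p)))
```

Because the pool size enters the lower bound only through $\log(m)$, the label-query bound stays close to the passive one unless $m$ is exponentially large, whereas the membership-query bound depends on $1/\epsilon$ only additively and through polylogarithmic factors.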

Paper Structure (27 sections, 18 theorems, 50 equations, 5 algorithms)

This paper contains 27 sections, 18 theorems, 50 equations, 5 algorithms.

Introduction
Computational Complexity vs Error Guarantee
Optimality of Query Complexity
Preliminaries
Basic Notation
Problem Definitions
Lower Bound on Label Complexity: Proof of Theorem 1.1
Robustly Learning of General Halfspaces with Queries: Proof of Theorem 1.2
Refining a Warm-Start
Finding a Good Gradient via Localization
Robustness Analysis
Finding a Good Initialization
Omitted Details in \ref{['sec overview']}
Discussion on the Noise Level $\mathrm{opt}$
Approximate Bias Estimation Using Queries
...and 12 more sections

Key Result

Theorem 1.1

For any active learning algorithm $\mathcal{A}$, there is a halfspace $h^*$ that labels $S$ with bias $p$ such that if $\mathcal{A}$ makes fewer than $\tilde{O}(d/(p\log(m)))$ label queries over $S$, a set of $m$ i.i.d. points drawn from $N(0,I)$, then with probability at least $2/3$ the halfspace …

Theorems & Definitions (35)

Theorem 1.1: Main Lower Bound
Theorem 1.2: Main Algorithmic Result
Definition 1.3: Learning Halfspaces with Membership Queries
Definition 1.4: Active Learning of Halfspaces with Label Queries
Lemma 2.1
proof
Lemma 2.2
proof
Lemma 2.3
proof
...and 25 more
