Active Learning of General Halfspaces: Label Queries vs Membership Queries
Ilias Diakonikolas, Daniel M. Kane, Mingchen Ma
TL;DR
This work analyzes learning general halfspaces under Gaussian marginals, contrasting pool-based active learning with label queries against learning with membership queries. It proves a strong information-theoretic lower bound showing that label-query active learning offers no nontrivial gains over passive learning unless the unlabeled pool is exponentially large. It also presents a computationally efficient agnostic learner using membership queries that achieves $\text{err}(\hat{h}) \le O(\text{opt})+\epsilon$ with query complexity $M=\tilde{O}_\delta(\min\{1/p,1/\epsilon\} + d\cdot\text{polylog}(1/\epsilon))$ and polynomial runtime. The algorithm proceeds via a warm-start phase followed by localization-based refinement, employing gradient-like updates and handling label noise robustly through rejection sampling and smoothed labels. A corollary of these results is a strong separation between active learning with label queries and learning with membership queries in this Gaussian halfspace setting. Together, the findings map out the complexity landscape for learning general halfspaces under Gaussian marginals across these two data-access models, including the agnostic regime.
Abstract
We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces under the Gaussian distribution on $\mathbb{R}^d$ in the presence of some form of query access. In the classical pool-based active learning model, where the algorithm is allowed to make adaptive label queries to previously sampled points, we establish a strong information-theoretic lower bound ruling out non-trivial improvements over the passive setting. Specifically, we show that any active learner requires label complexity of $\tilde{\Omega}(d/(\log(m)\epsilon))$, where $m$ is the number of unlabeled examples. In particular, to beat the passive label complexity of $\tilde{O}(d/\epsilon)$, an active learner requires a pool of $2^{\mathrm{poly}(d)}$ unlabeled samples. On the positive side, we show that this lower bound can be circumvented with membership query access, even in the agnostic model. Namely, we give a computationally efficient learner with query complexity of $\tilde{O}(\min\{1/p, 1/\epsilon\} + d\cdot \mathrm{polylog}(1/\epsilon))$ achieving an error guarantee of $O(\mathrm{opt})+\epsilon$. Here $p \in [0, 1/2]$ is the bias and $\mathrm{opt}$ is the 0-1 loss of the optimal halfspace. As a corollary, we obtain a strong separation between the active and membership query models. Taken together, our results characterize the complexity of learning general halfspaces under Gaussian marginals in these models.
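To make the two access models concrete, the following is a minimal sketch in the noiseless (realizable) case; it is not the learner from this work. The target normal `w`, the threshold `t`, the pool size, and the helper names `label_query`, `membership_query`, and `boundary_along` are all hypothetical choices made for illustration. It shows that a pool of Gaussian samples contains very few minority-class points when the bias $p$ is small, while a membership oracle can locate the decision boundary along a chosen direction by binary search.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 5
w = np.zeros(d)
w[0] = 1.0          # hypothetical unit normal of the target halfspace
t = 2.5             # hypothetical threshold; bias p = P[N(0,1) >= 2.5], about 0.006


def halfspace(x):
    """Label in {-1, +1} of the general halfspace sign(<w, x> - t)."""
    return 1 if float(np.dot(w, x)) - t >= 0 else -1


# Pool-based active learning: labels may only be requested for pool points.
pool = rng.standard_normal((1000, d))


def label_query(i):
    """Return the label of the i-th previously sampled pool point."""
    return halfspace(pool[i])


def membership_query(x):
    """Return the label of an arbitrary point chosen by the learner."""
    return halfspace(np.asarray(x, dtype=float))


def boundary_along(direction, lo=0.0, hi=10.0, steps=30):
    """Binary search for the scale s at which the label of s * direction flips.

    Assumes (for this sketch) that lo * direction is labeled -1 and
    hi * direction is labeled +1.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if membership_query(mid * direction) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Number of positively labeled pool points; small because the bias is small.
positives_in_pool = sum(label_query(i) == 1 for i in range(len(pool)))

# Boundary location along the first coordinate axis, found with 30 queries.
e1 = np.eye(d)[0]
estimate = boundary_along(e1)
```

In this sketch the search direction is assumed to cross the boundary; finding such a direction and coping with adversarial label noise are exactly the parts that the warm-start and refinement phases described above would have to handle.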
