Learning Intersections of Two Margin Halfspaces under Factorizable Distributions
Ilias Diakonikolas, Mingchen Ma, Lisheng Ren, Christos Tzamos
TL;DR
The paper studies learning an intersection of two margin halfspaces under factorizable distributions, a setting in which CSQ-based methods provably require quasi-polynomial time. It introduces a duality framework based on moment tensors that reveals structure in the marginal $D_V$ over the relevant subspace $V$, and it develops statistically and computationally efficient algorithms that exploit this structure. A key contribution is a polynomial-time SQ-based approach that efficiently identifies directions near $V$, followed by localization and boosting to obtain a strong learner. A CSQ lower bound shows that the quasi-polynomial cost is intrinsic to methods that do not use marginal information. The result establishes a strong separation between CSQ and SQ for realizable PAC learning in this natural class (the CSQ lower bound holds even for weak learning), significantly broadening the tractable regime beyond Gaussian or uniform marginals and offering new tools for moment-tensor analysis and tensor decomposition in learning.
Abstract
Learning intersections of halfspaces is a central problem in Computational Learning Theory. Even for just two halfspaces, it remains a major open question whether learning is possible in polynomial time with respect to the margin $γ$ of the data points and their dimensionality $d$. The best known algorithms run in quasi-polynomial time $d^{O(\log(1/γ))}$, and it has been shown that this complexity is unavoidable for any algorithm relying solely on correlational statistical queries (CSQ). In this work, we introduce a novel algorithm that provably circumvents the CSQ hardness barrier. Our approach applies to a broad class of distributions satisfying a natural, previously studied, factorizability assumption. Factorizable distributions lie between distribution-specific and distribution-free settings, and significantly extend previously known tractable cases. Under these distributions, we show that CSQ-based methods still require quasi-polynomial time even for weak learning, whereas our algorithm achieves $\mathrm{poly}(d,1/γ)$ time by leveraging more general statistical queries (SQ), establishing a strong separation between CSQ and SQ for this simple realizable PAC learning problem. Our result is grounded in a rigorous analysis utilizing a novel duality framework that characterizes the moment tensor structure induced by the marginal distributions. Building on these structural insights, we propose new, efficient learning algorithms. These algorithms combine a refined variant of Jennrich's Algorithm with PCA over random projections of the moment tensor, along with a gradient-descent-based non-convex optimization framework.
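For readers unfamiliar with Jennrich's Algorithm, the following is a minimal sketch of the textbook version (simultaneous diagonalization of two random contractions of a third-order tensor). It is not the refined variant developed in the paper, and it assumes an exact, noiseless symmetric tensor $T = \sum_i a_i \otimes a_i \otimes a_i$ with linearly independent components. The function name `jennrich`, the dimensions, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def jennrich(T, k, rng):
    """Textbook Jennrich: recover k components of a symmetric rank-k tensor.

    Assumes T = sum_i a_i (x) a_i (x) a_i with linearly independent a_i.
    Returns unit vectors equal to the a_i up to sign and ordering.
    """
    d = T.shape[0]
    # Contract the third mode with two independent random vectors.
    x = rng.standard_normal(d)
    y = rng.standard_normal(d)
    Mx = np.einsum("ijl,l->ij", T, x)  # = A diag(A^T x) A^T
    My = np.einsum("ijl,l->ij", T, y)  # = A diag(A^T y) A^T
    # Mx @ pinv(My) = A diag((A^T x) / (A^T y)) pinv(A), so its eigenvectors
    # with nonzero eigenvalue are the columns of A (up to scaling).
    eigvals, eigvecs = np.linalg.eig(Mx @ np.linalg.pinv(My, rcond=1e-10))
    top = np.argsort(-np.abs(eigvals))[:k]
    V = np.real(eigvecs[:, top])
    return V / np.linalg.norm(V, axis=0)

# Synthetic example (hypothetical data): k = 3 unit components in d = 5.
rng = np.random.default_rng(0)
d, k = 5, 3
A = rng.standard_normal((d, k))
A /= np.linalg.norm(A, axis=0)
T = np.einsum("ir,jr,lr->ijl", A, A, A)

A_hat = jennrich(T, k, rng)
# Absolute cosine between each recovered vector (rows) and true one (columns).
cosines = np.abs(A_hat.T @ A)
```

In this noiseless setting each column of `A` should be matched by exactly one column of `A_hat` up to sign. With empirical moment tensors estimated from samples, the contractions are perturbed and additional robustness steps are needed, which is presumably where the paper's refinements and the PCA over random projections come in.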
