
Learning Intersections of Two Margin Halfspaces under Factorizable Distributions

Ilias Diakonikolas, Mingchen Ma, Lisheng Ren, Christos Tzamos

TL;DR

The paper studies learning an intersection of two margin halfspaces under factorizable distributions, a setting in which it shows that CSQ-based methods require quasi-polynomial time. It introduces a duality framework based on moment tensors that exposes structure in the marginal $D_V$, and builds computationally efficient algorithms on top of that structure. The key contribution is a polynomial-time SQ-based approach that identifies directions near the relevant subspace $V$, followed by localization and boosting to obtain a strong learner. A complementary CSQ lower bound shows that algorithms that do not use marginal information need quasi-polynomial time even for weak learning. Together, these results establish a strong separation between CSQ and SQ for this realizable PAC learning problem, broaden the tractable regime beyond Gaussian or uniform marginals, and offer new tools for moment-tensor analysis and tensor decomposition in learning.
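To make the CSQ versus SQ distinction concrete, here is a minimal sketch that simulates both query types on samples. It is an illustration only, not the paper's algorithm: the helper names `csq` and `sq`, the toy Gaussian marginal, and the thresholds are assumptions, and a real SQ oracle answers only up to a tolerance $\tau$, which is omitted here. A correlational query returns $\mathbb{E}[y\,q(\mathbf{x})]$, while a general query returns $\mathbb{E}[q(\mathbf{x},y)]$ and can therefore inspect the marginal on one label class.

```python
import numpy as np

def csq(q, X, y):
    """Correlational statistical query: empirical estimate of E[y * q(x)]."""
    return float(np.mean(y * q(X)))

def sq(q, X, y):
    """General statistical query: empirical estimate of E[q(x, y)]."""
    return float(np.mean(q(X, y)))

# Toy data (assumption): Gaussian marginal, labels from two axis-aligned halfspaces.
rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 2))
y = np.where((X[:, 0] + 0.5 > 0) & (X[:, 1] + 0.5 > 0), 1.0, -1.0)

# CSQ: correlation of the label with the first coordinate.
corr = csq(lambda X: X[:, 0], X, y)

# SQ: first-coordinate mean restricted to negative examples, E[x_1 * 1{y = -1}].
neg_mean = sq(lambda X, y: (y < 0) * X[:, 0], X, y)

# Since 1{y = -1} = (1 - y) / 2, the SQ answer equals (E[x_1] - E[y * x_1]) / 2:
# it combines the CSQ answer with information about the marginal.
marginal_mean = float(np.mean(X[:, 0]))
```

The identity in the last comment shows why label-conditioned moments are out of reach for a pure CSQ algorithm: they need the marginal term $\mathbb{E}[q(\mathbf{x})]$ in addition to the correlation.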

Abstract

Learning intersections of halfspaces is a central problem in Computational Learning Theory. Even for just two halfspaces, it remains a major open question whether learning is possible in polynomial time with respect to the margin $\gamma$ of the data points and their dimensionality $d$. The best-known algorithms run in quasi-polynomial time $d^{O(\log(1/\gamma))}$, and it has been shown that this complexity is unavoidable for any algorithm relying solely on correlational statistical queries (CSQ). In this work, we introduce a novel algorithm that provably circumvents the CSQ hardness barrier. Our approach applies to a broad class of distributions satisfying a natural, previously studied, factorizability assumption. Factorizable distributions lie between distribution-specific and distribution-free settings, and significantly extend previously known tractable cases. Under these distributions, we show that CSQ-based methods still require quasi-polynomial time even for weak learning, whereas our algorithm achieves $\mathrm{poly}(d,1/\gamma)$ time by leveraging more general statistical queries (SQ), establishing a strong separation between CSQ and SQ for this simple realizable PAC learning problem. Our result is grounded in a rigorous analysis utilizing a novel duality framework that characterizes the moment tensor structure induced by the marginal distributions. Building on these structural insights, we propose new, efficient learning algorithms. These algorithms combine a refined variant of Jennrich's Algorithm with PCA over random projections of the moment tensor, along with a gradient-descent-based non-convex optimization framework.
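For readers unfamiliar with Jennrich's Algorithm, the sketch below shows the textbook version on an exact, noiseless symmetric third-order tensor $T=\sum_i w_i\,\mathbf{a}_i^{\otimes 3}$ with linearly independent components. It is not the refined variant used in the paper and does not include the PCA or gradient-descent steps. The function name `jennrich`, the weights, and the dimensions are assumptions chosen for illustration.

```python
import numpy as np

def jennrich(T, k, rng):
    """Textbook Jennrich: recover k component directions (up to sign) of a
    symmetric tensor T = sum_i w_i a_i (x) a_i (x) a_i, a_i linearly independent."""
    d = T.shape[0]
    a = rng.standard_normal(d)
    b = rng.standard_normal(d)
    # Contract the third mode with two random vectors: M_a = sum_i w_i (a_i . a) a_i a_i^T.
    Ma = np.einsum("ijk,k->ij", T, a)
    Mb = np.einsum("ijk,k->ij", T, b)
    # M_a M_b^+ has the a_i as eigenvectors, with eigenvalues (a_i . a) / (a_i . b).
    evals, evecs = np.linalg.eig(Ma @ np.linalg.pinv(Mb, rcond=1e-10))
    top = np.argsort(-np.abs(evals))[:k]      # keep the k nonzero eigenvalues
    V = np.real(evecs[:, top])
    return V / np.linalg.norm(V, axis=0)

# Toy instance (assumption): 3 random unit components in 5 dimensions.
rng = np.random.default_rng(1)
d, k = 5, 3
A = rng.standard_normal((d, k))
A /= np.linalg.norm(A, axis=0)
w = np.array([0.5, 0.3, 0.2])
T = np.einsum("r,ir,jr,kr->ijk", w, A, A, A)

V = jennrich(T, k, rng)
match = np.abs(A.T @ V)   # |cosine| between true and recovered directions
```

Each row of `match` should contain one entry close to 1, meaning every true direction is recovered up to sign and ordering. With estimated rather than exact moment tensors the eigendecomposition becomes noise-sensitive, which is one reason robust variants are needed in practice.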

Paper Structure

This paper contains 43 sections, 35 theorems, 97 equations, 5 figures, 6 algorithms.

Key Result

Theorem 1.2

Let $\gamma>0$, $q, d \in \mathbb{N}$, $\tau\in (0,1)$ and $d'=\min(d,1/\gamma^2)$. Any CSQ algorithm that learns intersections of two halfspaces with $\gamma$-margin in $d$ dimensions under factorizable distributions to error $1/2-\max(d'^{-\Omega(\log(1/\gamma))},2^{-d'^{\Omega(1)}})$ requires $q$

Figures (5)

  • Figure 1: Geometry of Intersection of Two Halfspaces under \ref{['as parameter']}.
  • Figure 2: Geometrical Illustration of \ref{['as parameter']}. Two halfspaces $h_1=\mathrm{sign}(\mathbf{u}^*\cdot \mathbf{x}+t_1)$ and $h_2=\mathrm{sign}(\mathbf{v}^*\cdot \mathbf{x}+t_2)$ are colored in black. Red dashed lines represent the directions of weight vectors $\mathbf{u}^*,\mathbf{v}^*$.
  • Figure 3: Geometrical illustration for the proof of \ref{['lm 2nd']}. The vector colored in purple corresponds to case 1 in the proof and the vector colored in green corresponds to case 2 in the proof.
  • Figure 4: Illustration for \ref{['lm positive polynomial']}. The target intersection of two halfspaces $h^*$ is plotted in black. Colored lines represent the contours of the polynomial $f^*$. $f^*(\mathbf{x})>0$ for every example $\mathbf{x}$ labeled positive by $h^*$.
  • Figure 5: Illustration for \ref{['lm negative polynomial']}. The target intersection of two halfspaces $h^*$ is plotted in black. $h^*$ is symmetric about the red dashed line $\mathbf{x}_2= -\sigma t \tan\theta/2$. This line partitions the negative examples into two regions $N_1:=\{\mathbf{x} \in V \mid \mathbf{x}_2 \ge -\sigma t \tan\theta/2, \mathbf{u}^*\cdot \mathbf{x}+ t_1 \le 0\}$ and $N_2:=\{\mathbf{x} \in V \mid \mathbf{x}_2 \le -\sigma t \tan\theta/2, \mathbf{v}^*\cdot \mathbf{x}+ t_2 \le 0\}$. Colored lines represent the contours of the polynomial $f^*$. $f^*(\mathbf{x})<0$ for every example $\mathbf{x}$ labeled negative by $h^*$.
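
The target class in the captions above can be sketched in a few lines. The labeling rule follows the Figure 2 notation, $h_1=\mathrm{sign}(\mathbf{u}^*\cdot \mathbf{x}+t_1)$ and $h_2=\mathrm{sign}(\mathbf{v}^*\cdot \mathbf{x}+t_2)$, with a point labeled positive only when both halfspaces are positive. The margin check is an assumption: it uses the common convention that every point lies at distance at least $\gamma$ from both boundaries for unit weight vectors, which may differ in detail from Definition 1.1. The function names and the toy points are hypothetical.

```python
import numpy as np

def intersection_label(X, u, t1, v, t2):
    """Label +1 iff both u.x + t1 > 0 and v.x + t2 > 0, else -1."""
    inside = (X @ u + t1 > 0) & (X @ v + t2 > 0)
    return np.where(inside, 1, -1)

def has_margin(X, u, t1, v, t2, gamma):
    """Assumed margin convention: every point is at least gamma away
    from both decision boundaries (u and v taken to be unit vectors)."""
    return bool(np.all(np.abs(X @ u + t1) >= gamma)
                and np.all(np.abs(X @ v + t2) >= gamma))

# Toy example: the positive region is the open positive quadrant.
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])
X = np.array([[0.5, 0.5], [0.5, -0.5], [-0.5, 0.5], [-0.5, -0.5]])

labels = intersection_label(X, u, 0.0, v, 0.0)    # [ 1, -1, -1, -1]
ok_half = has_margin(X, u, 0.0, v, 0.0, 0.5)      # True
ok_more = has_margin(X, u, 0.0, v, 0.0, 0.6)      # False
```

Only one of the four points is positive, which reflects the asymmetry between the positive region (an intersection) and the negative region (a union of two halfspaces) that Figures 4 and 5 treat separately.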

Theorems & Definitions (70)

  • Definition 1.1: Learning Intersections of Margin Halfspaces Under Factorizable Distributions
  • Theorem 1.2: CSQ Lower Bound
  • Theorem 1.3: Main Result
  • Theorem 1.4: Informal statement of \ref{['th:intersection-structure-main']}
  • Definition 2.1: $(\alpha,m)$-moment matching condition
  • Theorem 2.2
  • Theorem 2.3: Polynomial One-sided Approximation
  • Lemma 1
  • Lemma 2
  • ...and 60 more