Smoothed Agnostic Learning of Halfspaces over the Hypercube
Yiwen Kou, Raghu Meka
TL;DR
The paper introduces a discrete analogue of smoothed agnostic learning for Boolean halfspaces on the hypercube, modeling perturbations as independent bit flips, each occurring with probability $\sigma$. It reduces learning to low-degree polynomial approximation via an $L_1$-polynomial regression framework adapted to the smoothed setting, using a rerandomization trick, a Berry–Esseen-type analysis, and a critical-index decomposition to handle irregular weight vectors. Under strictly subexponential marginals, the authors give an efficient learning algorithm whose sample complexity and runtime are approximately $n^{\mathrm{poly}(1/(\sigma\epsilon))}$, establishing the first computationally efficient smoothed agnostic guarantee for halfspaces on $\{\pm1\}^n$. The approach extends smoothed learning ideas to discrete domains, bridging worst-case intractability and practical learnability, and opens avenues for further work on intersections of halfspaces and broader Boolean function classes.
Abstract
Agnostic learning of Boolean halfspaces is a fundamental problem in computational learning theory, but it is known to be computationally hard even for weak learning. Recent work [CKKMK24] proposed smoothed analysis as a way to bypass such hardness, but existing frameworks rely on additive Gaussian perturbations, making them unsuitable for discrete domains. We introduce a new smoothed agnostic learning framework for Boolean inputs, where perturbations are modeled via random bit flips. This defines a natural discrete analogue of smoothed optimality generalizing the Gaussian case. Under strictly subexponential assumptions on the input distribution, we give an efficient algorithm for learning halfspaces in this model, with runtime and sample complexity approximately $n^{\mathrm{poly}(1/(\sigma\epsilon))}$. Previously, such algorithms were known only with strong structural assumptions for the discrete hypercube, for example, independent coordinates or symmetric distributions. Our result provides the first computationally efficient guarantee for smoothed agnostic learning of halfspaces over the Boolean hypercube, bridging the gap between worst-case intractability and practical learnability in discrete settings.
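
To make the bit-flip perturbation model concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes inputs are lists with entries in $\{\pm1\}$ and that each coordinate is flipped independently with probability $\sigma$, as described above. The function names (`flip_bits`, `halfspace`, `smoothed_error`), the threshold parameter `theta`, and the Monte Carlo estimate are illustrative assumptions; the paper's precise definition of smoothed optimality is not reproduced here.

```python
import random

def flip_bits(x, sigma, rng):
    """Flip each coordinate of x in {-1,+1}^n independently with probability sigma."""
    return [-xi if rng.random() < sigma else xi for xi in x]

def halfspace(w, theta, x):
    """Evaluate sign(<w, x> - theta), mapping 0 to +1 (a convention chosen here)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return 1 if s >= 0 else -1

def smoothed_error(w, theta, samples, sigma, trials, rng):
    """Monte Carlo estimate of the halfspace's error on bit-flipped inputs.

    samples is a list of (x, y) pairs with y in {-1,+1}; each x is perturbed
    `trials` times and the fraction of disagreements with y is returned.
    """
    mistakes = 0
    for x, y in samples:
        for _ in range(trials):
            if halfspace(w, theta, flip_bits(x, sigma, rng)) != y:
                mistakes += 1
    return mistakes / (len(samples) * trials)
```

With `sigma = 0` the perturbation is the identity and the estimate reduces to the ordinary empirical error; larger `sigma` averages the halfspace's predictions over a wider neighborhood of each input in Hamming distance.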
