Robustly Learning Single-Index Models via Alignment Sharpness
Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas
TL;DR
The paper tackles the problem of learning single-index models (SIMs) under the squared loss in the agnostic setting with an unknown link function. It introduces alignment sharpness, a local error bound for a convex surrogate loss, and develops a computationally efficient algorithm that achieves a constant-factor approximation to the optimal $L_2^2$ loss. The key ideas are to select the best-fit activation along a projected direction and to use a gradient-alignment guarantee that contracts the misalignment between the estimated and true directions, enabling linear-rate convergence. The results hold under mild distributional assumptions (the well-behaved class) and for the broad family of $(a, b)$-unbounded activations, which includes ReLU-like functions. This yields the first polynomial-time constant-factor agnostic learner for unknown link functions, even under Gaussian marginals. The work thus advances practical agnostic learning for SIMs and suggests that alignment-based analysis may apply more broadly in optimization.
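The two ideas named above (fit the best monotone activation along the current direction, then take a gradient step on a surrogate loss) can be illustrated with a small Isotron-style sketch. This is not the paper's algorithm: it omits the Lipschitz constraint on the activation, uses synthetic Gaussian data with a ReLU link and mild label noise rather than adversarial labels, and every name and constant (`pav`, `fit_activation`, `surrogate_step`, the step size, the sample size) is a hypothetical choice made for illustration.

```python
import numpy as np

def pav(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to the sequence y."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge blocks while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

def fit_activation(w, X, y):
    """Values of the best-fit monotone activation at the projections X @ w.

    Assumption: only monotonicity is enforced; no Lipschitz constraint.
    """
    order = np.argsort(X @ w)
    fitted = np.empty_like(y, dtype=float)
    fitted[order] = pav(y[order])
    return fitted

def surrogate_step(w, X, y, lr):
    """One step along the surrogate gradient mean((u(w.x) - y) x), then renormalize."""
    residual = fit_activation(w, X, y) - y
    grad = X.T @ residual / len(y)
    w_new = w - lr * grad
    return w_new / np.linalg.norm(w_new)

# Synthetic data (assumed for illustration): Gaussian marginals, ReLU link, small noise.
rng = np.random.default_rng(0)
d, n = 5, 4000
w_star = np.zeros(d)
w_star[0] = 1.0
X = rng.standard_normal((n, d))
y = np.maximum(X @ w_star, 0.0) + 0.1 * rng.standard_normal(n)

w = np.ones(d) / np.sqrt(d)  # start partially aligned with w_star
start_alignment = float(w @ w_star)
for _ in range(50):
    w = surrogate_step(w, X, y, lr=1.0)
final_alignment = float(w @ w_star)
```

Comparing `start_alignment` with `final_alignment` shows how the inner product with the true direction changes over the iterations, which is the quantity the alignment-based analysis tracks.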
Abstract
We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model. We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss, that succeeds under a range of distributions (including log-concave distributions) and a broad class of monotone and Lipschitz link functions. This is the first efficient constant factor approximate agnostic learner, even for Gaussian data and for any nontrivial class of link functions. Prior work for the case of unknown link function either works in the realizable setting or does not attain constant factor approximation. The main technical ingredient enabling our algorithm and analysis is a novel notion of a local error bound in optimization that we term alignment sharpness and that may be of broader interest.
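For concreteness, the guarantee described above can be written out as follows. This is a sketch in assumed notation, not taken from the abstract: $\mathcal{D}$ denotes the joint distribution of $(x, y)$, $\mathcal{U}$ the class of monotone Lipschitz link functions, $\mathcal{W}$ the set of admissible weight vectors, $C$ an absolute constant, and $\epsilon$ the usual additive accuracy slack.

$$
\mathcal{L}_2(w; u) = \mathbb{E}_{(x, y) \sim \mathcal{D}}\bigl[(u(w \cdot x) - y)^2\bigr],
\qquad
\mathrm{OPT} = \min_{u \in \mathcal{U},\, w \in \mathcal{W}} \mathcal{L}_2(w; u),
$$

$$
\text{goal: output } (\hat{u}, \hat{w}) \text{ such that } \mathcal{L}_2(\hat{w}; \hat{u}) \le C \cdot \mathrm{OPT} + \epsilon .
$$

In the agnostic model no assumption is made on how the labels $y$ are generated, so $\mathrm{OPT}$ may be nonzero and the learner competes with the best-fitting pair $(u, w)$.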
