Learning single-index models via harmonic decomposition

Nirmit Joshi, Hugo Koubbi, Theodor Misiakiewicz, Nathan Srebro

TL;DR

This work introduces spherical single-index models to study learning under general spherically symmetric inputs, arguing that rotational symmetry makes spherical harmonics the natural basis and Gegenbauer expansions the right tool for analysis. It establishes lower and upper bounds that decouple across harmonic subspaces, and provides two complementary families of estimators: spectral methods for low degrees ($\ell=1,2$), and, for higher degrees ($\ell\ge3$), harmonic tensor unfolding, which attains optimal sample complexity, and online SGD, which attains optimal runtime. In the Gaussian setting, the theory recovers and clarifies prior results by showing that optimal learnability concentrates in the lowest-frequency subspaces, with the radial component of the input enabling runtime advantages. Overall, the symmetry-driven perspective unifies existing Gaussian SIM results, extends them to arbitrary spherical distributions, and highlights an inherent trade-off: estimators that are simultaneously sample-optimal and runtime-optimal may not exist in general.
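To make the role of Gegenbauer expansions concrete, the following is a minimal sketch, not taken from the paper, of how the coefficients of a link function could be computed numerically. It assumes the direction $\boldsymbol{z}$ is uniform on the unit sphere $\mathbb{S}^{d-1}$, so that $t = \langle \boldsymbol{w}_*, \boldsymbol{z}\rangle$ has density proportional to $(1-t^2)^{(d-3)/2}$ on $[-1,1]$, and it normalizes each Gegenbauer polynomial to unit $L^2$ norm under that law. The function names `gegenbauer` and `gegenbauer_coefficients`, the normalization, and the test link $t \mapsto t^2$ are illustrative assumptions; the paper's conventions may differ.

```python
import numpy as np


def gegenbauer(ell, alpha, t):
    """Evaluate the Gegenbauer polynomial C_ell^(alpha)(t) via the
    standard three-term recurrence
    n C_n = 2 t (n + alpha - 1) C_{n-1} - (n + 2 alpha - 2) C_{n-2}."""
    t = np.asarray(t, dtype=float)
    prev, curr = np.ones_like(t), 2.0 * alpha * t
    if ell == 0:
        return prev
    for n in range(2, ell + 1):
        nxt = (2.0 * t * (n + alpha - 1.0) * curr
               - (n + 2.0 * alpha - 2.0) * prev) / n
        prev, curr = curr, nxt
    return curr


def gegenbauer_coefficients(link, d, max_degree, n_nodes=200):
    """Coefficients of `link` in the normalized Gegenbauer basis for the
    law of <w, z>, z uniform on the sphere in R^d (assumed setting)."""
    alpha = (d - 2) / 2.0
    # Gauss-Legendre nodes on [-1, 1], reweighted by the projection density.
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    w = w * (1.0 - t ** 2) ** ((d - 3) / 2.0)
    w = w / w.sum()  # normalize to a probability measure
    coeffs = []
    for ell in range(max_degree + 1):
        q = gegenbauer(ell, alpha, t)
        q = q / np.sqrt(np.sum(w * q ** 2))  # unit L2 norm under the law of t
        coeffs.append(float(np.sum(w * link(t) * q)))
    return np.array(coeffs)


# Hypothetical example: a purely quadratic link in dimension d = 5.
d = 5
coeffs = gegenbauer_coefficients(lambda t: t ** 2, d, max_degree=4)
print(coeffs)
```

For this quadratic link, only the degree-0 and degree-2 coefficients are nonzero; the degree-0 coefficient equals the mean of $t^2$ under the projection law, which is $1/d$.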

Abstract

We study the problem of learning single-index models, where the label $y \in \mathbb{R}$ depends on the input $\boldsymbol{x} \in \mathbb{R}^d$ only through an unknown one-dimensional projection $\langle \boldsymbol{w}_*,\boldsymbol{x}\rangle$. Prior work has shown that under Gaussian inputs, the statistical and computational complexity of recovering $\boldsymbol{w}_*$ is governed by the Hermite expansion of the link function. In this paper, we propose a new perspective: we argue that spherical harmonics -- rather than Hermite polynomials -- provide the natural basis for this problem, as they capture its intrinsic rotational symmetry. Building on this insight, we characterize the complexity of learning single-index models under arbitrary spherically symmetric input distributions. We introduce two families of estimators -- based on tensor unfolding and online SGD -- that respectively achieve either optimal sample complexity or optimal runtime, and argue that estimators achieving both may not exist in general. When specialized to Gaussian inputs, our theory not only recovers and clarifies existing results but also reveals new phenomena that had previously been overlooked.
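As a concrete illustration of the single-index setup, here is a minimal sketch of the classical second-moment spectral estimator under Gaussian inputs. It is not the paper's algorithm: it assumes standard Gaussian $\boldsymbol{x}$, a hypothetical quadratic link $y = \langle \boldsymbol{w}_*, \boldsymbol{x}\rangle^2 + \text{noise}$, and uses the fact that in this case $\mathbb{E}[y\,\boldsymbol{x}\boldsymbol{x}^\top] - \mathbb{E}[y]\,\boldsymbol{I} = 2\,\boldsymbol{w}_*\boldsymbol{w}_*^\top$, so the leading eigenvector of the empirical version estimates $\boldsymbol{w}_*$ up to sign. The function name `spectral_direction`, the dimensions, and the noise level are illustrative choices.

```python
import numpy as np


def spectral_direction(X, y):
    """Leading eigenvector (by absolute eigenvalue) of the empirical matrix
    mean(y * x x^T) - mean(y) * I, used as an estimate of the index direction."""
    n, d = X.shape
    M = (X.T * y) @ X / n - np.mean(y) * np.eye(d)
    eigvals, eigvecs = np.linalg.eigh(M)  # M is symmetric
    top = np.argmax(np.abs(eigvals))
    return eigvecs[:, top]


# Synthetic data under the assumed model (all values are illustrative).
rng = np.random.default_rng(0)
d, n = 20, 20000
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)          # unit-norm ground-truth direction
X = rng.standard_normal((n, d))           # standard Gaussian inputs
y = (X @ w_star) ** 2 + 0.1 * rng.standard_normal(n)

w_hat = spectral_direction(X, y)
alignment = abs(w_hat @ w_star)           # sign of w_star is not identifiable
print(f"|<w_hat, w_star>| = {alignment:.3f}")
```

The absolute value in `alignment` reflects that an even link cannot distinguish $\boldsymbol{w}_*$ from $-\boldsymbol{w}_*$. A link whose lowest nonzero component has degree three or higher would leave this matrix uninformative in expectation, which is the regime where the abstract's tensor unfolding and online SGD estimators apply.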

Paper Structure

This paper contains 48 sections, 48 theorems, 395 equations, 2 tables, 4 algorithms.

Key Result

Theorem 1

Let $\{\nu_d\}_{d \geq 1}$ be a sequence of spherical SIMs with $\nu_d \in \mathfrak{L}_d$.

Theorems & Definitions (89)

  • Definition 1: Spherical link functions
  • Remark 2.1
  • Theorem 1: General lower bounds
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3: Weak to strong recovery
  • Theorem 2: Spectral algorithm
  • Theorem 3: Online SGD algorithm
  • Theorem 4: Harmonic tensor unfolding
  • Remark 3.4
  • ...and 79 more