Omnipredicting Single-Index Models with Multi-Index Models

Lunjia Hu, Kevin Tian, Chutong Yang

TL;DR

The paper addresses omniprediction for single-index models (SIMs) in the agnostic setting, seeking end-to-end, efficient, interpretable predictors. It recasts the convergence of Isotron through an omnigap framework and introduces Omnitron, a sample- and runtime-efficient learner that outputs a multi-index model with $\approx \varepsilon^{-2}$ prediction heads and achieves $\varepsilon$-omniprediction for families of monotone, Lipschitz link functions in nearly-linear time. In one dimension, the work uses the Pool-Adjacent-Violators (PAV) algorithm to obtain a simple, proper omnipredictor for all matching losses, with uniform convergence guarantees. It further develops a nearly-linear-time solver for bounded isotonic regression, proves generalization bounds via Rademacher complexity and chaining, and analyzes the (non)existence of proper omnipredictors, showing existence for constants but not, in general, for linear predictors. Overall, the results yield practical, interpretable omnipredictors for SIMs and move toward proper omniprediction with strong computational guarantees, enabling robust downstream evaluation across a family of losses.

Abstract

Recent work on supervised learning [GKR+22] defined the notion of omnipredictors, i.e., predictor functions $p$ over features that are simultaneously competitive for minimizing a family of loss functions $\mathcal{L}$ against a comparator class $\mathcal{C}$. Omniprediction requires approximating the Bayes-optimal predictor beyond the loss minimization paradigm, and has generated significant interest in the learning theory community. However, even for basic settings such as agnostically learning single-index models (SIMs), existing omnipredictor constructions require impractically-large sample complexities and runtimes, and output complex, highly-improper hypotheses. Our main contribution is a new, simple construction of omnipredictors for SIMs. We give a learner outputting an omnipredictor that is $\varepsilon$-competitive on any matching loss induced by a monotone, Lipschitz link function, when the comparator class is bounded linear predictors. Our algorithm requires $\approx \varepsilon^{-4}$ samples and runs in nearly-linear time, and its sample complexity improves to $\approx \varepsilon^{-2}$ if link functions are bi-Lipschitz. This significantly improves upon the only prior known construction, due to [HJKRR18, GHK+23], which used $\gtrsim \varepsilon^{-10}$ samples. We achieve our construction via a new, sharp analysis of the classical Isotron algorithm [KS09, KKKS11] in the challenging agnostic learning setting, of potential independent interest. Previously, Isotron was known to properly learn SIMs in the realizable setting, as well as constant-factor competitive hypotheses under the squared loss [ZWDD24]. As they are based on Isotron, our omnipredictors are multi-index models with $\approx \varepsilon^{-2}$ prediction heads, bringing us closer to the tantalizing goal of proper omniprediction for general loss families and comparators.
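To make the Isotron iteration mentioned in the abstract concrete, here is a minimal sketch of the classical update as it is usually presented for [KS09]: fit a monotone link to the current one-dimensional scores with Pool-Adjacent-Violators, then take a perceptron-style step on the residuals. It is an illustration under stated assumptions, not the paper's algorithm: the step size is fixed at 1, there is no projection of the weight vector, plain PAV stands in for the paper's bounded isotonic regression solver, and the post-processing that turns the iterates into a multi-index omnipredictor is not reproduced. The names `pav`, `fit_link`, and `isotron` are hypothetical.

```python
def pav(values, weights):
    """Weighted Pool-Adjacent-Violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_link(scores, ys):
    """Fit a monotone link on 1-D scores; tied scores share one prediction."""
    groups = {}
    for s, y in zip(scores, ys):
        total, count = groups.get(s, (0.0, 0))
        groups[s] = (total + y, count + 1)
    keys = sorted(groups)
    means = [groups[k][0] / groups[k][1] for k in keys]
    counts = [groups[k][1] for k in keys]
    link = dict(zip(keys, pav(means, counts)))
    return [link[s] for s in scores]


def isotron(xs, ys, num_iters):
    """Return the list of (weights, in-sample predictions), one per iteration."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * d
    iterates = []
    for _ in range(num_iters):
        # Project every example onto the current direction.
        scores = [sum(wj * xj for wj, xj in zip(w, x)) for x in xs]
        preds = fit_link(scores, ys)
        iterates.append((list(w), preds))
        # Perceptron-style step: move w along the average residual-weighted feature.
        residuals = [y - p for y, p in zip(ys, preds)]
        w = [w[j] + sum(r * x[j] for r, x in zip(residuals, xs)) / n
             for j in range(d)]
    return iterates
```

Calling `isotron(xs, ys, num_iters)` with `xs` a list of feature vectors and `ys` a list of labels in $[0, 1]$ returns every iterate rather than only the last one, since a multi-index model of the kind described above needs all the directions visited along the way.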

Paper Structure

This paper contains 51 sections, 56 theorems, 229 equations, 1 figure.

Key Result

Theorem 1

Algorithm alg:ideal_omnitron (the Isotron run for $T = O(\varepsilon^{-2})$ iterations with appropriate post-processing) returns an $\varepsilon$-omnipredictor for SIMs $p$ satisfying eq:omni_sim_intro, where $\mathcal{S}$ is all monotone links $\sigma: [-1, 1] \to [0, 1]$. Moreover, $p(\mathbf{x})$ …
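The equation referenced as eq:omni_sim_intro is not reproduced in this excerpt. For orientation only, the displays below give the standard definition of the matching loss induced by a monotone link $\sigma$ and the generic form of an $\varepsilon$-omniprediction guarantee in the sense of [GKR+22]. The symbols $\ell_\sigma$, $k_\sigma$, $\mathcal{D}$, and $\mathcal{C}$ are assumptions introduced here, and the paper's exact statement may differ in normalization or constraints.

$$
\ell_\sigma(t, y) = \int_0^t \big(\sigma(\tau) - y\big)\, d\tau,
\qquad
\frac{\partial}{\partial t}\, \ell_\sigma(t, y) = \sigma(t) - y .
$$

Since $\sigma$ is monotone, $\ell_\sigma(\cdot, y)$ is convex, and for $y \sim \mathrm{Bernoulli}(p)$ the expected loss $\mathbb{E}_y[\ell_\sigma(t, y)]$ has derivative $\sigma(t) - p$, so it is minimized at any $t$ with $\sigma(t) = p$. Writing $k_\sigma(p)$ for such a minimizer, a predictor $p$ is an $\varepsilon$-omnipredictor for the links in $\mathcal{S}$ against a comparator class $\mathcal{C}$ if

$$
\mathbb{E}_{(\mathbf{x}, y) \sim \mathcal{D}}\big[\ell_\sigma\big(k_\sigma(p(\mathbf{x})), y\big)\big]
\le \min_{c \in \mathcal{C}} \mathbb{E}_{(\mathbf{x}, y) \sim \mathcal{D}}\big[\ell_\sigma\big(c(\mathbf{x}), y\big)\big] + \varepsilon
\quad \text{for all } \sigma \in \mathcal{S}.
$$

In the setting of the abstract, $\mathcal{C}$ is the class of bounded linear predictors $\mathbf{x} \mapsto \mathbf{w} \cdot \mathbf{x}$.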

Figures (1)

  • Figure 1: Self-balancing rotation in an AVL tree

Theorems & Definitions (102)

  • Theorem 1: Informal, see Theorem \ref{thm:ideal_omni}
  • Corollary 1: Informal, see Corollary \ref{cor:omni_erm}
  • Theorem 2: Informal, see Theorem \ref{thm:fs}
  • Corollary 2: Informal, see Corollary \ref{cor:fs_anti_lip}
  • Theorem 3: Informal, see Theorem \ref{thm:pav-population}
  • Corollary 3: Informal, see Corollary \ref{cor:1d-sim}
  • Definition 1: Matching loss
  • Lemma 1
  • proof
  • Lemma 2
  • ...and 92 more