Agnostic Active Learning of Single Index Models with Linear Sample Complexity

Aarshvi Gajjar, Wai Ming Tai, Xingyu Xu, Chinmay Hegde, Yi Li, Christopher Musco

TL;DR

This work addresses agnostic active learning for single index models of the form $h(\boldsymbol{x})=f(\boldsymbol{x}^\top\boldsymbol{w})$ under adversarial noise. It introduces a leverage-score sampling framework that yields near-linear sample complexity in the ambient dimension, achieving $\tilde{O}(d)$ labeled samples when $f$ is known and Lipschitz, and extending to unknown Lipschitz $f$ with only a logarithmic factor in $n$. The key technical contributions are nonlinear subspace embeddings for Lipschitz nonlinearities, a distribution-aware discretization of $\mathrm{Lip}_L$, and the use of Dudley’s integral together with dual Sudakov minoration to obtain tight, distribution-free concentration bounds. These results significantly improve prior bounds, provide robust and scalable guarantees for PDE surrogate modeling and related scientific ML tasks, and open directions for computation-focused analyses and multi-index generalizations.

Abstract

We study active learning methods for single index models of the form $F({\mathbf x}) = f(\langle {\mathbf w}, {\mathbf x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\mathbf x,\mathbf w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning like surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent ${O}(d^{2})$ bound of \cite{gajjar2023active}. Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting when $f$ is unknown. Our results leverage tools from high dimensional probability, including Dudley's inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
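
To make the sampling step mentioned in the abstract concrete, the following is a minimal NumPy sketch of leverage score sampling for a data matrix $\bm{X}\in\mathbb{R}^{n\times d}$. It uses the standard definition of the leverage score of row $i$, namely $\bm{x}_i^\top(\bm{X}^\top\bm{X})^{+}\bm{x}_i$. The function names, the with-replacement sampling scheme, the importance weights, and the matrix sizes are illustrative assumptions, not necessarily the exact procedure analyzed in the paper.

```python
import numpy as np

def leverage_scores(X):
    """Leverage score of each row of X: x_i^T (X^T X)^+ x_i."""
    # Squared row norms of an orthonormal basis for the column space of X.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > 1e-12 * s.max()))  # drop numerically zero directions
    return np.sum(U[:, :rank] ** 2, axis=1)

def sample_rows(X, m, rng):
    """Draw m row indices with probability proportional to leverage score.

    Returns the indices and importance weights 1 / (m * p_i), so that
    weighted sums over the sample are unbiased estimates of full sums.
    """
    tau = leverage_scores(X)
    p = tau / tau.sum()
    idx = rng.choice(X.shape[0], size=m, replace=True, p=p)
    weights = 1.0 / (m * p[idx])
    return idx, weights

# Illustrative sizes (assumed): n = 200 points in d = 5 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
tau = leverage_scores(X)
idx, weights = sample_rows(X, m=40, rng=rng)
# Only the labels y[idx] would need to be queried in an active learning loop.
```

The scores sum to the rank of $\bm{X}$ (at most $d$), which is one intuition for why a number of samples scaling with $d$, rather than $n$, can suffice.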

Paper Structure

This paper contains 53 sections, 26 theorems, 164 equations.

Key Result

Theorem 1

Let $f$ be a fixed $L$-Lipschitz function, let $\bm{X}\in \mathbb{R}^{n\times d}$ be a data matrix, and let $\bm{w}^\star = \arg\min_{\bm{w}} \|f(\bm{X} {\bm{w}}) - \bm{y}\|_2^2$. There is an algorithm that, for any $\varepsilon \in (0,1)$, observes $\tilde{O}\left(d^2\cdot \frac{L^8}{\varepsilon^4}\right)$ [...] with high probability. Above, $f(\bm{X} {\bm{w}})$ denotes the entrywise application of $f$ to the [...]
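
The objective in the theorem statement can be sketched in a few lines of NumPy. The helper below evaluates $\|f(\bm{X}\bm{w}) - \bm{y}\|_2^2$ with $f$ applied entrywise and, optionally, an importance-weighted estimate of the same quantity that touches only a subset of the labels. The name `sim_loss`, the choice $f=\tanh$ (a 1-Lipschitz function), and the synthetic data are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def sim_loss(f, X, w, y, idx=None, weights=None):
    """Squared loss ||f(Xw) - y||_2^2 with f applied entrywise.

    If idx and weights are given, return the weighted estimate
    sum_j weights[j] * (f(<x_idx[j], w>) - y[idx[j]])^2, which reads
    only the labels y[idx].
    """
    if idx is None:
        r = f(X @ w) - y
        return float(r @ r)
    r = f(X[idx] @ w) - y[idx]
    return float(np.sum(weights * r ** 2))

# Synthetic example (assumed): noisy labels from a tanh single index model.
rng = np.random.default_rng(1)
n, d = 100, 4
X = rng.standard_normal((n, d))
w = rng.standard_normal(d)
y = np.tanh(X @ w) + 0.1 * rng.standard_normal(n)

full = sim_loss(np.tanh, X, w, y)
# Sanity check: using every row with unit weight recovers the full loss.
same = sim_loss(np.tanh, X, w, y, idx=np.arange(n), weights=np.ones(n))
```

In an active learning setting, `idx` and `weights` would come from a sampling scheme such as leverage score sampling, and the weighted loss would be minimized over $\bm{w}$ in place of the full loss.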

Theorems & Definitions (37)

  • Theorem 1: Theorem 1 from gajjar2023active
  • Theorem 2
  • Theorem 3
  • Definition 4: $\varepsilon$-accurate solution
  • Definition 5: Statistical leverage score
  • Lemma 6: Non-linear subspace embedding with fixed non-linearity
  • Lemma 7: Subspace embedding
  • proof
  • Lemma 8: Non-linear subspace embedding with unknown non-linearity
  • proof
  • ...and 27 more