A duality framework for analyzing random feature and two-layer neural networks

Hongrui Chen; Jihao Long; Lei Wu

TL;DR

The paper introduces a duality between approximation and estimation, built on information-based complexity, to analyze learning with random feature models and Barron spaces. It defines an $I$-complexity measure that tightly characterizes learning in noiseless settings, provides lower bounds comparable to Le Cam's in noisy settings, and also yields upper bounds through duality. By instantiating the framework for the $\mathcal{F}_{p,\pi}$ and Barron spaces, the authors derive sharp, spectrum-dependent results: (i) efficient learning of $\mathcal{F}_{p,\pi}$ for $p>1$ beyond the kernel regime, (ii) near-optimal $L^{\infty}$ learning of RKHS via kernel ridge regression, and (iii) a unifying view that bridges kernel-oriented and neural-network-oriented function classes. The methodology gives concrete rates for random feature approximation and information-based lower bounds that scale with the decay of the kernel eigenvalues.
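
For orientation, result (i) can be read against the usual definition of $\mathcal{F}_{p,\pi}$ in the random feature literature. The display below is a sketch under that assumption and is not quoted from the paper: the feature function $\phi$, the coefficient function $a$, and the restriction to $1 \leqslant p \leqslant 2$ are notation introduced here for illustration.

$$
\mathcal{F}_{p,\pi} = \Big\{ f_a = \int a(v)\,\phi(\cdot, v)\,\mathrm{d}\pi(v) \;:\; a \in L^p(\pi) \Big\},
\qquad
\|f\|_{\mathcal{F}_{p,\pi}} = \inf_{a \,:\, f_a = f} \|a\|_{L^p(\pi)} .
$$

$$
\text{Since } \pi \text{ is a probability measure, } \|a\|_{L^p(\pi)} \leqslant \|a\|_{L^2(\pi)} \text{ for } 1 \leqslant p \leqslant 2,
\quad\text{hence}\quad
\mathcal{F}_{2,\pi} \subseteq \mathcal{F}_{p,\pi} .
$$

Under this reading, $p=2$ corresponds to the RKHS case, and any guarantee for $p<2$ covers a function class at least as large as the RKHS.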

Abstract

We consider the problem of learning functions within the $\mathcal{F}_{p,\pi}$ and Barron spaces, which play crucial roles in understanding random feature models (RFMs), two-layer neural networks, as well as kernel methods. Leveraging tools from information-based complexity (IBC), we establish a dual equivalence between approximation and estimation, and then apply it to study the learning of the preceding function spaces. The duality allows us to focus on whichever of the two problems, approximation or estimation, is more tractable. To showcase the efficacy of our duality framework, we delve into two important but under-explored problems: 1) Random feature learning beyond the kernel regime: We derive sharp bounds for learning $\mathcal{F}_{p,\pi}$ using RFMs. Notably, the learning is efficient without the curse of dimensionality for $p>1$. This underscores the extended applicability of RFMs beyond the traditional kernel regime, since $\mathcal{F}_{p,\pi}$ with $p<2$ is strictly larger than the corresponding reproducing kernel Hilbert space (RKHS), where $p=2$. 2) The $L^\infty$ learning of RKHS: We establish sharp, spectrum-dependent characterizations for the convergence of the $L^\infty$ learning error in both noiseless and noisy settings. Surprisingly, we show that the popular kernel ridge regression can achieve near-optimal performance in $L^\infty$ learning, even though it primarily minimizes the square loss. To establish the aforementioned duality, we introduce a type of IBC, termed $I$-complexity, to measure the size of a function class. Notably, $I$-complexity offers a tight characterization of learning in noiseless settings, yields lower bounds comparable to Le Cam's in noisy settings, and is versatile in deriving upper bounds. We believe that our duality framework holds potential for broad application in learning analysis across more scenarios.
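
To make the distinction between square-loss fitting and $L^\infty$ error concrete, the following is a minimal sketch and not the authors' method or experiment. It fits a random feature model by ridge regression on noiseless samples and reports both the $L^2$ and the $L^\infty$ error on a test grid. The ReLU features, the one-dimensional target `sin(pi x)`, the sample sizes, and the regularization value are all assumptions chosen for illustration.

```python
import numpy as np


def random_features(x, w, b):
    """ReLU random features phi(x; w, b) = max(w * x + b, 0) for 1-D inputs."""
    return np.maximum(np.outer(x, w) + b, 0.0)


def fit_rfm_ridge(x_train, y_train, w, b, lam):
    """Ridge regression (square loss) on the outer coefficients of the model."""
    phi = random_features(x_train, w, b)           # shape (n, m)
    n, m = phi.shape
    gram = phi.T @ phi / n + lam * np.eye(m)       # regularized feature Gram matrix
    return np.linalg.solve(gram, phi.T @ y_train / n)


# Hypothetical setup: every value below is an illustrative assumption.
rng = np.random.default_rng(0)
n, m, lam = 200, 300, 1e-6


def target(x):
    return np.sin(np.pi * x)


x_train = rng.uniform(-1.0, 1.0, size=n)
y_train = target(x_train)                          # noiseless labels

w = rng.normal(size=m)                             # random inner weights
b = rng.uniform(-1.0, 1.0, size=m)                 # random biases
coef = fit_rfm_ridge(x_train, y_train, w, b, lam)

# Evaluate on a dense grid; the grid maximum only approximates the true sup norm.
x_test = np.linspace(-1.0, 1.0, 2001)
residual = random_features(x_test, w, b) @ coef - target(x_test)
l2_error = np.sqrt(np.mean(residual ** 2))
linf_error = np.max(np.abs(residual))

print(f"L2 error:   {l2_error:.3e}")
print(f"Linf error: {linf_error:.3e}")
```

The estimator only ever sees the square loss, yet the quantity of interest in problem 2) is the worst-case residual. On any grid the $L^\infty$ error is at least the $L^2$ error, so the gap between the two printed numbers is what a spectrum-dependent $L^\infty$ analysis has to control.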

Paper Structure (13 sections, 8 theorems, 37 equations)

Introduction
Our contributions
Related work
Organization
Preliminaries
Estimation errors and information-based complexity
Minimax errors in the noiseless setting: tight bound
Minimax errors in the noisy setting: lower bound
Deriving upper bounds
The $\mathcal{F}_{p,\pi}$ and Barron spaces
The dual equivalences for $\mathcal{F}_{p,\pi}$ and Barron spaces
The interpolation regime
The non-interpolation regime

Key Result

Theorem 3.3

Under the Assumption on metrics, it holds for any $\epsilon \geqslant 0$ that

Theorems & Definitions (18)

Definition 3.2: $I$-complexity
Theorem 3.3
Lemma 3.4
Proof: An intuitive proof of Lemma \ref{lem:dual_core}
Proof: Proof of Theorem \ref{thm:dual_abstract}
Proposition 3.5
Proposition 3.6
proof
Definition 4.1
Lemma 4.2
...and 8 more
