Learning smooth functions in high dimensions: from sparse polynomials to deep neural networks

Ben Adcock; Simone Brugiapaglia; Nick Dexter; Sebastian Moraga

TL;DR

The article surveys the problem of learning smooth, high-dimensional target functions from limited data, focusing on infinite-dimensional holomorphic function classes relevant to parametric PDEs and uncertainty quantification (UQ). It contrasts sparse polynomial methods with deep neural networks, deriving near-optimal learnability rates $\mathcal{O}\bigl((m/\log^4 m)^{1/2-1/p}\bigr)$ for $f\in\mathcal{H}(p,\mathsf{M})$ and developing a practical existence theory in which trained DNNs achieve similar performance by emulating polynomial approximants. A central theme is handling unknown anisotropy through weighted sparsity and anchored/lower index sets, which enables dimension-independent convergence without prior knowledge of $\bm{b}$. The practical existence framework then ties theory to practice by proposing architectures and training schemes that approximate the near-optimal rates while remaining robust to measurement and discretization errors, thereby narrowing the theory-practice gap in data-scarce, high-dimensional settings.

Abstract

Learning approximations to smooth target functions of many variables from finite sets of pointwise samples is an important task in scientific computing and its many applications in computational science and engineering. Despite well over half a century of research on high-dimensional approximation, this remains a challenging problem. Yet, significant advances have been made in the last decade towards efficient methods for doing this, commencing with so-called sparse polynomial approximation methods and continuing most recently with methods based on Deep Neural Networks (DNNs). In tandem, there have been substantial advances in the relevant approximation theory and analysis of these techniques. In this work, we survey this recent progress. We describe the contemporary motivations for this problem, which stem from parametric models and computational uncertainty quantification; the relevant function classes, namely, classes of infinite-dimensional, Banach-valued, holomorphic functions; fundamental limits of learnability from finite data for these classes; and finally, sparse polynomial and DNN methods for efficiently learning such functions from finite data. For the latter, there is currently a significant gap between the approximation theory of DNNs and the practical performance of deep learning. Aiming to narrow this gap, we develop the topic of practical existence theory, which asserts the existence of dimension-independent DNN architectures and training strategies that achieve provably near-optimal generalization errors in terms of the amount of training data.

Paper Structure (45 sections, 8 theorems, 117 equations, 1 figure)

Introduction
Motivations and challenges
Overview
Further literature
Problem statement and notation
Holomorphic functions of infinitely many variables
$(\bm{b},\varepsilon)$-holomorphic functions
Holomorphy and parametric DEs
Known and unknown anisotropy
$\ell^p$-summability and the $\mathcal{H}(p)$ and $\mathcal{H}(p,\mathsf{M})$ classes
Best $s$-term polynomial approximation
Orthogonal polynomials
Orthogonal polynomial expansions
Best $s$-term polynomial approximation
Rates of best $s$-term polynomial approximation
...and 30 more sections

Key Result

Theorem 4.1

Let $\bm{b} \in [0,\infty)^{\mathbb{N}}$ be such that $\bm{b} \in \ell^p(\mathbb{N})$ for some $0 < p < 1$. Then for any $s \in \mathbb{N}$ and $p \leq q \leq 2$, there exists a set $S \subset \mathcal{F}$ with $|S| \leq s$ such that for all $f \in \mathcal{H}(\bm{b})$ with coefficients $\bm{c}$ as in f-coeff.
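To make the notion of best $s$-term approximation concrete, the following Python sketch computes the $\ell^q$ error left after keeping the $s$ largest coefficients of a sequence and compares it with the classical Stechkin bound $s^{1/q-1/p}\,\|\bm{c}\|_p$. It illustrates the general idea only and is not the construction or bound of Theorem 4.1. The coefficient sequence $c_k = k^{-3}$, the choice $p = 1/2$, and the function names are assumptions made for this example.

```python
def best_s_term_error(coeffs, s, q=2.0):
    """l^q norm of what remains after keeping the s largest |coefficients|."""
    tail = sorted((abs(c) for c in coeffs), reverse=True)[s:]
    return sum(t ** q for t in tail) ** (1.0 / q)


def stechkin_bound(coeffs, s, p, q=2.0):
    """Stechkin's inequality: sigma_s(c)_q <= s^(1/q - 1/p) * ||c||_p, 0 < p <= q."""
    norm_p = sum(abs(c) ** p for c in coeffs) ** (1.0 / p)
    return norm_p * s ** (1.0 / q - 1.0 / p)


# Hypothetical coefficient sequence c_k = k^(-3), truncated to 2000 terms.
coeffs = [k ** -3.0 for k in range(1, 2001)]
p = 0.5  # assumed summability exponent; c_k = k^(-3) is l^p-summable for p > 1/3

for s in (1, 10, 100):
    print(s, best_s_term_error(coeffs, s), stechkin_bound(coeffs, s, p))
```

Smaller $p$ gives a faster algebraic decay $s^{1/q-1/p}$, which is the mechanism behind rates of the kind stated in terms of $\ell^p$-summability.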

Figures (1)

Figure 1: Best $s$-term approximation error in the $L^2_{\varrho}(\mathcal{U})$-norm for the function in f-numerics with $\delta_i = i^{3/2}$. This figure also shows the exponential rate "exp. rate", defined as $C_{\mathsf{exp}} \cdot \exp \left ( - \left ( s d! \prod^{d}_{i=1} \log(\rho_i) \right )^{1/d} \right )$, where $\rho_i$ is such that $(\rho_i+1/\rho_i)/2 = 1+\delta_i$, and the algebraic rate "alg. rate", defined as $C_{\mathsf{alg}} \cdot s^{-1}$. The constants $C_{\mathsf{exp}}$ and $C_{\mathsf{alg}}$ are chosen empirically to aid visualization.
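The two reference curves in the caption can be evaluated directly from their definitions. The Python sketch below does this under stated assumptions: the dimension `d` is a free parameter (the caption does not fix it), both constants default to 1 rather than being fitted empirically, and $\rho_i$ is taken as the root of $(\rho_i + 1/\rho_i)/2 = 1 + \delta_i$ that exceeds 1. The function names are hypothetical.

```python
import math


def bernstein_rho(delta):
    """Root rho > 1 of (rho + 1/rho) / 2 = 1 + delta."""
    a = 1.0 + delta
    return a + math.sqrt(a * a - 1.0)


def exp_rate(s, d, c_exp=1.0):
    """C_exp * exp(-(s * d! * prod_i log(rho_i))^(1/d)) with delta_i = i^(3/2)."""
    log_prod = 1.0
    for i in range(1, d + 1):
        log_prod *= math.log(bernstein_rho(i ** 1.5))
    return c_exp * math.exp(-((s * math.factorial(d) * log_prod) ** (1.0 / d)))


def alg_rate(s, c_alg=1.0):
    """C_alg * s^(-1)."""
    return c_alg / s


# Compare the two curves for an assumed dimension d = 4.
for s in (1, 10, 100, 1000):
    print(s, exp_rate(s, d=4), alg_rate(s))
```

Because the exponent grows only like $s^{1/d}$, the exponential curve flattens as `d` increases, which is why an algebraic rate can be the more informative reference in high dimensions.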

Theorems & Definitions (24)

remark 1: Other measures and domains
remark 2: Error metric
Definition 3.1: Holomorphy
Definition 3.2: Holomorphic extension
Definition 3.3: $(\bm{b},\varepsilon)$-holomorphic functions
remark 3: Functions of finitely many variables
remark 4
Theorem 4.1: Algebraic convergence of the best $s$-term approximation
remark 5: Sharpness of the algebraic rate
Definition 5.1: Adaptive $m$-width
...and 14 more
