Distributionally robust approximation property of neural networks

Mihriban Ceylan, David J. Prömel

TL;DR

The paper addresses distributional uncertainty in neural network function approximation. It proves universal approximation theorems in Orlicz spaces and a distributionally robust universal approximation theorem (UAT) over weakly compact families of finite Borel measures. Neural networks are shown to be dense in the Orlicz heart $M^\varphi(\mu)$, with density measured in the gauge norm $N_{\varphi,\mu}$, for architectures including networks with bounded activations, ReLU networks, networks with non-polynomial activations, and functional-input networks. The robust UAT states that, for a weakly compact set $\mathcal{M}$ with associated Young pair $(\varphi_{\mathcal{M}},\psi_{\mathcal{M}})$, a target $f$, and any $\varepsilon>0$, there exist networks $\eta$ such that $\sup_{\nu\in\mathcal{M}} \|f-\eta\|_{L^1(\nu)}<\varepsilon$ under various activation regimes. By linking Orlicz-space universality to distributional robustness, the work extends classical $L^p$ results to settings with distributional shift, ambiguity about the data distribution, and non-standard growth conditions, and it provides a rigorous basis for distributionally robust learning and related PDE contexts.
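
For orientation, the two objects named in this summary can be written out as follows. This is a sketch of the textbook definitions of the gauge (Luxemburg) norm and the Orlicz heart; it assumes the paper follows these standard conventions and is not quoted from it.

```latex
% Textbook definitions, assumed to match the paper's conventions.
% \varphi is an N-function, \mu a Borel measure on \mathbb{R}^{N_0}.
\[
  N_{\varphi,\mu}(f)
  := \inf\Big\{ k > 0 \;:\; \int_{\mathbb{R}^{N_0}} \varphi\Big(\frac{|f|}{k}\Big)\, d\mu \le 1 \Big\},
\]
\[
  M^\varphi(\mu)
  := \Big\{ f \text{ measurable} \;:\; \int_{\mathbb{R}^{N_0}} \varphi\big(k\,|f|\big)\, d\mu < \infty
     \text{ for all } k > 0 \Big\}.
\]
% Special case: for \varphi(x) = x^p with 1 < p < \infty,
\[
  N_{\varphi,\mu}(f) = \Big( \int_{\mathbb{R}^{N_0}} |f|^p \, d\mu \Big)^{1/p} = \|f\|_{L^p(\mu)},
  \qquad M^\varphi(\mu) = L^p(\mu).
\]
```

The special case shows in what sense density in an Orlicz heart generalizes density in $L^p(\mu)$.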

Abstract

The universal approximation property uniformly with respect to weakly compact families of measures is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby extending classical universal approximation theorems even beyond the traditional $L^p$-setting. The covered classes of neural networks include widely used architectures like feedforward neural networks with non-polynomial activation functions, deep narrow networks with ReLU activation functions and functional input neural networks.
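
To make the step beyond the $L^p$-setting concrete, here is a minimal numerical sketch, not taken from the paper. It assumes the standard Luxemburg definition of the gauge norm, $\inf\{k>0 : \int \varphi(|f|/k)\,d\mu \le 1\}$, and evaluates it by bisection for a discrete measure with finitely many atoms. The function name `gauge_norm` and its parameters are hypothetical and chosen for illustration only.

```python
import math


def gauge_norm(values, weights, phi, tol=1e-12):
    """Gauge norm of f under the discrete measure sum_i weights[i] * delta_{x_i}.

    values[i] is f(x_i); phi is assumed to be an N-function
    (convex, nondecreasing on [0, inf), phi(0) = 0, superlinear growth).
    """
    def modular(k):
        # Integral of phi(|f| / k) with respect to the discrete measure.
        return sum(w * phi(abs(v) / k) for v, w in zip(values, weights))

    # f vanishes almost everywhere: the norm is zero.
    if not any(w > 0 and v != 0 for v, w in zip(values, weights)):
        return 0.0

    # Bracket the norm: modular(lo) > 1 >= modular(hi).
    hi = 1.0
    while modular(hi) > 1.0:
        hi *= 2.0
    lo = hi / 2.0
    while modular(lo) <= 1.0:
        lo /= 2.0

    # The modular is nonincreasing in k, so bisection converges to the infimum.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


# phi(x) = x^2 recovers the L^2 norm: sqrt(3^2 + 4^2) = 5.
l2_value = gauge_norm([3.0, 4.0], [1.0, 1.0], lambda x: x ** 2)

# An N-function with exponential growth, which no L^p norm reproduces.
exp_value = gauge_norm([3.0, 4.0], [1.0, 1.0], lambda x: math.exp(x) - x - 1.0)
```

With $\varphi(x)=x^2$ the result agrees with the Euclidean norm up to the bisection tolerance, while the exponential choice illustrates a growth condition outside the $L^p$ scale.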

Paper Structure

This paper contains 10 sections, 11 theorems, 85 equations.

Key Result

Theorem 2.3

Let $\varphi$ be an $N$-function and $\mu$ be a locally finite Borel measure on $(\mathbb{R}^{N_0},\mathcal{B}(\mathbb{R}^{N_0}))$. Then, the set of neural networks is dense in the Orlicz heart $M^\varphi(\mu)$ with respect to the gauge norm $N_{\varphi,\mu}$ in the following cases:

Theorems & Definitions (27)

  • Example 2.1
  • Remark 2.2
  • Theorem 2.3: Universal Approximation Theorem in Orlicz Spaces
  • Proposition 2.4
  • proof
  • Remark 2.5
  • Proposition 2.6
  • proof
  • Remark 2.7
  • Proposition 2.8
  • ...and 17 more