Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations

Boris Hanin

TL;DR

The paper investigates how depth and bounded width interact in ReLU neural nets to achieve universal function approximation on [0,1]^d. It gives upper bounds on the minimal width, showing w_min(d) ≤ d+2 for general continuous functions and w_min^{conv}(d) ≤ d+1 for convex functions. It also gives constructive depth estimates: convex Lipschitz functions can be approximated with width d+1 and depth ~k+1, general continuous functions with width d+3 and depth tied to the modulus of continuity, and piecewise affine functions can be represented exactly with width d+3 (width d+1 if convex) and depth proportional to the number of affine pieces. The core techniques combine max-affine representations, convex decomposition, and constructive network architectures that translate piecewise affine structure into ReLU nets with quantitative depth bounds. Overall, the work clarifies what depth buys in the bounded-width regime and offers explicit architectures with provable approximation guarantees.

Abstract

This article concerns the expressive power of depth in neural nets with ReLU activations and bounded width. We are particularly interested in the following questions: what is the minimal width $w_{\text{min}}(d)$ so that ReLU nets of width $w_{\text{min}}(d)$ (and arbitrary depth) can approximate any continuous function on the unit cube $[0,1]^d$ arbitrarily well? For ReLU nets near this minimal width, what can one say about the depth necessary to approximate a given function? Our approach to these questions is based on the observation that, due to the convexity of the ReLU activation, ReLU nets are particularly well-suited for representing convex functions. In particular, we prove that ReLU nets with width $d+1$ can approximate any continuous convex function of $d$ variables arbitrarily well. These results then give quantitative depth estimates for the rate of approximation of any continuous scalar function on the $d$-dimensional cube $[0,1]^d$ by ReLU nets with width $d+3$.
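To make the width $d+1$ claim for convex functions concrete, the following is a minimal sketch, not the paper's exact construction. It evaluates a convex piecewise affine function $\max_j \ell_j(x)$ on $[0,1]^d$ using hidden layers of width $d+1$: $d$ neurons carry $x$ forward unchanged (ReLU is the identity on nonnegative inputs), and one neuron tracks the running maximum through the identity $\max(m, \ell) = \ell + \mathrm{ReLU}(m - \ell)$. The names `max_affine_relu_net`, `W`, and `b` are hypothetical and chosen only for illustration.

```python
import numpy as np


def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)


def max_affine_relu_net(x, W, b):
    """Evaluate max_j (W[j] @ x + b[j]) with hidden layers of width d + 1.

    Assumes x lies in [0, 1]^d, so ReLU leaves the carried copy of x intact.
    Returns the network output and the list of hidden-layer widths.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)

    def ell(j, v):
        # j-th affine piece evaluated at v
        return W[j] @ v + b[j]

    carried = x                      # input layer
    running_max = ell(0, carried)    # affine in the input
    widths = []
    for j in range(1, len(b)):
        # One hidden layer: d neurons copy x, and the last neuron stores
        # ReLU(running_max - ell_j(x)). The pre-activation is affine in the
        # previous layer's outputs, as a feed-forward net requires.
        hidden = relu(np.append(carried, running_max - ell(j, carried)))
        widths.append(hidden.size)
        carried, h = hidden[:-1], hidden[-1]
        # Affine readout: ell_j + ReLU(m - ell_j) = max(m, ell_j)
        running_max = h + ell(j, carried)
    return running_max, widths


# Illustrative usage with arbitrary (assumed) affine pieces in d = 2.
W_demo = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b_demo = np.array([0.0, 0.1, 1.0])
value, layer_widths = max_affine_relu_net([0.3, 0.7], W_demo, b_demo)
print(value, layer_widths)
```

In this sketch the number of hidden layers is one less than the number of affine pieces, which is in line with the summary's statement that depth scales with the number of pieces for exact representation of piecewise affine functions.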

Paper Structure

This paper contains 6 sections, 6 theorems, 33 equations.

Key Result

Theorem 1

Let $d\geq 1$ and $f:[0,1]^d\rightarrow {\mathbb R}_+$ be a positive function with $\left\lVert f\right\rVert_{C^0}=1$. We have the following three cases:

Theorems & Definitions (9)

  • Theorem 1
  • Theorem 2
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Proposition 5
  • Lemma 6
  • proof