Expressivity and Approximation Properties of Deep Neural Networks with ReLU$^k$ Activation

Juncai He; Tong Mao; Jinchao Xu

TL;DR

The paper studies the expressivity and approximation properties of deep neural networks with ReLU$^k$ activations. It gives a constructive framework for representing polynomials of degree up to $k^L$ with deep architectures and derives explicit bounds on network width, depth, and parameter count. A key technical tool is a lemma that expresses monomials as linear combinations of one-dimensional power terms with controlled coefficients; this yields exact polynomial representations and, in turn, suboptimal rates for approximating analytic and Sobolev functions. Building on this, the authors establish adaptive approximation properties for functions in the variation spaces $\mathcal{K}_1(\mathbb{P}_K^d)$: deep ReLU$^k$ networks can emulate shallow networks at the levels $K=k,k^2,\dots,k^L$ and achieve near-shallow rates, so depth allows adaptation to unknown regularity. The results underscore the hierarchical benefits of depth, provide concrete parameter bounds, and extend the theory of ReLU$^k$ networks to cover polynomial representation, analytic/Sobolev approximation, and adaptive approximation in variation spaces.

Abstract

In this paper, we investigate the expressivity and approximation properties of deep neural networks employing the ReLU$^k$ activation function for $k \geq 2$. Although deep ReLU networks can approximate polynomials effectively, deep ReLU$^k$ networks have the capability to represent higher-degree polynomials precisely. Our initial contribution is a comprehensive, constructive proof for polynomial representation using deep ReLU$^k$ networks. This allows us to establish an upper bound on both the size and count of network parameters. Consequently, we are able to demonstrate a suboptimal approximation rate for functions from Sobolev spaces as well as for analytic functions. Additionally, through an exploration of the representation power of deep ReLU$^k$ networks for shallow networks, we reveal that deep ReLU$^k$ networks can approximate functions from a range of variation spaces, extending beyond those generated solely by the ReLU$^k$ activation function. This finding demonstrates the adaptability of deep ReLU$^k$ networks in approximating functions within various variation spaces.
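
The claim that ReLU$^k$ networks represent polynomials exactly can be checked numerically. The sketch below is not the paper's construction; it relies only on the standard identity $x^k = \mathrm{ReLU}^k(x) + (-1)^k\,\mathrm{ReLU}^k(-x)$ and, for $k=2$, the polarization identity for products. The helper names (`relu_k`, `power_k`, `deep_power`, `product_relu2`) are hypothetical and chosen for illustration.

```python
import numpy as np

def relu_k(x, k):
    """ReLU^k activation: max(x, 0) ** k."""
    return np.maximum(x, 0.0) ** k

def power_k(x, k):
    """Exact x**k from two ReLU^k units.

    Uses x^k = ReLU^k(x) + (-1)^k * ReLU^k(-x), valid for every real x.
    """
    return relu_k(x, k) + (-1) ** k * relu_k(-x, k)

def deep_power(x, k, depth):
    """Compose `depth` layers of power_k, giving x ** (k ** depth)."""
    out = np.asarray(x, dtype=float)
    for _ in range(depth):
        out = power_k(out, k)
    return out

def product_relu2(x, y):
    """Exact x*y from ReLU^2 units: xy = ((x+y)^2 - (x-y)^2) / 4."""
    return (power_k(x + y, 2) - power_k(x - y, 2)) / 4.0

# Sample points on both sides of zero (illustrative choice).
xs = np.linspace(-1.5, 1.5, 7)
max_err_deg9 = np.max(np.abs(deep_power(xs, 3, 2) - xs ** 9))
max_err_prod = np.max(np.abs(product_relu2(xs, xs[::-1]) - xs * xs[::-1]))
```

With $k=3$ and two layers the composed map has degree $3^2 = 9$, which illustrates how degree $k^L$ arises from depth $L$; both error values should be at the level of floating-point round-off.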

Paper Structure (4 sections, 7 theorems, 69 equations)

Introduction
Representing Polynomials
Adaptive Approximation for Functions from Variation Spaces
Conclusions

Key Result

Lemma 1

Let $n,d\in\mathbb{N}^+$ and $\alpha=(\alpha_1,\dots,\alpha_d)\in\mathbb{N}^d$ with […]. Then the monomial $x^\alpha$ can be written as a linear combination […] with each coefficient $c_{n_2,\dots,n_d}\in\left[-\left(\frac{n}{2}+1\right)^{2d},\left(\frac{n}{2}+1\right)^{2d}\right]$.
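
As a small illustration of the kind of identity involved (these are standard polarization identities, not the lemma's own formula, which is not reproduced on this page), low-degree monomials in two variables can be written as linear combinations of powers of linear forms with small coefficients:

```latex
\begin{align*}
x_1 x_2 &= \tfrac{1}{4}\left[(x_1+x_2)^2 - (x_1-x_2)^2\right], \\
x_1^2 x_2 &= \tfrac{1}{6}\left[(x_1+x_2)^3 - (x_1-x_2)^3\right] - \tfrac{1}{3}\,x_2^3 .
\end{align*}
```

The second line follows from $(x_1+x_2)^3 - (x_1-x_2)^3 = 6x_1^2x_2 + 2x_2^3$. Each term on the right is a one-dimensional power of a linear form in $x$, which is the type of term a single ReLU$^k$ layer can produce exactly.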

Theorems & Definitions (20)

Definition 1
Definition 2: ReLU$^k$ networks
Lemma 1
proof
Remark 1
Lemma 2
proof
Theorem 1
proof
Remark 2
...and 10 more
