Higher Order Approximation Rates for ReLU CNNs in Korobov Spaces

Yuwen Li; Guozhi Zhang

TL;DR

The paper addresses high-order approximation of Korobov functions in $K_p^{m+1}(\Omega)$ using deep ReLU CNNs, demonstrating that higher-order smoothness yields improved $N^{-m-1}$ convergence (up to a logarithmic factor) with depth $L$ growing only mildly with dimension. It combines sparse-grid interpolation with CNN realizations by representing high-order sparse-grid basis functions through CNN-implementable products, using an approximate multiplication network $\widetilde{\times}_{M,U}$ and a square-approximation via $R_U$. The main result provides an explicit depth bound $L \le C_s d^4 m^3 N \log_2 N$ and an $L_p$-error bound $\|f-f_L\|_{L_p(\Omega)} \le C_{m,d} \|D^{\bm{m}+\bm{1}} f\|_{L_p(\Omega)} N^{-m-1} (\log_2 N)^{(m+2)(d-1)}$, indicating reduced sensitivity to dimensionality compared to Sobolev-based rates. The work suggests that higher-order expressivity of CNNs can be achieved without incurring severe curse-of-dimensionality penalties, motivating further exploration of bit-extraction techniques and related improvements in CNN approximations.
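The square-approximation and approximate-multiplication idea mentioned above can be illustrated with a small numerical sketch. It assumes the standard sawtooth construction, in which $T_k$ is the $k$-fold composition of a ReLU hat function and $R_U(x) = x - \sum_{k=1}^{U} 4^{-k} T_k(x)$ is the piecewise linear interpolant of $x^2$ on a uniform grid of width $2^{-U}$. The paper's exact definitions of $R_U$ and $\widetilde{\times}_{M,U}$ are not reproduced in this summary, so the function names below are hypothetical, inputs are restricted to $[0,1]$, and the range parameter $M$ is omitted.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # Tent function on [0, 1] built from three ReLUs: hat(0) = hat(1) = 0, hat(1/2) = 1.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def square_approx(x, U):
    # Assumed form: R_U(x) = x - sum_{k=1}^U T_k(x) / 4^k, with T_k = hat composed k times.
    t = np.asarray(x, dtype=float)
    out = t.copy()
    for k in range(1, U + 1):
        t = hat(t)                 # t now holds T_k(x)
        out = out - t / 4.0 ** k
    return out

def approx_mul(x, y, U):
    # Polarization identity xy = 2((x+y)/2)^2 - x^2/2 - y^2/2 for x, y in [0, 1],
    # with each square replaced by its piecewise linear approximation.
    return (2.0 * square_approx((x + y) / 2.0, U)
            - 0.5 * square_approx(x, U)
            - 0.5 * square_approx(y, U))

xs = np.linspace(0.0, 1.0, 1001)
U = 5
square_err = np.max(np.abs(square_approx(xs, U) - xs ** 2))
X, Y = np.meshgrid(xs[::10], xs[::10])
mul_err = np.max(np.abs(approx_mul(X, Y, U) - X * Y))
print(square_err, mul_err)
```

Under these assumptions the square error is at most $4^{-(U+1)}$ and the product error at most $2 \cdot 4^{-(U+1)}$, so each additional composition of the hat function reduces the error by a factor of four.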

Abstract

This paper investigates the $L_p$ approximation error for higher-order Korobov functions using deep convolutional neural networks (CNNs) with ReLU activation. For target functions having a mixed derivative of order $m+1$ in each direction, we improve the classical second-order approximation rate to order $m+1$ (modulo a logarithmic factor) in terms of the depth of the CNNs. The key ingredient in our analysis is an approximate representation of high-order sparse grid basis functions by CNNs. The results suggest that the higher-order expressivity of CNNs does not suffer severely from the curse of dimensionality.

Paper Structure (11 sections, 12 theorems, 92 equations, 5 figures)

Introduction
Preliminaries and notation
Interpolation on Sparse Grids
Representing Shallow Networks by Deep CNNs
Approximating Polynomials by CNNs
Analysis of Approximation Error Bounds of CNNs
Concluding Remarks
Proof of Lemma \ref{Le:coeffient_p_bound}
Proof of Lemma \ref{lemma:interpolationerror}
Proof of Lemma \ref{lemma:approxtimes}
Proof of Lemma \ref{lemma:squareapproximation}

Key Result

Theorem 1.3

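Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb R^d$ and let $1\leq p \leq \infty$. Then for sufficiently large $N$ there exists an $L\leq C_s d^4 m^3 N (\log_2 N)$ such that

$$\|f-f_L\|_{L_p(\Omega)} \le C_{m,d} \|D^{\bm{m}+\bm{1}} f\|_{L_p(\Omega)} N^{-m-1} (\log_2 N)^{(m+2)(d-1)},$$

where $\bm{m}+\bm{1}=(m+1,m+1,\ldots,m+1)\in\mathbb{N}_+^d$. Here $f \in K_p^{m+1}(\Omega)$ is the target function and $f_L$ denotes the approximating ReLU CNN of depth $L$.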
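To get a feel for why the theorem requires sufficiently large $N$, the following sketch evaluates the $N$-dependent parts of the depth and error bounds stated in the TL;DR. The constants $C_s$ and $C_{m,d}$ and the norm of $D^{\bm{m}+\bm{1}} f$ are not given in this summary, so they are dropped here; the function names are hypothetical and the values are only proportional to the actual bounds.

```python
import math

def rate_factor(N, m, d):
    # N-dependent part of the error bound: N^{-(m+1)} * (log2 N)^{(m+2)(d-1)}.
    return N ** (-(m + 1)) * math.log2(N) ** ((m + 2) * (d - 1))

def depth_factor(N, m, d):
    # Depth bound without the constant C_s: d^4 * m^3 * N * log2 N.
    return d ** 4 * m ** 3 * N * math.log2(N)

# For m = 1, d = 3 the logarithmic factor dominates at small N
# and the algebraic decay N^{-2} only takes over for larger N.
for N in (16, 2 ** 10, 2 ** 20):
    print(N, rate_factor(N, m=1, d=3), depth_factor(N, m=1, d=3))
```

For $d = 1$ the logarithmic exponent $(m+2)(d-1)$ vanishes and the factor reduces to $N^{-m-1}$; for $d > 1$ the logarithm delays the onset of the asymptotic rate.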

Figures (5)

Figure 1: Hierarchical ancestors of $x_{l,i} = 0.1875$ ($d=1$, $l=4$, $i=3$, $\alpha=3$) including its two neighbor points $x_{3,1}=0.125$, $x_{2,1}=0.25$ and another ancestor $x_{1,1}=0.5$.
Figure 2: (a) Graphs of $T_1$, $T_2$, $T_3$; (b) interpolants $R_1$, $R_2$ of $x^2$.
Figure 3: An illustration of the proof of Lemma \ref{lemma:elimzeros}, $l=k=n=2$.
Figure 4: (a) $d=1$, $l=4$, $i=3$, the basis function $\phi_{l,i}^2$ at grid point $x_{l,i} = 0.1875$; (b) factors $\rho_{l,i,1}$, $\rho_{l,i,2}$ of $\phi_{l,i}^2$.
Figure 5: (a) $d=1$, $l=4$, $i=3$, the basis function $\phi_{l,i}^3$ at grid point $x_{l,i} = 0.1875$; (b) factors $\rho_{l,i,1}$, $\rho_{l,i,2}$, $\rho_{l,i,3}$ of $\phi_{l,i}^3$.
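The hierarchical ancestors in Figure 1 can be reproduced with a short sketch. It assumes one-dimensional dyadic grid points $x_{l,i} = i\,2^{-l}$ with odd index $i$, and takes the ancestor at each coarser level $k < l$ to be the odd-indexed point whose hat-function support $((j-1)2^{-k}, (j+1)2^{-k})$ contains $x_{l,i}$. The paper's formal definition and the role of $\alpha$ are not given in this summary, so the helper names are hypothetical.

```python
from fractions import Fraction

def grid_point(l, i):
    # Dyadic grid point x_{l,i} = i * 2^{-l}, kept exact with rationals.
    return Fraction(i, 2 ** l)

def hierarchical_ancestors(l, i):
    # Returns (level, index) pairs from level l-1 down to level 1.
    if i % 2 == 0 or not 0 < i < 2 ** l:
        raise ValueError("expected an odd index with 0 < i < 2**l")
    ancestors = []
    for k in range(l - 1, 0, -1):
        # Supports of odd-indexed points at level k tile (0, 1) in cells of
        # width 2^{1-k}; the shift picks the cell that contains x_{l,i}.
        j = 2 * (i >> (l - k + 1)) + 1
        ancestors.append((k, j))
    return ancestors

for k, j in hierarchical_ancestors(4, 3):
    print(k, j, float(grid_point(k, j)))
```

For $l=4$, $i=3$ this yields the level-3, level-2, and level-1 points $0.125$, $0.25$, and $0.5$, matching the neighbor points and the additional ancestor named in the caption of Figure 1.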

Theorems & Definitions (22)

Definition 1.1
Definition 1.2: Korobov space
Theorem 1.3
Lemma 2.1
Lemma 2.2
Lemma 2.3
Lemma 2.4
Lemma 2.5
Lemma 3.1
Lemma 3.2
...and 12 more
