Higher Order Approximation Rates for ReLU CNNs in Korobov Spaces

Yuwen Li, Guozhi Zhang

TL;DR

The paper addresses high-order approximation of Korobov functions in $K_p^{m+1}(\Omega)$ using deep ReLU CNNs, demonstrating that higher-order smoothness yields improved $N^{-m-1}$ convergence (up to a logarithmic factor) with depth $L$ growing only mildly with dimension. It combines sparse-grid interpolation with CNN realizations by representing high-order sparse-grid basis functions through CNN-implementable products, using an approximate multiplication network $\widetilde{\times}_{M,U}$ and a square-approximation via $R_U$. The main result provides an explicit depth bound $L \le C_s d^4 m^3 N \log_2 N$ and an $L_p$-error bound $\|f-f_L\|_{L_p(\Omega)} \le C_{m,d} \|D^{\bm{m}+\bm{1}} f\|_{L_p(\Omega)} N^{-m-1} (\log_2 N)^{(m+2)(d-1)}$, indicating reduced sensitivity to dimensionality compared to Sobolev-based rates. The work suggests that higher-order expressivity of CNNs can be achieved without incurring severe curse-of-dimensionality penalties, motivating further exploration of bit-extraction techniques and related improvements in CNN approximations.
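The square approximation and the product network mentioned above can be illustrated with a small sketch. It assumes the familiar Yarotsky-style construction: $T_1$ is a ReLU hat function on $[0,1]$, $T_k$ is its $k$-fold composition, and $R_U(x) = x - \sum_{k=1}^{U} 4^{-k} T_k(x)$ is the piecewise linear interpolant of $x^2$ on a uniform grid with $2^U$ cells. The product is then recovered by polarization for inputs in $[-M, M]$. The function names (`hat`, `square_approx`, `approx_mul`) and the exact scaling are illustrative assumptions and may differ from the paper's definition of $\widetilde{\times}_{M,U}$.

```python
def relu(x: float) -> float:
    return max(x, 0.0)


def hat(x: float) -> float:
    # Assumed form of T_1 on [0, 1]: rises from 0 to 1 on [0, 1/2],
    # then falls back to 0 on [1/2, 1]; built from three ReLU units.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def square_approx(x: float, U: int) -> float:
    # R_U(x) = x - sum_{k=1}^{U} T_k(x) / 4^k for x in [0, 1],
    # where T_k is T_1 composed with itself k times.
    out, t = x, x
    for k in range(1, U + 1):
        t = hat(t)
        out -= t / 4.0**k
    return out


def approx_mul(x: float, y: float, M: float, U: int) -> float:
    # Polarization identity for |x|, |y| <= M (an assumed normalization):
    # xy = 2 M^2 * [ (|x+y|/(2M))^2 - (|x|/(2M))^2 - (|y|/(2M))^2 ],
    # with each square replaced by R_U and |z| = ReLU(z) + ReLU(-z).
    def sq(z: float) -> float:
        return square_approx((relu(z) + relu(-z)) / (2.0 * M), U)

    return 2.0 * M**2 * (sq(x + y) - sq(x) - sq(y))
```

Under these assumptions, $|R_U(x) - x^2| \le 2^{-2U-2}$ on $[0,1]$, so each extra composition of the hat function cuts the squaring error by a factor of four. This exponential gain in depth is what makes products of basis-function factors cheap to realize.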

Abstract

This paper investigates the $L_p$ approximation error for higher order Korobov functions using deep convolutional neural networks (CNNs) with ReLU activation. For target functions having a mixed derivative of order $m+1$ in each direction, we improve the classical second-order approximation rate to order $m+1$ (modulo a logarithmic factor) in terms of the depth of the CNNs. The key ingredient in our analysis is an approximate representation of high-order sparse grid basis functions by CNNs. The results suggest that the higher order expressivity of CNNs does not severely suffer from the curse of dimensionality.

Paper Structure

This paper contains 11 sections, 12 theorems, 92 equations, and 5 figures.

Key Result

Theorem 1.3

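Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb R^d$ and let $1\leq p \leq \infty$. Then for sufficiently large $N$ there exists an $L\leq C_s d^4 m^3 N \log_2 N$ such that

$$\|f-f_L\|_{L_p(\Omega)} \le C_{m,d}\, \|D^{\bm{m}+\bm{1}} f\|_{L_p(\Omega)}\, N^{-m-1} (\log_2 N)^{(m+2)(d-1)},$$

where $f \in K_p^{m+1}(\Omega)$ is the target function, $f_L$ is the approximating ReLU CNN of depth $L$, and $\bm{m}+\bm{1}=(m+1,m+1,\ldots,m+1)\in\mathbb{N}_+^d$.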
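To get a feel for the depth bound in Theorem 1.3 and the error rate $N^{-m-1}(\log_2 N)^{(m+2)(d-1)}$ quoted in the TL;DR, the following sketch evaluates both expressions with the unknown constants $C_s$ and $C_{m,d}$ omitted. The function names are hypothetical, and the returned values are only the $N$-, $m$-, and $d$-dependent factors, not actual error or depth values.

```python
import math


def rate_factor(N: int, m: int, d: int) -> float:
    # N^{-(m+1)} * (log2 N)^{(m+2)(d-1)}; the constant C_{m,d} and the
    # norm of the mixed derivative are deliberately left out.
    return N ** (-(m + 1)) * math.log2(N) ** ((m + 2) * (d - 1))


def depth_factor(N: int, m: int, d: int) -> float:
    # d^4 * m^3 * N * log2 N; the constant C_s is deliberately left out.
    return d**4 * m**3 * N * math.log2(N)


def decay_threshold(m: int, d: int) -> float:
    # N^{-a} (log2 N)^b with a = m+1, b = (m+2)(d-1) is decreasing in N
    # exactly when ln N > b / a, i.e. for N > exp(b / a).
    return math.exp((m + 2) * (d - 1) / (m + 1))
```

For example, `decay_threshold(1, 3)` equals $e^3$, so with $m=1$ and $d=3$ the factor only starts to decrease once $N$ exceeds roughly 20. This is one elementary reason a statement of this kind is naturally phrased for sufficiently large $N$, although the paper's own largeness condition may be different.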

Figures (5)

  • Figure 1: Hierarchical ancestors of $x_{l,i} = 0.1875$ ($d=1$, $l=4$, $i=3$, $\alpha=3$) including its two neighbor points $x_{3,1}=0.125$, $x_{2,1}=0.25$ and another ancestor $x_{1,1}=0.5$.
  • Figure 2: (a) Graphs of $T_1$, $T_2$, $T_3$; (b) interpolants $R_1$, $R_2$ of $x^2$.
  • Figure 3: An illustration of the proof of Lemma \ref{lemma:elimzeros}, $l=k=n=2$.
  • Figure 4: (a) $d=1$, $l=4$, $i=3$, the basis function $\phi_{l,i}^2$ at grid point $x_{l,i} = 0.1875$; (b) factors $\rho_{l,i,1}$, $\rho_{l,i,2}$ of $\phi_{l,i}^2$.
  • Figure 5: (a) $d=1$, $l=4$, $i=3$, the basis function $\phi_{l,i}^3$ at grid point $x_{l,i} = 0.1875$; (b) factors $\rho_{l,i,1}$, $\rho_{l,i,2}$, $\rho_{l,i,3}$ of $\phi_{l,i}^3$.
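The grid points named in the caption of Figure 1 can be reproduced with a short sketch. It assumes the usual dyadic convention $x_{l,i} = i\,2^{-l}$ with odd $i$, and the standard binary-tree parent rule in which the parent of $(l,i)$ is $(l-1, j)$ with $j$ the odd one of $(i-1)/2$ and $(i+1)/2$. Both conventions and the helper names are assumptions, chosen because they match the values listed in the caption.

```python
def node(l: int, i: int) -> float:
    # Dyadic grid point x_{l,i} = i * 2^{-l} (exact in floating point).
    return i / 2**l


def neighbors(l: int, i: int) -> tuple[float, float]:
    # Endpoints of the support of the level-l hat function centered at x_{l,i}.
    return node(l, i - 1), node(l, i + 1)


def ancestors(l: int, i: int) -> list[tuple[int, int]]:
    # Walk up the dyadic tree from (l, i) to the root (1, 1).
    # For odd i, (i-1)/2 and (i+1)/2 are consecutive integers,
    # so exactly one of them is odd; that one indexes the parent.
    chain = []
    while l > 1:
        lo, hi = (i - 1) // 2, (i + 1) // 2
        l, i = l - 1, (lo if lo % 2 == 1 else hi)
        chain.append((l, i))
    return chain
```

With $l=4$ and $i=3$, `node(4, 3)` gives $0.1875$, `neighbors(4, 3)` gives the two neighbor points $0.125$ and $0.25$, and `ancestors(4, 3)` yields the index pairs $(3,1)$, $(2,1)$, $(1,1)$, that is, the points $0.125$, $0.25$, and $0.5$ from the caption.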

Theorems & Definitions (22)

  • Definition 1.1
  • Definition 1.2: Korobov space
  • Theorem 1.3
  • Lemma 2.1
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Lemma 2.5
  • Lemma 3.1
  • Lemma 3.2
  • ...and 12 more