ChebNet: Efficient and Stable Constructions of Deep Neural Networks with Rectified Power Units via Chebyshev Approximations

Shanshan Tang, Bo Li, Haijun Yu

TL;DR

The paper addresses the instability of RePU networks constructed from power-series representations. It introduces ChebNets, which are built from Chebyshev polynomial approximations and give stable, spectrally accurate function representations with optimal network sizes. The authors develop univariate and multivariate ChebNets, including constructions on sparse downward-closed polynomial spaces, and show theoretically that ChebNets are stable and approximate smooth functions at least as well as PowerNets (RePU networks built from power series). Numerical experiments show that ChebNets train more robustly and gain substantial accuracy after fine-tuning, which makes them a practical option for high-accuracy approximation of smooth functions.
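The stability concern with power series can be illustrated by writing one polynomial in two bases and comparing coefficient sizes. The following is a minimal sketch, not the paper's experiment: the test function `gauss` (here $e^{-16x^2}$), the degree 15, and the helper name `compare_coefficients` are assumptions chosen for illustration, and the paper's exact Gauss function is not given in this summary.

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P


def gauss(x):
    # Hypothetical test function; the paper's own "Gauss function" may differ.
    return np.exp(-16.0 * x**2)


def compare_coefficients(func, degree):
    """Return (Chebyshev coeffs, monomial coeffs) of the same interpolant on [-1, 1]."""
    cheb = C.chebinterpolate(func, degree)  # interpolate at Chebyshev points
    mono = C.cheb2poly(cheb)                # same polynomial in the basis 1, x, x^2, ...
    return cheb, mono


cheb, mono = compare_coefficients(gauss, 15)
print("largest |Chebyshev coefficient|:", np.max(np.abs(cheb)))
print("largest |monomial coefficient| :", np.max(np.abs(mono)))

# Both coefficient vectors describe the same polynomial, so they agree pointwise.
x = np.linspace(-1.0, 1.0, 201)
gap = np.max(np.abs(C.chebval(x, cheb) - P.polyval(x, mono)))
print("max difference between the two evaluations:", gap)
```

For a bounded function the Chebyshev interpolation coefficients stay bounded by twice the maximum of the function, while the monomial coefficients of the same polynomial can be orders of magnitude larger and alternate in sign. Networks initialized from such coefficients rely on cancellation between large terms, which is one plausible reading of the instability the summary refers to.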

Abstract

In a previous study [B. Li, S. Tang and H. Yu, Commun. Comput. Phys. 27(2):379-411, 2020], it was shown that deep neural networks built with rectified power units (RePU) as activation functions can approximate sufficiently smooth functions better than those built with rectified linear units, by converting polynomial approximations using power series into deep neural networks with optimal complexity and no approximation error. In practice, however, power series approximations are not easy to obtain because of the associated stability issue. In this paper, we propose a new and more stable way to construct RePU deep neural networks based on Chebyshev polynomial approximations. By using a hierarchical structure of Chebyshev polynomial approximation in the frequency domain, we obtain an efficient and stable deep neural network construction, which we call ChebNet. The approximation of smooth functions by ChebNets is no worse than the approximation by deep RePU nets using power series. At the same time, ChebNets are much more stable. Numerical results show that the constructed ChebNets can be further fine-tuned to obtain much better results than those obtained by tuning deep RePU nets constructed by the power series approach. As spectral accuracy is hard to obtain by direct training of deep neural networks, ChebNets provide a practical way to obtain spectral accuracy and are expected to be useful in real applications that require efficient approximations of smooth functions.
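To see why a RePU network can represent a polynomial with no approximation error, note that $\sigma_2(x)=\max(0,x)^2$ satisfies $x^2=\sigma_2(x)+\sigma_2(-x)$ and $xy=\tfrac14\big((x+y)^2-(x-y)^2\big)$, so squares and products are exact. Combined with the standard identities $T_{2k}=2T_k^2-1$ and $T_{2k+1}=2T_kT_{k+1}-x$, a Chebyshev polynomial $T_n$ can be reached in about $\log_2 n$ levels. The sketch below illustrates that idea only; the function names are hypothetical, and it is not the paper's hierarchical ChebNet construction, which this summary does not reproduce.

```python
import numpy as np


def repu2(x):
    """Rectified power unit of order 2: sigma_2(x) = max(0, x)^2."""
    return np.maximum(0.0, x) ** 2


def square(x):
    # x^2 = sigma_2(x) + sigma_2(-x), exact for every real x
    return repu2(x) + repu2(-x)


def product(x, y):
    # xy = ((x + y)^2 - (x - y)^2) / 4, each square realised by sigma_2 units
    return 0.25 * (square(x + y) - square(x - y))


def chebyshev_T(n, x):
    """Evaluate T_n(x) using only sigma_2-based squares and products."""
    x = np.asarray(x, dtype=float)

    def pair(k):
        # Returns (T_k(x), T_{k+1}(x)) by index doubling.
        if k == 0:
            return np.ones_like(x), x
        t, t1 = pair(k // 2)                  # (T_j, T_{j+1}) with j = k // 2
        t_even = 2.0 * square(t) - 1.0        # T_{2j}
        t_odd = 2.0 * product(t, t1) - x      # T_{2j+1}
        if k % 2 == 0:
            return t_even, t_odd              # (T_{2j}, T_{2j+1})
        t_even_next = 2.0 * square(t1) - 1.0  # T_{2j+2}
        return t_odd, t_even_next             # (T_{2j+1}, T_{2j+2})

    return pair(n)[0]


# Compare against the closed form T_n(x) = cos(n arccos x) on [-1, 1].
xs = np.linspace(-1.0, 1.0, 101)
max_err = max(
    np.max(np.abs(chebyshev_T(n, xs) - np.cos(n * np.arccos(xs))))
    for n in range(16)
)
print("max deviation from cos(n arccos x) for n < 16:", max_err)
```

Each call to `pair` halves the index, so the recursion depth grows logarithmically in `n`. This matches the general observation that depth, rather than width, is what lets a $\sigma_2$ network reach high polynomial degree cheaply.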

Paper Structure

This paper contains 12 sections, 12 theorems, 35 equations, 6 figures, 3 tables.

Key Result

Theorem 1

For any $u\!\in\! {B^{m}_{\bm{\alpha},\bm{\beta}}}(I^d)$, with $\vert u \vert_{B^m_{\bm{\alpha},\bm{\beta}}(I^d)} \le 1$, $\bm{\alpha,\beta}\!\in\!(-1,\infty)^{d}$, and any $\varepsilon\in(0,1)$ there exists a $\sigma_2$ neural network $\Phi_\varepsilon^u$ having $\mathcal{O}\left(\frac{d}{m} \log_2

Figures (6)

  • Figure 1: Training results of PowerNet (left) and ChebNet (right) constructed to approximate Gauss function with ${n}=15$. $L^2$ norm is used for test error.
  • Figure 2: Training results of PowerNet (left) and ChebNet (right) constructed to approximate function $f_2$ defined in \ref{eq:Cauchy_fun} with ${n}=11$.
  • Figure 3: The coefficients of Legendre expansion: $a_j, j=0,\ldots,{n}$ (Left) and power series expansion: $\tilde{a}_j, j=0,\ldots,{n}$ (Right) for Gauss function with ${n} =15$.
  • Figure 4: The coefficients of Chebyshev expansion: $c_j, j=0,\ldots,{n}$ (Left) and coefficients of hierarchical Chebyshev expansion: $\tilde{c}_j, j=0,\ldots,{n}$ (Right) for Gauss function with ${n} =15$.
  • Figure 5: The coefficients of Legendre expansion: $a_j, j=0,\ldots,{n}$ (Left) and coefficients of power series expansion: $\tilde{a}_j, j=0,\ldots,{n}$ (Right) for function $f_2$ defined in \ref{eq:Cauchy_fun} with ${n} =30$.
  • ...and 1 more figure

Theorems & Definitions (28)

  • Theorem 1
  • Lemma 1
  • Lemma 2: Lemma 1 in li_better_2019
  • Remark 1
  • Theorem 2
  • proof
  • Remark 2
  • Theorem 3
  • proof
  • Theorem 4
  • ...and 18 more