Random Projection Neural Networks of Best Approximation: Convergence theory and practical applications

Gianluca Fabiani

TL;DR

The paper investigates Random Projection Neural Networks (RPNNs) with fixed internal weights and biases, focusing on best $L^p$ approximation and convergence properties. It proves the existence and uniqueness of the best $L^p$ RPNN approximation and establishes an exponential convergence rate for infinitely differentiable targets when the activation is non-polynomial and infinitely differentiable, linking the rate to polynomial approximants and Bernstein ellipse concepts. It introduces three internal-parameter selection strategies (naive, parsimonious, function-informed) and demonstrates, across five benchmark problems, that non-naive RPNNs can match Legendre polynomial accuracy while offering computational advantages over fully trained neural networks. The work also discusses numerical stability issues and compares COD and SVD methods for solving the induced linear systems, highlighting practical considerations for ill-conditioned training and potential limitations near singularities and in higher dimensions.

Abstract

We investigate the concept of Best Approximation for Feedforward Neural Networks (FNN) and explore their convergence properties through the lens of Random Projection Neural Networks (RPNNs). RPNNs have internal weights and biases that are predetermined and fixed once and for all, which makes them computationally efficient. We demonstrate that, for any family of such RPNNs with non-polynomial, infinitely differentiable activation functions, there exists a choice of external weights that exhibits an exponential convergence rate when approximating any infinitely differentiable function. For illustration purposes, we test the proposed RPNN-based function approximation, with parsimoniously chosen basis functions, across five benchmark function approximation problems. Results show that RPNNs achieve performance comparable to established methods such as Legendre polynomials, highlighting their potential for efficient and accurate function approximation.
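To make the setup in the abstract concrete, the following is a minimal sketch of an RPNN fit, not the paper's implementation. The internal weights and biases are drawn once and left untrained, and only the external weights are obtained from a linear least-squares problem. The `tanh` activation, the uniform sampling ranges, the target function, and the names `fit_rpnn` and `eval_rpnn` are all assumptions chosen for illustration.

```python
import numpy as np

def fit_rpnn(f, n_neurons=80, n_samples=400, seed=0):
    """Fit only the external weights of a single-hidden-layer RPNN on [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_samples)
    # Internal parameters: sampled once, never trained.
    # Uniform ranges are an assumption, not one of the paper's selection strategies.
    w = rng.uniform(-10.0, 10.0, n_neurons)
    b = rng.uniform(-10.0, 10.0, n_neurons)
    # Hidden-layer matrix: one column per fixed random basis function.
    H = np.tanh(np.outer(x, w) + b)
    # External weights from an SVD-based least-squares solve;
    # rcond discards singular values below rcond * (largest singular value).
    c, *_ = np.linalg.lstsq(H, f(x), rcond=1e-12)
    return w, b, c

def eval_rpnn(x, w, b, c):
    """Evaluate the fitted RPNN at points x."""
    return np.tanh(np.outer(x, w) + b) @ c

# Hypothetical smooth target, used only for illustration.
target = lambda x: np.sin(3.0 * x)

w, b, c = fit_rpnn(target)
x_test = np.linspace(-1.0, 1.0, 1000)
err = np.max(np.abs(eval_rpnn(x_test, w, b, c) - target(x_test)))
print(f"max abs error on test grid: {err:.2e}")
```

Because the hidden layer is fixed, training reduces to a single linear solve, which is the source of the computational efficiency the abstract mentions.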

Paper Structure (23 sections, 8 theorems, 54 equations, 5 figures)

Key Result

Theorem 2.1

Given an RPNN with $N$ hidden neurons as in Eq. \ref{eq:RPNN} (or equivalently a deep RPNN as in Eq. \ref{eq:deepRPNN}) with an infinitely differentiable, non-polynomial and slowly-increasing activation function $\psi:\mathbb{R}\rightarrow\mathbb{R}$, such that the support of its Fourier transform is an open su

Figures (5)

  • Figure 1: First benchmark example, function $f_1$ in Eq. \ref{eq:example1}, with $k=10$ and $k=100$, presenting a steep gradient. (a) Reference functions; (b)-(c) convergence diagrams in the $L^2$-norm for $k=10$ and $k=100$, respectively. We compare Legendre polynomials, RPNNs of best $L^2$ approximation, a standard FNN trained with the Levenberg-Marquardt algorithm, and cubic splines. For the RPNNs we compare three different selections (naive, function-agnostic and function-informed) of the internal parameters, as explained in Section \ref{sec:selection}, and report the mean accuracy over 100 different Monte Carlo selections of the internal parameters. For the FNN we report the best network out of 10 runs with different weight initializations.
  • Figure 2: Second benchmark example, function $f_2$ in Eq. \ref{eq:example2}, with $k=1$ and $k=10$, presenting high oscillations. (a) Reference functions; (b)-(c) convergence diagrams in the $L^2$-norm for $k=1$ and $k=10$, respectively. We compare Legendre polynomials, RPNNs of best $L^2$ approximation, a standard FNN, and cubic splines. For the RPNNs we compare three different selections (naive, function-agnostic and function-informed) of the internal parameters, as explained in Section \ref{sec:selection}, and report the mean accuracy over 100 different Monte Carlo selections of the internal parameters. For the FNN we report the best network out of 10 runs with different weight initializations.
  • Figure 3: Fourth benchmark example, approximating the analytical solution $f_3$ in Eq. \ref{eq:burgers_sol} of the Burgers' PDE in Eq. \ref{eq:burgersPDE} with viscosity $\nu=\frac{0.01}{\pi}$. (a) Reference functions at times $t=1/\pi$ and $t=2/\pi$; (b)-(c) convergence diagrams in the $L^2$-norm, comparing Legendre polynomials, RPNNs of best $L^2$ approximation, a standard FNN trained with the Levenberg-Marquardt algorithm, and cubic splines. For the RPNNs we compare three different selections (naive, function-agnostic and function-informed) of the internal parameters, as explained in Section \ref{sec:selection}, and report the mean accuracy over 100 different Monte Carlo selections of the internal parameters. For the FNN we report the best network out of 10 runs with different weight initializations.
  • Figure 4: Benchmark examples, functions $f_4$ in Eq. \ref{eq:example5a} and $f_5$ in Eq. \ref{eq:example5b}, approaching a discontinuity. (a),(c) Reference functions; (b),(d) convergence diagrams in the $L^2$-norm, comparing Legendre polynomials, RPNNs of best $L^2$ approximation, a standard FNN, and cubic splines. For the RPNNs we compare three different selections (naive, function-agnostic and function-informed) of the internal parameters, as explained in Section \ref{sec:selection}, and report the mean accuracy over 100 different Monte Carlo selections of the internal parameters. For the FNN we report the best network out of 10 runs with different weight initializations.
  • Figure 5: Comparison of the Singular Value Decomposition (SVD)-based and Complete Orthogonal Decomposition (COD)-based solutions of the problem in Eq. \ref{eq:minimum_norm}. We fix the number of neurons to $N=400$. Convergence with respect to the tolerance $\epsilon$ for (a) $f_1$ in Eq. \ref{eq:example1} with $k=10$, (b) $f_1$ in Eq. \ref{eq:example1} with $k=100$, (c) $f_2$ in Eq. \ref{eq:example2} with $k=1$, (d) $f_2$ in Eq. \ref{eq:example2} with $k=10$, respectively.
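
The role of the tolerance $\epsilon$ in Figure 5 can be illustrated with a truncated-SVD minimum-norm solve. This is a hedged sketch rather than the paper's procedure: it assumes $\epsilon$ acts as a threshold relative to the largest singular value, and the helper `min_norm_solve_svd`, the random `tanh` basis, and the target `cos(2x)` are hypothetical choices. The COD-based variant is not shown.

```python
import numpy as np

def min_norm_solve_svd(A, y, eps):
    """Minimum-norm least-squares solution, dropping singular values <= eps * s_max."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > eps * s[0]                    # relative threshold (an assumption)
    coeffs = (U[:, keep].T @ y) / s[keep]    # project y, then invert retained singular values
    return Vt[keep].T @ coeffs, int(keep.sum())

# Ill-conditioned matrix of fixed random tanh basis functions (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 300)
A = np.tanh(np.outer(x, rng.uniform(-5.0, 5.0, 100)) + rng.uniform(-5.0, 5.0, 100))
y = np.cos(2.0 * x)

tolerances = (1e-4, 1e-8, 1e-12)
ranks, residuals = [], []
for eps in tolerances:
    c_eps, rank = min_norm_solve_svd(A, y, eps)
    ranks.append(rank)
    residuals.append(np.linalg.norm(A @ c_eps - y))
    print(f"eps={eps:.0e}  numerical rank={rank}  residual={residuals[-1]:.2e}")
```

A smaller tolerance retains more singular directions, so the residual cannot grow, but the solution becomes more sensitive to rounding errors; this is the trade-off behind the ill-conditioning concerns noted in the TL;DR.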

Theorems & Definitions (12)

  • Theorem 2.1: Exact interpolation of RPNNs
  • Theorem 3.1
  • proof
  • Lemma 3.2
  • Definition 3.1
  • Theorem 3.3
  • Theorem 4.1
  • Lemma 4.2
  • proof
  • Theorem 4.3
  • ...and 2 more