Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control

Andrew Lamperski, Tyler Lekang

TL;DR

It is shown that for sufficiently smooth functions, ReLU networks with randomly generated weights and biases achieve $L_{\infty}$ error of $O(m^{-1/2})$ with high probability, where $m$ is the number of neurons.

Abstract

Neural networks are regularly employed in adaptive control of nonlinear systems and related methods of reinforcement learning. A common architecture uses a neural network with a single hidden layer (i.e. a shallow network), in which the weights and biases are fixed in advance and only the output layer is trained. While classical results show that there exist neural networks of this type that can approximate arbitrary continuous functions over bounded regions, they are non-constructive, and the networks used in practice have no approximation guarantees. Thus, the approximation properties required for control with neural networks are assumed, rather than proved. In this paper, we aim to fill this gap by showing that for sufficiently smooth functions, ReLU networks with randomly generated weights and biases achieve $L_{\infty}$ error of $O(m^{-1/2})$ with high probability, where $m$ is the number of neurons. It suffices to generate the weights uniformly over a sphere and the biases uniformly over an interval. We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
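
The architecture described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' construction: the function names, the test function `target`, the sampling domain, the ReLU argument convention $\alpha_i^\top x - t_i$, and the use of least squares to train the output layer are all hypothetical choices made for this example.

```python
import numpy as np


def sample_hidden_parameters(m, n, R, rng):
    """Draw m weights uniformly on the unit sphere S^{n-1} and m biases uniformly on [-R, R]."""
    g = rng.standard_normal((m, n))
    # Normalizing i.i.d. Gaussian vectors gives points uniform on the sphere.
    alpha = g / np.linalg.norm(g, axis=1, keepdims=True)
    t = rng.uniform(-R, R, size=m)
    return alpha, t


def features(x, alpha, t):
    """Columns [x, 1, relu(alpha_i^T x - t_i)] for inputs x of shape (N, n).

    The sign convention inside the ReLU is an assumption of this sketch.
    """
    hidden = np.maximum(x @ alpha.T - t, 0.0)
    return np.hstack([x, np.ones((x.shape[0], 1)), hidden])


def fit_output_layer(x, y, alpha, t):
    """Train only the output coefficients; the hidden parameters stay fixed."""
    theta, *_ = np.linalg.lstsq(features(x, alpha, t), y, rcond=None)
    return theta


def target(x):
    """Hypothetical smooth test function, chosen only for illustration."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2


def training_rmse(m, n=2, R=1.0, n_samples=2000, seed=0):
    """Root-mean-square training error of the fitted network with m neurons."""
    rng = np.random.default_rng(seed)
    # Sample inputs from a cube that lies inside the ball of radius R.
    half_width = R / np.sqrt(n)
    x = rng.uniform(-half_width, half_width, size=(n_samples, n))
    y = target(x)
    alpha, t = sample_hidden_parameters(m, n, R, rng)
    theta = fit_output_layer(x, y, alpha, t)
    residual = features(x, alpha, t) @ theta - y
    return float(np.sqrt(np.mean(residual ** 2)))


if __name__ == "__main__":
    for m in (0, 10, 100):
        print(m, training_rmse(m))
```

Note that this sketch reports a root-mean-square error on the sampled points, whereas the paper's guarantee concerns the $L_{\infty}$ error, so the printed numbers illustrate the trend rather than the bound itself.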

Paper Structure (12 sections, 7 theorems, 55 equations, 2 figures)

This paper contains 12 sections, 7 theorems, 55 equations, 2 figures.

Key Result

Theorem 1

Let $P$ be a probability density function over $\mathbb{S}^{n-1}\times [-R,R]$ with $\inf_{(\alpha,t)\in \mathbb{S}^{n-1}\times [-R,R]}P(\alpha,t)=P_{\min}>0$. Let $({\bm{\alpha}}_1,\bm{t}_1),\ldots,({\bm{\alpha}}_m,\bm{t}_m)$ be independent, identically distributed samples from $P$. If $f$ satisfies [...] such that for all $\nu \in (0,1)$, with probability at least $1-\nu$, the neural network approximation [...]

Figures (2)

  • Figure 1: Convergence of the Randomized Approximation. The thick blue line shows the original function, $f$, while the thin lines show the neural network approximation $\bm{f}_N$ for various numbers of neurons, $m$. When $m=0$, we have $\bm{f}_N(x)=a^\top x +b$. The approximation scheme quickly approximates the general shape, which gets refined with increasing neurons. Note, however, that the coefficients, $a$, $b$, and $\bm{c}_i$, are generated via the theoretical construction from the text. Better fits would be obtained if they were optimized. See Fig. 2.
  • Figure 2: Optimized Fit. Using the same weights and biases $({\bm{\alpha}}_i,\bm{t}_i)$ used to generate Fig. 1, we optimized the coefficients $a$, $b$, and $\bm{c}_i$ via least-squares. With $100$ neurons, the optimized fit is nearly exact.
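
The least-squares fit mentioned in the Figure 2 caption can be written out as follows. This is a sketch under assumptions: the ReLU argument $\bm{\alpha}_i^\top x - \bm{t}_i$, the sample points $x_j$, and the sample count $N_s$ are introduced here for illustration and are not taken from the paper. The form is chosen so that it reduces to $a^\top x + b$ when $m=0$, as stated in the Figure 1 caption.

```latex
\bm{f}_N(x) = a^\top x + b + \sum_{i=1}^{m} \bm{c}_i \max\{0,\ \bm{\alpha}_i^\top x - \bm{t}_i\},
\qquad
\min_{a,\,b,\,\bm{c}_1,\ldots,\bm{c}_m}\ \sum_{j=1}^{N_s} \bigl(f(x_j) - \bm{f}_N(x_j)\bigr)^2 .
```

Because the weights and biases $({\bm{\alpha}}_i,\bm{t}_i)$ are held fixed, $\bm{f}_N$ is linear in $a$, $b$, and $\bm{c}_i$, so this is an ordinary linear least-squares problem.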

Theorems & Definitions (13)

  • Theorem 1
  • Corollary 1
  • Remark 1
  • Remark 2
  • Lemma 1
  • proof
  • Corollary 2
  • proof
  • Lemma 2
  • proof
  • ...and 3 more