Approximation properties of neural ODEs

Arturo De Marinis, Davide Murari, Elena Celledoni, Nicola Guglielmi, Brynjulf Owren, Francesco Tudisco

TL;DR

The paper analyzes shallow neural networks whose activation is the time-1 flow map of a neural ODE, and proves that they are universal approximators for $\mathcal{C}(\mathbb{R}^m,\mathbb{R}^n)$ under compact convergence. Imposing either a Lipschitz constraint on the flow map or a fixed-norm constraint on the affine layers preserves the universal approximation property; enforcing both causes a loss of expressiveness, which the authors quantify with upper and lower bounds on the approximation error. Numerical experiments suggest that flow-map activations are more parameter-efficient, and the authors derive stability bounds tied to the logarithmic norm of the constrained dynamics, along with a stabilization procedure. The bounds and properties are illustrated with two examples (a random LeakyReLU setup and the Two Moons dataset), and a theoretical construction embeds fixed-norm constraints into the flow map via time-scaling. Overall, the work clarifies how well stability-aware, neural-ODE-inspired architectures can approximate continuous functions and where stability trades off against expressivity.
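To make the role of the logarithmic norm concrete, the following is a minimal sketch of the standard linear case only, not the paper's own bound: for the 2-norm, $\mu_2(A)$ is the largest eigenvalue of $(A+A^\top)/2$, and $\|e^{tA}\|_2 \le e^{t\mu_2(A)}$ for $t \ge 0$. The matrix `A`, the helper names, and the Taylor-series exponential are illustrative assumptions.

```python
import numpy as np

def log_norm_2(A):
    """2-logarithmic norm: largest eigenvalue of the symmetric part of A."""
    sym = 0.5 * (A + A.T)
    return float(np.linalg.eigvalsh(sym)[-1])  # eigvalsh returns ascending eigenvalues

def expm_taylor(A, terms=60):
    """Matrix exponential via truncated Taylor series (fine for small matrices of modest norm)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # term now holds A^k / k!
        result = result + term
    return result

# Illustrative matrix (an assumption, not taken from the paper).
A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])
t = 1.0
mu = log_norm_2(A)
lhs = np.linalg.norm(expm_taylor(t * A), 2)  # spectral norm of e^{tA}
rhs = np.exp(t * mu)                         # logarithmic-norm bound
print(f"mu_2(A) = {mu:.4f}, ||exp(tA)||_2 = {lhs:.4f}, bound = {rhs:.4f}")
```

A negative $\mu_2(A)$, as here, means the linear flow contracts distances in the 2-norm, which is the kind of property a Lipschitz constraint on a flow map aims to control.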

Abstract

We study the approximation properties of shallow neural networks whose activation function is defined as the flow map of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters satisfy specific constraints. In particular, we constrain the Lipschitz constant of the neural ODE's flow map and the norms of the weights to increase the network's stability. We prove that the UAP holds if we consider either constraint independently. When both are enforced, there is a loss of expressiveness, and we derive approximation bounds that quantify how accurately such a constrained network can approximate a continuous function.
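The architecture described above can be sketched as an affine map, followed by an ODE flow map used as the activation, followed by a second affine map. This is a rough illustration under stated assumptions: the vector field `tanh(A u + b)`, the explicit Euler integrator, the final time `T = 1`, and all names and dimensions are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def flow_map(z, A, b, n_steps=100, T=1.0):
    """Approximate the time-T flow of u'(t) = tanh(A u + b), u(0) = z, with explicit Euler.

    The vector field and the integrator are assumptions made for illustration.
    """
    h = T / n_steps
    u = np.array(z, dtype=float)
    for _ in range(n_steps):
        u = u + h * np.tanh(A @ u + b)
    return u

def shallow_net(x, A1, b1, A2, b2, A, b):
    """Affine layer -> flow-map activation -> affine layer."""
    return A2 @ flow_map(A1 @ x + b1, A, b) + b2

# Hypothetical sizes: input dimension m, hidden (ODE state) dimension d, output dimension n.
rng = np.random.default_rng(0)
m, d, n = 2, 4, 1
A1, b1 = rng.normal(size=(d, m)), rng.normal(size=d)
A2, b2 = rng.normal(size=(n, d)), rng.normal(size=n)
A, b = rng.normal(size=(d, d)), rng.normal(size=d)

x = np.array([0.5, -1.0])
y = shallow_net(x, A1, b1, A2, b2, A, b)
print(y.shape)
```

Unlike a pointwise activation such as ReLU, the flow map couples all hidden coordinates through `A`, so the activation itself carries trainable parameters.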

Paper Structure

This paper contains 15 sections, 10 theorems, 132 equations, 9 figures, 2 tables.

Key Result

Theorem 2.1

The space of functions is a universal approximator for $\mathcal{C}(\mathbb{R}^{m},\mathbb{R}^{n})$ under the compact convergence topology.
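Spelled out, universal approximation under the compact convergence topology is the usual density statement on compact sets. In the sketch below, $\mathcal{H}$ is a placeholder for the function space of the theorem (its symbol is not shown in the statement above), and $\|\cdot\|$ is any fixed norm on $\mathbb{R}^n$:

$$
\forall f \in \mathcal{C}(\mathbb{R}^{m},\mathbb{R}^{n}),\quad \forall K \subset \mathbb{R}^{m} \text{ compact},\quad \forall \varepsilon > 0,\quad \exists \varphi \in \mathcal{H}:\quad \sup_{x \in K} \|f(x) - \varphi(x)\| \le \varepsilon .
$$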

Figures (9)

  • Figure 1: Some images from the MNIST dataset.
  • Figure 2: Smoothed Leaky Rectified Linear Unit (LeakyReLU) with minimal slope $\alpha=0.1$, as defined in \ref{eq:act_fun}.
  • Figure 3: Comparison of the test mean squared error with varying numbers of training samples $(N)$ and hidden neurons $(d)$.
  • Figure 4: Accuracy of $\bar{\varphi}_\star$ for different values of $\delta$ as a function of the magnitude $\eta$ of the FGSM adversarial attack on the MNIST dataset. For each perturbation magnitude $\eta$, the best validation accuracy is highlighted.
  • Figure 5: Mean and standard deviation of the percentage of points of the discretised domain $K_h$, where the lower bound holds, with respect to the random choice of parameters $A$ and $b$ as a function of $\delta$.
  • ...and 4 more figures

Theorems & Definitions (40)

  • Definition 1.1
  • Remark 1.1
  • Remark 1.2
  • Definition 1.2
  • Definition 1.3
  • Definition 1.4
  • Definition 1.5: see soderlind2006logarithmic, soderlind2024logarithmic
  • Definition 1.5: see soderlind2006logarithmic, soderlind2024logarithmic
  • Remark 1.3
  • Theorem 2.1
  • ...and 30 more