Approximation properties of neural ODEs
Arturo De Marinis, Davide Murari, Elena Celledoni, Nicola Guglielmi, Brynjulf Owren, Francesco Tudisco
TL;DR
The paper analyzes shallow neural networks whose activation is the time-1 flow map of a neural ODE, and proves that they are universal approximators in $\mathcal{C}(\mathbb{R}^m,\mathbb{R}^n)$ with respect to the topology of compact convergence. Imposing either a Lipschitz constraint on the flow map or a fixed-norm constraint on the affine layers preserves the universal approximation property. Enforcing both causes a loss of expressiveness, which the authors quantify through explicit upper and lower bounds on the approximation error. They also report numerical experiments suggesting that flow-map activations improve parameter efficiency, derive stability bounds tied to the logarithmic norm of the constrained dynamics, and propose a stabilization procedure. The bounds and properties are illustrated with two examples (a random LeakyReLU setup and the Two Moons dataset), and a theoretical construction is discussed that embeds the fixed-norm constraints into the flow map via time-scaling. Overall, the work advances the understanding of stability-aware, neural-ODE-inspired architectures and their capacity to approximate continuous functions, while highlighting the practical trade-off between stability and expressivity.
Abstract
We study the approximation properties of shallow neural networks whose activation function is defined as the flow map of a neural ordinary differential equation (neural ODE) at the final time of the integration interval. We prove the universal approximation property (UAP) of such shallow neural networks in the space of continuous functions. Furthermore, we investigate the approximation properties of shallow neural networks whose parameters satisfy specific constraints. In particular, we constrain the Lipschitz constant of the neural ODE's flow map and the norms of the weights to increase the network's stability. We prove that the UAP holds if we consider either constraint independently. When both are enforced, there is a loss of expressiveness, and we derive approximation bounds that quantify how accurately such a constrained network can approximate a continuous function.
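To make the architecture described above concrete, the following is a minimal sketch of a shallow network whose activation is the flow map of a neural ODE at final time 1. Several choices here are assumptions made for illustration and are not taken from the paper: the autonomous vector field `tanh(W u + b)`, the explicit Euler discretization, the step count, and all names (`flow_map`, `shallow_net`, `A1`, `b1`, `A2`, `b2`).

```python
import numpy as np

def flow_map(u0, W, b, n_steps=100, T=1.0):
    """Approximate the time-T flow of u'(t) = tanh(W u(t) + b) by explicit Euler.

    The vector field and the integrator are illustrative assumptions.
    """
    h = T / n_steps
    u = np.array(u0, dtype=float)
    for _ in range(n_steps):
        u = u + h * np.tanh(W @ u + b)
    return u

def shallow_net(x, A1, b1, W, b, A2, b2):
    """Affine layer, flow-map activation, affine layer: x -> A2 phi(A1 x + b1) + b2."""
    return A2 @ flow_map(A1 @ x + b1, W, b) + b2

# Hypothetical sizes: input dimension m, hidden dimension d, output dimension n.
rng = np.random.default_rng(0)
m, d, n = 2, 8, 1
A1, b1 = rng.normal(size=(d, m)), rng.normal(size=d)
W, b = rng.normal(size=(d, d)) / np.sqrt(d), rng.normal(size=d)
A2, b2 = rng.normal(size=(n, d)), rng.normal(size=n)

y = shallow_net(np.array([0.5, -1.0]), A1, b1, W, b, A2, b2)
print(y.shape)
```

In this sketch, the two constraints discussed in the abstract would act on different parts of the model: a Lipschitz constraint restricts `W` and `b` (the dynamics defining the flow map), while a norm constraint restricts `A1` and `A2` (the affine layers). For instance, choosing `W = -I` makes each Euler step with `h <= 1` nonexpansive, so the discretized flow map is 1-Lipschitz.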
