On Vector Field Reconstruction from Noisy ODE in High Ambient Dimension

Hugo Henneuse

TL;DR

This work studies nonparametric reconstruction of the vector field $f$ in autonomous ODEs $y' = f(y)$ in high ambient dimension $D$ when the initial conditions lie on a low-dimensional structure. It proposes a regression-like estimator that alternates flow estimation, derivative estimation, and nearest-neighbor averaging to recover $f$ on the envelope $\operatorname{Env}_f(\mathcal X, T)$ from noisy observations. The authors derive minimax convergence rates that depend on the temporal grid, the number of trajectories, and the mass-concentration parameter $b$ of the initial-value distribution, showing these rates do not depend on $D$ and are optimal up to logarithmic factors; they also provide geometric corollaries under a manifold assumption and validate performance via numerical experiments in both low- and high-dimensional settings. The results offer a simple, scalable baseline for vector-field estimation in high dimensions and connect to manifold learning to mitigate the curse of dimensionality.
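
For orientation, the two objects named above are commonly written as follows. Both displays are assumptions based on standard usage in the geometric inference literature, not statements copied from the paper, whose exact definitions (constants, range of radii, closedness of the time interval) may differ. Here $B(x,r)$ denotes the Euclidean ball of radius $r$ centered at $x$, and $y_x$ denotes the solution of $y' = f(y)$ with $y(0) = x$.

```latex
% Assumed form of the (a,b)-standard condition on the initial-value distribution \mu:
\[
  \mu\big(B(x,r)\big) \;\ge\; \min\{\, a\, r^{b},\ 1 \,\}
  \qquad \text{for all } x \in \operatorname{supp}(\mu),\ r > 0 .
\]
% Assumed form of the envelope of trajectories started in \mathcal{X} and run up to time T:
\[
  \operatorname{Env}_f(\mathcal{X}, T) \;=\; \{\, y_x(t) \;:\; x \in \mathcal{X},\ t \in [0,T] \,\} .
\]
```

Under this reading, a smaller $b$ means mass concentrates more strongly around every point of the support. A distribution with density bounded away from zero on a compact $d$-dimensional submanifold would typically satisfy the condition with $b = d$, which is how the manifold corollaries relate to the general rates.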

Abstract

This work investigates the nonparametric estimation of the vector field of a noisy Ordinary Differential Equation (ODE) in high-dimensional ambient spaces, under the assumption that the initial conditions are sampled from a lower-dimensional structure. Specifically, let \( f:\mathbb{R}^{D}\to\mathbb{R}^{D} \) denote the vector field of the autonomous ODE \( y' = f(y) \). We observe noisy trajectories \( \tilde{y}_{X_i}(t_j) = y_{X_i}(t_j) + \varepsilon_{i,j} \), where \( y_{X_i}(t_j) \) is the solution at time \( t_j \) with initial condition \( y(0)=X_i \), the \( X_i \) are drawn from an \((a,b)\)-standard distribution \( \mu \), and \( \varepsilon_{i,j} \) denotes noise. From a minimax perspective, we study the reconstruction of \( f \) within the envelope of trajectories generated by the support of \( \mu \). We propose an estimator combining flow reconstruction with derivative estimation techniques from nonparametric regression. Under mild regularity assumptions on \( f \), we establish convergence rates that depend on the temporal resolution, the number of initial conditions, and the parameter \( b \), which controls the mass concentration of \( \mu \). These rates are then shown to be minimax optimal (up to logarithmic factors) and illustrate how the proposed approach mitigates the curse of dimensionality. Additionally, we illustrate the computational and statistical efficiency of our estimator through numerical experiments.
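
The observation model and the general shape of such an estimator can be illustrated with a small simulation. The sketch below is not the paper's estimator: it swaps in a local linear fit in time for the smoothing and derivative step, followed by plain nearest-neighbor averaging in space. The test system (a planar rotation embedded in $\mathbb{R}^{20}$), the function names, and all tuning values (`half_window`, `k`, `sigma`) are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate(n=100, m=100, T=1.0, D=20, sigma=0.005, seed=0):
    """Noisy trajectories of y' = f(y), f(y) = (-y_2, y_1, 0, ..., 0), in R^D.

    Initial conditions lie on a segment (a 1-dimensional structure), so the
    envelope is a 2-dimensional annular sector inside R^D.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, T, m)
    radius = rng.uniform(0.5, 1.5, size=n)
    Y = np.zeros((n, m, D))
    Y[:, :, 0] = radius[:, None] * np.cos(t)[None, :]   # exact solution, coordinate 1
    Y[:, :, 1] = radius[:, None] * np.sin(t)[None, :]   # exact solution, coordinate 2
    return t, Y + sigma * rng.standard_normal(Y.shape)

def true_field(y):
    out = np.zeros_like(y)
    out[..., 0] = -y[..., 1]
    out[..., 1] = y[..., 0]
    return out

def local_linear(t, Y_noisy, half_window=10):
    """Least-squares line fit in a sliding time window.

    Returns smoothed positions (intercepts) and derivative estimates (slopes).
    """
    n, m, D = Y_noisy.shape
    pos = np.empty_like(Y_noisy)
    vel = np.empty_like(Y_noisy)
    for j in range(m):
        lo, hi = max(0, j - half_window), min(m, j + half_window + 1)
        s = t[lo:hi] - t[j]                              # times centered at t_j
        A = np.stack([np.ones_like(s), s], axis=1)       # design matrix, shape (w, 2)
        coef = np.linalg.pinv(A) @ Y_noisy[:, lo:hi, :]  # shape (n, 2, D)
        pos[:, j, :] = coef[:, 0, :]
        vel[:, j, :] = coef[:, 1, :]
    return pos, vel

def knn_field(query, pos, vel, k=30):
    """Average the derivative estimates of the k smoothed points nearest to each query."""
    P = pos.reshape(-1, pos.shape[-1])
    V = vel.reshape(-1, vel.shape[-1])
    d2 = ((query[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)  # shape (Q, N)
    idx = np.argpartition(d2, k, axis=1)[:, :k]
    return V[idx].mean(axis=1)

t, Y_noisy = simulate()
pos, vel = local_linear(t, Y_noisy)

# Query points inside the envelope, away from the time boundaries.
rng = np.random.default_rng(1)
r = rng.uniform(0.7, 1.3, size=20)
theta = rng.uniform(0.3, 0.7, size=20)
query = np.zeros((20, Y_noisy.shape[-1]))
query[:, 0] = r * np.cos(theta)
query[:, 1] = r * np.sin(theta)

f_hat = knn_field(query, pos, vel)
err = np.linalg.norm(f_hat - true_field(query), axis=1).mean()
print(f"mean Euclidean error over {len(query)} query points: {err:.4f}")
```

The nearest-neighbor search runs in the ambient space, yet the neighbors it finds sit on the low-dimensional envelope. That is the intuition behind rates governed by $b$ rather than $D$. The sketch only evaluates the estimate at points inside the envelope, which matches the region on which the abstract studies reconstruction.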

Paper Structure

This paper contains 11 sections, 7 theorems, 78 equations, 3 figures.

Key Result

Theorem 1

For the choices of calibration parameters : and $n$ and $m$ sufficiently large, we have: where $C_{1}$ is a constant depending only on $L$, $M$, $a$, $b$, $D$, $T$, and $\sigma$.

Figures (3)

  • Figure 1: Noisy trajectories along with the estimated and true vector fields for the Van der Pol oscillator and the Lotka--Volterra model. Colored vectors (red and green) indicate positions lying within the respective solution envelopes, $\operatorname{ENV}(\mathcal{X}_1, T_1)$ for the Van der Pol system and $\operatorname{ENV}(\mathcal{X}_2, T_2)$ for the Lotka--Volterra system. For both simulations, we choose $k_1=k=r=10$ and $k_2=7$.
  • Figure 2: Mean error as a function of $mn$ in scenario (a) (log--log scale, with the dotted line indicating the linear trend), as a function of $m$ in scenario (b) (linear scale), and as a function of $n$ in scenario (c) (linear scale). Parameters $k_1$, $k_2$, $r$ and $k$ are chosen according to Theorem \ref{thm: estim-lip-ab}.
  • Figure 3: Mean error and computation time versus ambient dimensions $D=2, 6, 12, 18$ for SINDy estimators (degrees 1, 2, and 3) and for our method (with and without splitting). Our parameters are chosen according to Theorem \ref{thm: estim-lip-ab}. The thresholds for SINDy estimators are chosen, for each simulation, by minimizing the empirical error within the grid $[0.02, 0.6, 0.1, 0.14, 0.18, 0.22, 0.26, 0.3]$.
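
The threshold selection described for the SINDy baseline in Figure 3 can be sketched as a grid search around sequentially thresholded least squares, the sparse regression step usually associated with SINDy. This is a plain NumPy illustration, not the PySINDy library and not the paper's experimental code. The linear test system, the degree-1 library, the noise level, and the use of held-out prediction error as the "empirical error" are all assumptions. The grid is copied as printed in the caption.

```python
import numpy as np

def library_deg1(X):
    """Candidate functions for a degree-1 library: constant and linear terms."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def stlsq(Theta, dX, threshold, n_iter=10):
    """Sequentially thresholded least squares.

    Repeatedly zero out coefficients below the threshold, then refit the
    remaining ones, one output coordinate at a time.
    """
    Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for d in range(dX.shape[1]):
            keep = ~small[:, d]
            if keep.any():
                Xi[keep, d] = np.linalg.lstsq(Theta[:, keep], dX[:, d], rcond=None)[0]
    return Xi

# Hypothetical test system: x' = A x, observed with noisy derivatives.
rng = np.random.default_rng(0)
A = np.array([[-0.5, 2.0], [-2.0, -0.5]])
X_train = rng.uniform(-1.0, 1.0, size=(400, 2))
X_val = rng.uniform(-1.0, 1.0, size=(200, 2))
dX_train = X_train @ A.T + 0.05 * rng.standard_normal((400, 2))
dX_val = X_val @ A.T + 0.05 * rng.standard_normal((200, 2))

grid = [0.02, 0.6, 0.1, 0.14, 0.18, 0.22, 0.26, 0.3]  # as printed in the caption

errors = {}
for thr in grid:
    Xi = stlsq(library_deg1(X_train), dX_train, thr)
    pred = library_deg1(X_val) @ Xi
    errors[thr] = np.mean(np.linalg.norm(pred - dX_val, axis=1))

best = min(errors, key=errors.get)
Xi_best = stlsq(library_deg1(X_train), dX_train, best)
print("selected threshold:", best)
print("recovered linear coefficients (rows x_1, x_2):")
print(Xi_best[1:])
```

In this toy setting, a threshold larger than the smallest true coefficient magnitude (0.5 here) removes a real term, so its validation error is visibly worse than that of the smaller thresholds. This sensitivity is what motivates tuning the threshold per simulation.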

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • Corollary 1
  • Corollary 2
  • Lemma 1
  • Lemma 2
  • Theorem 3