A joint optimization approach to identifying sparse dynamics using least squares kernel collocation

Alexander W. Hsu, Ike W. Griss Salas, Jacob M. Stevens-Haas, J. Nathan Kutz, Aleksandr Aravkin, Bamdad Hosseini

TL;DR

The paper tackles learning autonomous ODEs from scarce, partial, and noisy data by jointly estimating the state trajectory and the governing dynamics within a reproducing kernel Hilbert space framework. It introduces JSINDy, an all-at-once collocation method that enforces self-consistency between the estimated state and a sparse, library-based dynamics, and leverages a representer theorem to reduce to a finite-dimensional optimization solved via alternating Levenberg–Marquardt steps and sparsifying iterations. The approach demonstrates robustness across scenarios including very low sampling rates, partial observability, higher-order ODEs, and model misspecification, achieving accurate state trajectories and sparse coefficient recovery across Lorenz, Lotka–Volterra, Van der Pol, and Duffing-type systems. It provides a principled relaxation of the ODE constraint that stabilizes optimization and improves robustness to noise, while offering flexible sparsification strategies and the potential for Bayesian model selection to further enhance variable recovery. Limitations include computational cost from dense kernel matrices, suggesting future work on state-space GP representations and scalable algorithms to extend applicability to longer time horizons.

Abstract

We develop an all-at-once modeling framework for learning systems of ordinary differential equations (ODEs) from scarce, partial, and noisy observations of the states. The proposed methodology combines sparse recovery strategies for the ODE over a function library with techniques from reproducing kernel Hilbert space (RKHS) theory for estimating the state and discretizing the ODE. Our numerical experiments reveal that the proposed strategy leads to significant gains in accuracy, sample efficiency, and robustness to noise, both in learning the equation and in estimating the unknown states. This work demonstrates capabilities well beyond existing and widely used algorithms while extending the modeling flexibility of other recent developments in equation discovery.
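To make the joint estimation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a scalar state, a Gaussian kernel with centers placed at the collocation points, a plain ridge penalty on the kernel coefficients in place of the RKHS norm, a hypothetical library `[1, x, x^2]`, and sequentially thresholded least squares as the sparsifier. All function names and parameter values (`fit_joint`, `alpha`, `beta`, `lam`, `thresh`) are illustrative. The toy data come from the logistic ODE $\dot{x} = x - x^2$, which is not one of the paper's test systems.

```python
import numpy as np
from scipy.optimize import least_squares

def rbf(s, t, ell):
    """Gaussian kernel k(s, t) and its derivative with respect to s."""
    d = s[:, None] - t[None, :]
    K = np.exp(-0.5 * (d / ell) ** 2)
    return K, -(d / ell**2) * K

def library(x):
    """Hypothetical polynomial library [1, x, x^2] for a scalar state."""
    return np.stack([np.ones_like(x), x, x**2], axis=1)

def stlsq(Phi, dx, thresh, n_iter=5):
    """Sequentially thresholded least squares for Phi @ theta ~ dx."""
    theta = np.linalg.lstsq(Phi, dx, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(theta) >= thresh
        theta = np.zeros_like(theta)
        if keep.any():
            theta[keep] = np.linalg.lstsq(Phi[:, keep], dx, rcond=None)[0]
    return theta

def fit_joint(t_obs, y_obs, t_col, ell=1.0, alpha=1.0, beta=1.0,
              lam=1e-4, thresh=0.1, n_outer=5):
    """Alternate a Levenberg-Marquardt state update with a sparse fit."""
    K_obs, _ = rbf(t_obs, t_col, ell)        # state at observation times
    K_col, dK_col = rbf(t_col, t_col, ell)   # state and derivative on the grid
    n = t_col.size

    def residuals(c, theta):
        x = K_col @ c
        return np.concatenate([
            np.sqrt(alpha) * (K_obs @ c - y_obs),               # data misfit
            np.sqrt(beta) * (dK_col @ c - library(x) @ theta),  # relaxed ODE
            np.sqrt(lam) * c,                                   # ridge penalty
        ])

    # Initialize the state from a regularized kernel fit to the data alone.
    A = np.vstack([np.sqrt(alpha) * K_obs, np.sqrt(lam) * np.eye(n)])
    b = np.concatenate([np.sqrt(alpha) * y_obs, np.zeros(n)])
    c = np.linalg.lstsq(A, b, rcond=None)[0]
    theta = stlsq(library(K_col @ c), dK_col @ c, thresh)

    for _ in range(n_outer):
        # State step: Levenberg-Marquardt on the coefficients, theta fixed.
        c = least_squares(residuals, c, args=(theta,), method="lm",
                          max_nfev=5000).x
        # Dynamics step: sparse regression of the state derivative.
        theta = stlsq(library(K_col @ c), dK_col @ c, thresh)
    return K_col @ c, theta

# Toy problem: logistic ODE x' = x - x^2 with x(0) = 0.1.
rng = np.random.default_rng(0)
t_col = np.linspace(0.0, 6.0, 25)            # collocation grid
t_obs = t_col[::2]                           # sparser observation times
x_true = 1.0 / (1.0 + 9.0 * np.exp(-t_col))  # closed-form solution
y_obs = x_true[::2] + 0.01 * rng.standard_normal(t_obs.size)

x_hat, theta = fit_joint(t_obs, y_obs, t_col)
print("estimated coefficients for [1, x, x^2]:", theta)
```

The collocation weight `beta` plays the role of the relaxed ODE constraint: the state is not forced to satisfy the learned dynamics exactly, only penalized for violating them, which is the self-consistency idea described above in simplified form.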

Paper Structure

This paper contains 33 sections, 3 theorems, 62 equations, 14 figures, 1 table.

Key Result

Theorem 2.1

Consider the optimization problem (eqn:obj-semidiscrete) for $\alpha,\beta,\lambda,\mu>0$. Let $\mathcal{X}$ be the vector-valued RKHS associated to the $d$-fold Cartesian product of an RKHS $\mathcal{H}_k$ associated to the positive definite kernel $k$. Assume that linear operators of the form $\bo

Figures (14)

  • Figure 1: Lorenz 63 system with observations sampled at $\Delta t=0.05$ until $t=10$ with added noise set to $\sigma^2=4$. Simulated dynamics from \ref{eqn:lorenz} are compared against true dynamics from $t=10$ until $t=15$.
  • Figure 2: Lorenz 63 system with observations sampled at $\Delta t=0.5$ until $t=10$ with no added noise. Simulated dynamics of \ref{eqn:scarce-lorenz} are plotted against the true, unseen trajectory from $t=10$ to $t=25$.
  • Figure 3: Results for the Lotka-Volterra system of \ref{subsubsec:lv}. Samples are taken at intervals of $\Delta t=3$ up to $t=48$, with additive noise of $\sigma = 0.2$. Simulated dynamics are shown up to $t=100$, comparing the results from \ref{eqn:learned-lv} with the true dynamics.
  • Figure 4: Left: True Lotka-Volterra trajectories from multiple initial conditions, plotted on the phase portrait using \ref{eqn:true-lv}. The red trajectory and measurements are those from \ref{fig:lv-results}. Right: The same trajectories simulated with the learned model \ref{eqn:learned-lv} from the same initial conditions.
  • Figure 5: Results for the Lorenz 63 system with streams of partial observations. Observations alternate between coordinates every 10 samples at sampling rate $\Delta t = 0.025$ up to $t=10$ with added noise set to $\sigma^2 = 0.1$. Learned dynamics are simulated from $t=10$ until $t=20$.
  • ...and 9 more figures

Theorems & Definitions (5)

  • Theorem 2.1
  • proof
  • Theorem 2.2
  • proof
  • Theorem A.1