SODAs: Sparse Optimization for the Discovery of Differential and Algebraic Equations

Manu Jayadharan, Christina Catlett, Arthur N. Montanari, Niall M. Mangan

TL;DR

SODAs introduces a principled framework for data-driven discovery of differential-algebraic equations by separating the identification of algebraic constraints from the differential dynamics, thereby preserving physical structure and mitigating multicollinearity. The method iteratively refines a library of candidate terms using sparse regression to uncover algebraic relations, uses singular value decomposition to guide stopping, and then applies conventional ODE discovery to the remaining dynamics on a refined library. Across chemical reaction networks, pendulum systems, and power-grid models, SODAs demonstrates robust recovery of DAEs and reduced coordinates under realistic noise and data limitations, outperforming implicit-SINDy-like approaches that rely on a priori variable identification. The work provides a scalable, interpretable route to model discovery in complex, constrained dynamical systems with practical implications for biology, mechanics, and energy networks.

Abstract

Differential-algebraic equations (DAEs) integrate ordinary differential equations (ODEs) with algebraic constraints, providing a fundamental framework for developing models of dynamical systems characterized by timescale separation, conservation laws, and physical constraints. While sparse optimization has revolutionized model development by allowing data-driven discovery of parsimonious models from a library of possible equations, existing approaches for dynamical systems assume DAEs can be reduced to ODEs by eliminating variables before model discovery. This assumption limits the applicability of such methods to DAE systems with unknown constraints and time scales. We introduce Sparse Optimization for Differential-Algebraic Systems (SODAs), a data-driven method for the identification of DAEs in their explicit form. By discovering the algebraic and dynamic components sequentially without prior identification of the algebraic variables, this approach leads to a sequence of convex optimization problems and has the advantage of discovering interpretable models that preserve the structure of the underlying physical system. To this end, SODAs improves numerical stability when handling high correlations between library terms -- caused by near-perfect algebraic relationships -- by iteratively refining the conditioning of the candidate library. We demonstrate the performance of our method on biological, mechanical, and electrical systems, showcasing its robustness to noise in both simulated time series and real-time experimental data.
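
The idea of finding an algebraic relation by sparse regression over a candidate library can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data (a conservation law x + y = 1 plus an unrelated signal z), the degree-1 library, the sequentially thresholded least-squares solver, and the relative-residual score are all assumptions chosen for illustration, and the names `stlsq` and `find_relation` are hypothetical.

```python
import numpy as np

def stlsq(A, b, threshold=0.1, iters=10):
    """Sequentially thresholded least squares: fit A w ~ b, zeroing small weights."""
    w = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        keep = np.abs(w) >= threshold
        w[~keep] = 0.0
        if not keep.any():
            break
        # Refit using only the surviving columns.
        w[keep] = np.linalg.lstsq(A[:, keep], b, rcond=None)[0]
    return w

def find_relation(theta, names, threshold=0.1):
    """Regress each library column on the others and keep the best-fitting relation.

    Returns (relative residual, left-hand-side term, {term: coefficient}).
    The relative residual is an assumed stand-in for a relationship score.
    """
    best = None
    n_terms = theta.shape[1]
    for j in range(n_terms):
        others = [i for i in range(n_terms) if i != j]
        w = stlsq(theta[:, others], theta[:, j], threshold)
        fit = theta[:, others] @ w
        resid = np.linalg.norm(fit - theta[:, j]) / np.linalg.norm(theta[:, j])
        if best is None or resid < best[0]:
            terms = {names[i]: c for i, c in zip(others, w) if c != 0.0}
            best = (resid, names[j], terms)
    return best

# Toy, noise-free data: x and y obey x + y = 1; z is unrelated.
t = np.linspace(0.0, 5.0, 200)
x = np.exp(-t)
y = 1.0 - x
z = np.sin(3.0 * t)

names = ["1", "x", "y", "z"]
theta = np.column_stack([np.ones_like(t), x, y, z])

resid, lhs, terms = find_relation(theta, names)
print(lhs, "=", terms, "relative residual:", resid)
```

Because the constraint can be written with any of its terms on the left-hand side, several columns fit equally well here; which one is reported depends on floating-point rounding. Only the terms involved in the constraint appear in the reported relation.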

Paper Structure

This paper contains 21 sections, 36 equations, 11 figures, 1 algorithm.

Figures (11)

  • Figure 1: SODAs algorithm explained using a chemical reaction network example. (a) Collect time series and construct the candidate library matrix. (b) Algebraic Relation Finder: iteratively finds algebraic relationships and refines the candidate library based on each discovered relationship. In each iteration $k$, the best relationship is selected based on its score $S^k$, and a complexity score is assigned to each term within this relationship. This complexity score is used to select a term to refine the candidate library for the next iteration. (c) Dynamic Finder: the refined candidate library from the algebraic finder step is used to find the system of ODEs. (d) Assembled DAEs: the algebraic and differential equations from the algebraic and dynamic finder steps, respectively, are assembled to form the final set of DAEs (left), which is equivalent to the true set of DAEs (right).
  • Figure 2: SVD analysis of the candidate library $\Theta$: (a) library with monomials up to degree 2 and no noise, (b) library with monomials up to degree 2 and 15% noise, (c) library with monomials up to degree 4 and 15% noise, (d) tracking condition number after each refinement for library with monomials up to degree 4 and 15% noise. SVD analysis helps track redundancies in the candidate library space and determines when to stop iterations in the algebraic finder step.
  • Figure 3: Example 1: Application to chemical reaction networks. (a) Nested structure of CRN1 and CRN2 within CRN3 and examples of simulated time series with $5\%$ noise. Note that only CRN3 allows reversibility of $C \to BE_2$, as denoted by the green arrow. (b) Scaling of the data required for recovery of the correct algebraic equations in CRN1 across library degree $p$ and cardinality $|\Theta|$. Recovery is marked as non-robust when fewer than $80\%$ of random seeds recover the correct algebraic relationships. Error bars represent standard error across 10 random noise seeds. (c) Scaling of the data required for recovery of the correct algebraic equations for all CRNs (cardinalities noted) with fixed library degree $p=2$.
  • Figure 4: Example 3: Application to non-linear pendulum. (a) Schematic diagram: single pendulum (top) and double pendulum (bottom). (b) Scaled pixel data: single pendulum experiments (left) and double pendulum animation (right). (c) Data requirement: comparison between damped single pendulum and double pendulum. (d) SVD analysis: determining the number of algebraic constraints in the case of the chaotic double pendulum with degree 5 library.
  • Figure 5: Discovery of power-grid networks. (a) Power-grid networks of the IEEE-4 (top left), IEEE-9 (bottom left), and IEEE-39 (right) benchmark systems. Generator buses are represented by blue nodes on the outer circle, while aggregated load buses are represented by gray nodes. (b) Time series of phase dynamics $\phi_i(t)$ for all nodes in the IEEE-39 power grid under different levels of perturbations: large perturbations (top), large perturbations zoomed-in (middle), and small perturbations zoomed-in (bottom). (c) Performance comparison of SODAs applied to the IEEE-39 benchmark system across different SNR levels: small perturbations (left) and large perturbations (right). (d) Performance comparison of SODAs at 30 dB SNR across different benchmark cases (left) and different perturbation strengths (right).
  • ...and 6 more figures
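
The SVD analysis described in the Figure 2 caption, which tracks redundancies in the candidate library to decide when to stop refining, can be sketched as follows. This is an assumption-laden illustration rather than the paper's procedure: the toy data (x + y = 1 with an unrelated signal z), the column normalization, the relative tolerance `rel_tol`, and the function name `count_redundancies` are all hypothetical choices.

```python
import numpy as np

def count_redundancies(theta, rel_tol=1e-8):
    """Count near-zero singular values of a column-normalized library.

    Each near-zero singular value signals one (near-)exact linear relation
    among the library terms. rel_tol is an assumed cutoff relative to the
    largest singular value; noisy data would need a much looser cutoff.
    """
    cols = theta / np.linalg.norm(theta, axis=0)
    s = np.linalg.svd(cols, compute_uv=False)  # sorted in descending order
    return int(np.sum(s < rel_tol * s[0])), s

# Toy, noise-free data: x + y = 1 makes the columns [1, x, y] linearly dependent.
t = np.linspace(0.0, 5.0, 200)
x = np.exp(-t)
y = 1.0 - x
z = np.sin(3.0 * t)

library = np.column_stack([np.ones_like(t), x, y, z])
n_before, s_before = count_redundancies(library)

# Refinement step (assumed): drop one term of the discovered relation, here y.
refined = np.column_stack([np.ones_like(t), x, z])
n_after, s_after = count_redundancies(refined)

print("redundancies before refinement:", n_before)
print("redundancies after refinement:", n_after)
```

Once the count reaches zero, no exact linear relation remains among the library terms, which is one plausible signal for ending the algebraic search and moving on to the dynamics.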