Towards Robust Data-Driven Automated Recovery of Symbolic Conservation Laws from Limited Data

Tracey Oellerich, Maria Emelianenko

TL;DR

This work tackles recovering conservation laws from data without prior dynamical knowledge by casting the problem as a null-space search of a derivative library $\Gamma$ via SVD. A robust, automated framework selects an optimal candidate library $\Theta$ by analyzing singular-value gaps, enabling identification of the number and form of conserved quantities from limited, noisy data. The method demonstrates accurate recovery on multiple benchmarks, including nonlinear and multi-law systems, and extends to a MAPK pathway, highlighting practical applicability. Perturbation analysis and derivative-estimation strategies underpin data-efficiency and stability, while the framework remains compatible with other learning approaches. Overall, the approach offers a principled, data-driven route to symbolic conservation laws with explicit guidance on data requirements and noise handling.
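
The null-space idea can be illustrated with a minimal sketch. It assumes, following the usual construction in this line of work, that $\Gamma$ is the time derivative of the candidate library $\Theta$ evaluated along the data, so a coefficient vector $\xi$ with $\Gamma \xi \approx 0$ describes a conserved quantity $\Theta(x)\xi$. The toy reversible reaction, the helper name `null_space_candidates`, and the tolerance `rel_tol` are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def null_space_candidates(gamma, rel_tol=1e-8):
    """Return all singular values and the right singular vectors whose
    singular value falls below rel_tol times the largest one.
    (rel_tol is an illustrative threshold, not a value from the paper.)"""
    _, s, vt = np.linalg.svd(gamma, full_matrices=False)
    mask = s <= rel_tol * s[0]
    return s, vt[mask]

# Hypothetical toy system A <-> B:
#   x1' = -k1*x1 + k2*x2,  x2' = k1*x1 - k2*x2,  so x1 + x2 is conserved.
k1, k2 = 1.0, 0.5
total = 1.0
t = np.linspace(0.0, 5.0, 50)

# Closed-form solution with x1(0) = 1, x2(0) = 0.
x1_eq = k2 * total / (k1 + k2)
x1 = x1_eq + (1.0 - x1_eq) * np.exp(-(k1 + k2) * t)
x2 = total - x1

# Exact derivatives from the right-hand side (no noise in this sketch).
x1_dot = -k1 * x1 + k2 * x2
x2_dot = k1 * x1 - k2 * x2

# Linear library Theta = [x1, x2]  =>  Gamma = d/dt Theta = [x1', x2'].
gamma = np.column_stack([x1_dot, x2_dot])

s, xi = null_space_candidates(gamma)
print("singular values:", s)
print("null-space vector(s):", xi)
```

In this noise-free setting the single recovered vector is proportional to $(1, 1)$, that is, $x_1 + x_2 = \text{const}$. With noisy derivatives the smallest singular value is no longer numerically zero, which is where the gap analysis mentioned above becomes relevant.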

Abstract

Conservation laws are an inherent feature of many systems that model real-world phenomena, particularly biological and chemical systems. If the form of the underlying dynamical system is known, methods from linear algebra and algebraic geometry can be used to identify the conservation laws. Our work focuses on using data-driven methods to identify the conservation law(s) when the system dynamics are unknown. Building in part upon the ideas proposed in [arXiv:1811.00961], we develop a robust data-driven computational framework that automates the process of identifying the number and type of the conservation law(s) while keeping the amount of required data to a minimum. We demonstrate that, owing to the relative stability of singular vectors under noise, we are able to reconstruct the correct conservation laws without excessive parameter tuning. While we focus primarily on biological examples, the framework proposed herein is suitable for a variety of data science applications and can be coupled with other machine learning approaches.

Paper Structure

This paper contains 19 sections, 4 theorems, 21 equations, 8 figures, 18 tables, and 1 algorithm.

Key Result

Theorem 4.1

For any additive perturbation $\tilde{A} = A + \mathcal{E}$, the following bound on singular values $\sigma_i(A)$ is valid:
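
The displayed bound did not survive in this summary. For reference, the textbook form of Weyl's inequality for singular values is given below. The paper's exact statement, norm, and indexing may differ, and the matrix dimensions $m \times n$ are an assumption introduced here.

$$
\lvert \sigma_i(\tilde{A}) - \sigma_i(A) \rvert \le \lVert \mathcal{E} \rVert_2, \qquad i = 1, \dots, \min(m, n).
$$

In words, an additive perturbation can shift each singular value by at most the spectral norm of the perturbation.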

Figures (8)

  • Figure 2.1: Flowchart detailing the process of identifying conservation laws from data.
  • Figure 4.1: Log-log plots showing the effect of noise on (Left) the singular values, $\sigma_i$, and (Right) the corresponding $\lVert \Gamma v_i\rVert$. Each example is computed for the library corresponding to the known conservation law, or for the linear library in the example with no conservation law. Errors corresponding to $\varepsilon_{\textbf{x}}$, $\varepsilon_{\dot{\textbf{x}}}$, and $\mathcal{E}_{\Gamma}$ are shown for comparison.
  • Figure 5.1: Flowchart detailing the process for selecting the optimal $\Theta$-library. For each candidate library, the singular values and vectors are recorded, and count and $\delta$ are calculated. The optimal $\Theta$ corresponds to the candidate with $\textrm{count} \in (0,n]$ and the largest $\delta$. Once it has been selected, the corresponding singular vector(s) give the coefficients of the functions in the chosen $\Theta$-library.
  • Figure 6.1: Singular values corresponding to different library configurations for Example 1 with $N$ points and given noise variance.
  • Figure 6.2: Singular values corresponding to different library configurations for Example 2 with $N$ points and given noise variance.
  • ...and 3 more figures
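
The library-selection loop described in the Figure 5.1 caption can be sketched roughly as follows. The caption does not define count or $\delta$, so this sketch assumes that count is the number of singular values below a relative tolerance, that $\delta$ is the base-10 log gap across the null-space boundary, and that $n$ is the number of state variables. The function names, the tolerance, and the harmonic-oscillator test data are all hypothetical.

```python
import numpy as np

def score_library(gamma, rel_tol=1e-8):
    """Return (count, delta, null_vectors) for one Gamma matrix.

    Assumed definitions (not taken from the paper):
      count = number of singular values below rel_tol * sigma_max
      delta = log10 ratio of the smallest retained singular value
              to the largest discarded one
    """
    _, s, vt = np.linalg.svd(gamma, full_matrices=False)
    count = int(np.sum(s <= rel_tol * s[0]))
    if count == 0 or count == len(s):
        return count, 0.0, vt[:0]
    floor = np.finfo(float).tiny  # avoid division by an exact zero
    delta = float(np.log10(s[-count - 1] / max(s[-count], floor)))
    return count, delta, vt[-count:]

def select_library(gammas, n_states):
    """Pick the candidate with count in (0, n_states] and the largest delta."""
    best = None
    for name, gamma in gammas.items():
        count, delta, vecs = score_library(gamma)
        if 0 < count <= n_states and (best is None or delta > best[2]):
            best = (name, count, delta, vecs)
    return best

# Hypothetical test data: x' = y, y' = -x, which conserves x^2 + y^2.
t = np.linspace(0.0, 2.0 * np.pi, 40)
x, y = np.cos(t), -np.sin(t)
x_dot, y_dot = y, -x

gammas = {
    # Theta = [x, y]
    "linear": np.column_stack([x_dot, y_dot]),
    # Theta = [x^2, x*y, y^2], differentiated with the chain rule
    "quadratic": np.column_stack(
        [2 * x * x_dot, x_dot * y + x * y_dot, 2 * y * y_dot]
    ),
}

name, count, delta, vecs = select_library(gammas, n_states=2)
print(name, count, vecs)
```

Here the linear library has no near-zero singular value and is rejected, while the quadratic library yields one null vector proportional to $(1, 0, 1)$, corresponding to $x^2 + y^2$.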

Theorems & Definitions (12)

  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 4.1: Weyl [WeylLawson95]
  • Corollary 4.1: Weyl-type theorem for $\Gamma$ matrix
  • Proof
  • Corollary 4.2
  • Theorem 4.2: Cai
  • Remark 4
  • Remark 5
  • ...and 2 more