Towards Robust Data-Driven Automated Recovery of Symbolic Conservation Laws from Limited Data
Tracey Oellerich, Maria Emelianenko
TL;DR
This work tackles recovering conservation laws from data without prior dynamical knowledge by casting the problem as a null-space search of a derivative library $\Gamma$ via SVD. A robust, automated framework selects an optimal candidate library $\Theta$ by analyzing singular-value gaps, enabling identification of the number and form of conserved quantities from limited, noisy data. The method demonstrates accurate recovery on multiple benchmarks, including nonlinear and multi-law systems, and extends to a MAPK pathway, highlighting practical applicability. Perturbation analysis and derivative-estimation strategies underpin data-efficiency and stability, while the framework remains compatible with other learning approaches. Overall, the approach offers a principled, data-driven route to symbolic conservation laws with explicit guidance on data requirements and noise handling.
Abstract
Conservation laws are inherent to many models of real-world phenomena, particularly models of biological and chemical systems. If the form of the underlying dynamical system is known, methods from linear algebra and algebraic geometry can be used to identify the conservation laws. Our work focuses on data-driven identification of conservation laws when the system dynamics are unknown. Building in part on the ideas proposed in [arXiv:1811.00961], we develop a robust data-driven computational framework that automates identification of the number and type of conservation laws while keeping the amount of required data to a minimum. We demonstrate that, owing to the relative stability of singular vectors under noise, we can reconstruct the correct conservation laws without excessive parameter tuning. While we focus primarily on biological examples, the proposed framework is suitable for a variety of data science applications and can be coupled with other machine learning approaches.
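To make the null-space idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy harmonic oscillator (not one of the paper's benchmarks), a hand-picked polynomial library $\Theta = [x, v, x^2, xv, v^2]$, derivatives estimated by finite differences, and a simple largest-ratio rule for locating the singular-value gap. All variable names are illustrative. A conserved quantity $g = \Theta\xi$ satisfies $\Gamma\xi = 0$, where $\Gamma$ is the time derivative of $\Theta$ along the trajectory, so $\xi$ can be read off the right singular vectors associated with the near-zero singular values of $\Gamma$.

```python
import numpy as np

# Toy data (assumption): one trajectory of a harmonic oscillator,
# for which x^2 + v^2 is conserved. Only samples of (t, x, v) are used below.
t = np.linspace(0.0, 10.0, 2001)
x = np.cos(t)
v = -np.sin(t)

# Estimate time derivatives from the data alone (finite differences).
dx = np.gradient(x, t, edge_order=2)
dv = np.gradient(v, t, edge_order=2)

# Candidate library Theta = [x, v, x^2, x*v, v^2] (illustrative choice).
# Gamma holds d(Theta)/dt column by column, via the chain rule.
names = ["x", "v", "x^2", "x*v", "v^2"]
Gamma = np.column_stack([
    dx,               # d/dt x
    dv,               # d/dt v
    2 * x * dx,       # d/dt x^2
    dx * v + x * dv,  # d/dt (x*v)
    2 * v * dv,       # d/dt v^2
])

# SVD of Gamma; singular values are returned in descending order.
_, s, Vt = np.linalg.svd(Gamma, full_matrices=False)

# Heuristic gap detection (assumption): the largest ratio between
# consecutive singular values marks the numerical rank of Gamma.
ratios = s[:-1] / s[1:]
rank = int(np.argmax(ratios)) + 1
n_laws = len(s) - rank

# Rows of Vt beyond the numerical rank span the approximate null space.
xi = Vt[rank:]
for row in xi:
    row = row * np.sign(row[np.argmax(np.abs(row))])  # fix the arbitrary sign
    print({n: round(float(c), 4) for n, c in zip(names, row)})
```

In this toy case the recovered coefficient vector weights $x^2$ and $v^2$ equally and leaves the other terms near zero, matching the conserved energy up to scaling. The largest-ratio rule always reports a gap, so in practice it would need a threshold or the kind of library-selection analysis the paper describes before the result is trusted.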
