Implicit Runge-Kutta based sparse identification of governing equations in biologically motivated systems
Mehrdad Anvari, Hamidreza Marasi, Hossein Kheiri
TL;DR
This work addresses the challenge of identifying governing differential equations from sparse, noisy data in biological and physical systems. It introduces IRK-SINDy, which couples high-order implicit Runge-Kutta methods with sparse regression, and adds a deep IRK-SINDy variant that uses a neural network to predict IRK stage values for efficient, derivative-free learning. The approach yields parsimonious, interpretable models and shows superior robustness to data scarcity and noise across a suite of benchmarks including linear/cubic oscillators, Lorenz, predator-prey, logistic growth, and FitzHugh–Nagumo, outperforming conventional SINDy and RK4-SINDy. The work advances data-driven model discovery for biology by leveraging A-stable IRKs and neural stage predictors, with plans to extend to other high-order implicit methods and to provide code.
Abstract
Identifying the governing equations of physical and biological systems from data is a long-standing challenge across scientific disciplines, and solving it provides mechanistic insight into how complex systems evolve. Common methods such as sparse identification of nonlinear dynamics (SINDy) often rely on precise derivative estimates, which makes them vulnerable to data scarcity and noise. This study presents a novel data-driven framework, termed IRK-SINDy, that integrates high-order implicit Runge-Kutta methods (IRKs) with sparse identification. The framework exhibits remarkable robustness to data scarcity and noise by leveraging the less restrictive step-size constraints of IRKs. Two methods for incorporating IRKs into sparse regression are introduced: one employs iterative schemes to numerically solve the resulting systems of nonlinear algebraic equations, while the other uses deep neural networks to predict the stage values of the IRKs. The performance of IRK-SINDy is demonstrated through numerical experiments on benchmark problems with varied dynamical behaviors, including linear and nonlinear oscillators, the Lorenz system, and biologically relevant models such as predator-prey dynamics, logistic growth, and the FitzHugh-Nagumo model. The results indicate that IRK-SINDy outperforms conventional SINDy and the RK4-SINDy framework, particularly under extreme data scarcity and noise, while yielding interpretable and generalizable models.
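To illustrate the first of the two approaches (iterative solution of the stage equations), the following is a minimal sketch rather than the authors' implementation. It assumes a two-stage Gauss-Legendre IRK scheme, a hypothetical candidate library `[1, x1, x2]`, and plain fixed-point iteration; the names `library`, `rhs`, and `irk_step` are illustrative and do not come from the paper. It shows only the forward IRK step through a library-based model $f(x) = \Theta(x)\,\Xi$; in a full identification loop the coefficient matrix $\Xi$ would be fitted by minimizing the mismatch between such predicted steps and the measurements, with a sparsity-promoting step.

```python
import numpy as np

# Butcher tableau of the two-stage Gauss-Legendre IRK method (order 4, A-stable).
# Assumption: the paper may use a different high-order IRK family.
SQ3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - SQ3 / 6.0],
              [0.25 + SQ3 / 6.0, 0.25]])
B = np.array([0.5, 0.5])


def library(x):
    """Hypothetical candidate library Theta(x) = [1, x1, x2] for a 2D state."""
    return np.concatenate(([1.0], x))


def rhs(x, xi):
    """Library-based model f(x) = Theta(x) @ Xi; Xi has shape (n_features, n_states)."""
    return library(x) @ xi


def irk_step(x, h, xi, n_iter=50, tol=1e-12):
    """Advance x by one IRK step of size h.

    The implicit stage equations K_i = f(x + h * sum_j A[i, j] * K_j) are
    solved by fixed-point iteration (an assumption; Newton-type schemes
    are a common alternative for stiff problems).
    """
    s = len(B)
    K = np.tile(rhs(x, xi), (s, 1))  # initial guess: every stage slope equals f(x)
    for _ in range(n_iter):
        K_new = np.array([rhs(x + h * A[i] @ K, xi) for i in range(s)])
        converged = np.max(np.abs(K_new - K)) < tol
        K = K_new
        if converged:
            break
    return x + h * B @ K
```

As a usage example, a damped linear oscillator $\dot{x}_1 = -0.1x_1 + 2x_2$, $\dot{x}_2 = -2x_1 - 0.1x_2$ corresponds to `xi = np.array([[0.0, 0.0], [-0.1, -2.0], [2.0, -0.1]])`, and `irk_step(np.array([2.0, 0.0]), 0.1, xi)` returns the predicted state after one step of size 0.1.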
