Turning mechanistic models into forecasters by using machine learning

Amit K. Chakraborty; Hao Wang; Pouria Ramazi

TL;DR

This work addresses forecasting of evolving dynamical systems by marrying data-driven discovery with time-varying parameters. It extends a SINDy-like framework to allow a subset of coefficients to change over time via sequential windowed fitting (STRR) and then forecasts these parameters with an ML model trained on covariates, notably weather variables. The learned dynamics, incorporating both constant and time-varying terms, are used as the forecaster by substituting predicted parameters into the ODEs. Theoretical finite-horizon error bounds are established, and empirical results across SIR, CR, greenhouse gas concentrations, and cyanobacteria demonstrate substantially improved learning and forecasting accuracy compared with fixed-parameter approaches and standard ML baselines, highlighting the practical value of combining mechanistic structure with data-driven parameter evolution for nonstationary systems.
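The sequential windowed fitting mentioned above can be illustrated on a toy problem. The sketch below is not the paper's STRR procedure: it assumes a one-term library for the scalar system $\dot{x} = -a(t)\,x$, uses plain least squares in place of sparse regression, and all names (`simulate`, `windowed_fit`, `true_a`) are hypothetical. It only shows how fitting on non-overlapping windows recovers a coefficient that changes over time.

```python
import math

def simulate(a_of_t, x0, dt, n_steps):
    """Generate data for dx/dt = -a(t) x, holding a(t) constant over each step."""
    xs = [x0]
    for k in range(n_steps):
        xs.append(xs[-1] * math.exp(-a_of_t(k * dt) * dt))
    return xs

def windowed_fit(xs, dt, window):
    """Estimate a in dx/dt = -a x separately on each non-overlapping window."""
    # Central finite differences approximate the derivative at interior points.
    dxs = [(xs[i + 1] - xs[i - 1]) / (2 * dt) for i in range(1, len(xs) - 1)]
    states = xs[1:-1]
    estimates = []
    for start in range(0, len(states) - window + 1, window):
        x_w = states[start:start + window]
        dx_w = dxs[start:start + window]
        # One-term library [x]: closed-form least squares for the coefficient of x.
        coef = sum(x * dx for x, dx in zip(x_w, dx_w)) / sum(x * x for x in x_w)
        estimates.append(-coef)
    return estimates

def true_a(t):
    # Assumed ground truth for the demo: the decay rate jumps at t = 5.
    return 0.5 if t < 5.0 else 1.5

dt = 0.01
xs = simulate(true_a, x0=1.0, dt=dt, n_steps=1000)
a_hat = windowed_fit(xs, dt, window=100)  # one estimate per window of 100 samples
```

A single fit over the whole series would return one compromise value for the coefficient, whereas the per-window estimates track the jump. The window length is a tuning choice here; shorter windows follow changes faster but are more sensitive to noise in the derivative estimates.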

Abstract

The governing equations of complex dynamical systems cannot always be identified from expert knowledge, especially when the underlying mechanisms are unknown. Data-driven discovery methods address this challenge by inferring governing equations from time-series data using a library of functions constructed from the measured variables. However, these methods typically assume time-invariant coefficients, which limits their ability to capture evolving system dynamics. To overcome this limitation, we allow some of the parameters to vary over time, learn their temporal evolution directly from data, and infer a system of equations that incorporates both constant and time-varying parameters. We then transform this framework into a forecasting model by predicting the time-varying parameters and substituting these predictions into the learned equations. The model is validated on datasets for Susceptible-Infected-Recovered, Consumer--Resource, greenhouse gas concentration, and Cyanobacteria cell count. By dynamically adapting to temporal shifts, our proposed model achieves a mean absolute error below 3\% for learning a time series and below 6\% for forecasting up to a month ahead. We additionally compare forecasting performance against CNN-LSTM and Gradient Boosting Machine (GBM) models and show that our model outperforms these methods across most datasets. Our findings demonstrate that integrating time-varying parameters into data-driven discovery of differential equations improves both modeling accuracy and forecasting performance.
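The forecasting step described in the abstract (predict the time-varying parameters, then substitute them into the learned equations) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it takes a standard SIR system with a time-varying transmission rate and a constant recovery rate, and `predict_beta` is a hypothetical stand-in for a trained ML model that maps a covariate (here a made-up temperature series) to the parameter. The coefficients, initial state, and step size are arbitrary demo values.

```python
def sir_rhs(state, beta, gamma):
    """Right-hand side of the SIR equations in population fractions."""
    s, i, r = state
    new_inf = beta * s * i
    return (-new_inf, new_inf - gamma * i, gamma * i)

def rk4_step(state, beta, gamma, dt):
    """One classical Runge-Kutta step with the parameters held fixed."""
    def shifted(k, h):
        return tuple(x + h * d for x, d in zip(state, k))
    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(k1, 0.5 * dt), beta, gamma)
    k3 = sir_rhs(shifted(k2, 0.5 * dt), beta, gamma)
    k4 = sir_rhs(shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def forecast(state0, beta_pred, gamma, dt):
    """Integrate forward, substituting one predicted beta per time step."""
    traj = [state0]
    for beta in beta_pred:
        traj.append(rk4_step(traj[-1], beta, gamma, dt))
    return traj

def predict_beta(temperature):
    # Hypothetical stand-in for a trained ML model; the linear form is assumed.
    return 0.3 + 0.01 * (temperature - 20.0)

temps = [20.0 + 0.2 * day for day in range(30)]   # made-up covariate forecast
beta_pred = [predict_beta(t) for t in temps]      # predicted parameter path
traj = forecast((0.99, 0.01, 0.0), beta_pred, gamma=0.1, dt=1.0)
```

Because the parameter predictions enter only through the right-hand side, the forecast keeps the structure of the equations; in this example the three compartments still sum to one at every step regardless of the predicted `beta` values.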

Paper Structure (22 sections, 4 theorems, 58 equations, 11 figures, 6 tables)

This paper contains 22 sections, 4 theorems, 58 equations, 11 figures, 6 tables.

Introduction
Methods
Problem formulation
Discovering time-varying dynamics
Forecasting from learned dynamics
Case studies
SIR model
CR model
Gas concentration
Cyanobacteria cell count
Simulated weather data
ML models, error metrics, and cross-validation
Results
Theoretical guarantees
Finite-horizon forecast error bounds
...and 7 more sections

Key Result

Theorem 4.1.1

Suppose the Lipschitz assumption (ass:lipschitz) and the parameter-error assumption (ass:param-err) hold, and let $\delta = \varepsilon_{\mathrm{lib}} + B_{\Theta}\varepsilon_{\mathrm{par}}$. Let $\mathbf{x}:[0,H]\to D$ solve the true system $\dot{\mathbf{x}}(t)=\mathbf{f}(t,\mathbf{x}(t))$ with initial condition $\mathbf{x}(0)=\mathbf{x}_0$, and

Figures (11)

Figure 1: Schematic of data-driven discovery of governing equations and forecasting from the learned system. a) Time-series data of variables are gathered from natural or simulated systems. b) A sparse regression problem is solved over the whole time series to identify active terms and constant coefficients. Variables in gray represent non-active terms, while variables in blue represent active terms with constant coefficients. Variables in boxes represent generated terms. c) The top N candidate terms are chosen based on the correlation of the active candidate terms with the derivative of the state variables. The coefficients of these terms are treated as time-varying parameters. By default, the bias term is also considered time-varying. d) The time-series data are split into intervals, and sparse regression is performed within each interval to determine the time-varying parameters of the top candidate terms. Purple represents top candidate terms and their time-varying parameters, while blue indicates terms with constant coefficients. e) An ML model is employed to forecast the time-varying parameters. Inputs are relevant predictors, and outputs are the time-varying parameters.
Figure 2: Illustration of the sequential, non-overlapping training, testing, and validation scheme used in this study. The training period (red) covers the majority of the time series. The validation window blocks (blue), equal in length to the test window, are positioned immediately before the test period (yellow). For each fold of the test, only the validation window is expanded forward while the training set remains fixed, producing multiple validation folds without expanding or overlapping the training data.
Figure 3: Average learning MAEs for the (a) SIR dataset, (b) CR dataset, (c) gases dataset, and (d) CB dataset. Green bars represent the time-varying parameter model, and orange bars represent the fixed-parameter model. MAEs were computed for each variable across different noise levels or monitoring stations within each dataset and then averaged.
Figure 4: Average forecasting MAEs using the expanding-window cross-validation approach for the (a) SIR dataset, (b) CR dataset, (c) gases dataset, and (d) CB dataset. Green bars represent the time-varying parameter model, orange bars represent the fixed-parameter model, red bars represent the CNN-LSTM model, and purple bars represent the GBM model. MAEs were computed for each variable across different noise levels or monitoring stations within each dataset and then averaged.
Figure 5: Average forecasting MAEs by the time-varying parameter model based on optimal configuration and cross-validation for the (a) SIR dataset, (b) CR dataset, (c) gases dataset, and (d) CB dataset. Green bars represent the error by the model based on the cross-validation (CV) method, and pink bars represent the optimal configuration forecasting error of the model. MAEs were computed for each variable across different noise levels or monitoring stations within each dataset and then averaged.
...and 6 more figures
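Step (c) of Figure 1, choosing the top N candidate terms by their correlation with the derivative, can be sketched as below. The caption does not specify the correlation measure, so ranking by absolute Pearson correlation is an assumption, as are the function names and the synthetic data. The bias term, which the caption says is time-varying by default, is not modeled here.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (std_u * std_v)

def top_n_terms(library, derivative, n):
    """Rank candidate terms by |correlation| with the derivative; keep the top n."""
    scores = {name: abs(pearson(vals, derivative)) for name, vals in library.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Synthetic example: two measured variables and one generated product term.
xs = [0.1 * k for k in range(1, 51)]
ys = [math.sin(x) for x in xs]
library = {
    "x": xs,
    "y": ys,
    "x*y": [a * b for a, b in zip(xs, ys)],
}
# Assumed derivative signal, dominated by the x*y term.
derivative = [2.0 * a * b + 0.1 * a for a, b in zip(xs, ys)]

selected = top_n_terms(library, derivative, n=2)
```

The coefficients of the terms in `selected` would then be the ones refit per interval in step (d), while the remaining active terms keep their constant coefficients.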

Theorems & Definitions (11)

Example 2.1.1
Theorem 4.1.1: Finite-horizon forecast error bound
proof
Theorem 4.1.2: Split-model forecast bound: only $\Theta_{\mathrm{top}N}$ forecast errors enter
proof
Remark 4.1.1
Definition 4.1.1: Best uniform coefficient approximation errors
Lemma 4.1.1: Step-function approximation beats constant approximation
proof
Theorem 4.1.3: When time-varying parameters dominate fixed parameters
...and 1 more
