EKF-SINDy: Empowering the extended Kalman filter with sparse identification of nonlinear dynamics

Luca Rosafalco, Paolo Conti, Andrea Manzoni, Stefano Mariani, Attilio Frangi

TL;DR

This work introduces EKF-SINDy, a data-driven framework that integrates an Extended Kalman Filter with Sparse Identification of Nonlinear Dynamics to identify nonlinear system dynamics and parameters from noisy, potentially partial observations. By training a SINDy model offline and evolving it in the EKF prediction step, the method yields a computationally efficient surrogate that also provides Jacobians needed by the EKF, enabling joint state and parameter estimation for autonomous and non-autonomous systems. The approach is demonstrated on a seismic shear-building model and a partially observed nonlinear resonator, with time-delay embedding enabling recovery of hidden states and robust estimation of stiffness and coupling parameters; results show accurate tracking, uncertainty quantification, and even effective performance when operating outside the SINDy training range. Overall, EKF-SINDy advances real-time digital-twin development by combining physics-informed sparsity with data assimilation, offering a scalable, interpretable alternative to fully black-box neural models for nonlinear dynamics.

Abstract

Measured data from a dynamical system can be assimilated into a predictive model by means of Kalman filters. Nonlinear extensions of the Kalman filter, such as the Extended Kalman Filter (EKF), are required to enable the joint estimation of (possibly nonlinear) system dynamics and of input parameters. To construct the evolution model used in the prediction phase of the EKF, we propose to rely on the Sparse Identification of Nonlinear Dynamics (SINDy). SINDy makes it possible to identify the evolution model directly from preliminarily acquired data, thus avoiding possible bias due to wrong assumptions and incorrect modelling of the system dynamics. Moreover, the numerical integration of a SINDy model leads to great computational savings compared to alternative strategies based on, e.g., finite elements. Last, SINDy allows an immediate definition of the Jacobian matrices required by the EKF to identify system dynamics and properties, a derivation that is usually extremely involved with physical models. As a result, combining the EKF with SINDy provides a data-driven, computationally efficient, easy-to-apply approach for the identification of nonlinear systems, capable of robust operation even outside the training range of SINDy. To demonstrate the potential of the approach, we address the identification of a linear non-autonomous system consisting of a shear building model excited by real seismograms, and the identification of a partially observed nonlinear system. The challenge arising from the use of SINDy when the system state is not entirely accessible is addressed by means of time-delay embedding. The great accuracy and the small uncertainty associated with the state identification, where the state has been augmented to include system properties, underscore the great potential of the proposed strategy, paving the way for the setup of predictive digital twins in different fields.
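
To make the claim about Jacobians concrete, the following is a minimal sketch of one EKF predict/correct cycle driven by a SINDy-style model. Everything specific in it is an assumption made for illustration and is not taken from the paper: a hypothetical 1-DOF oscillator with unknown stiffness `k`, a three-term library `[q, v, k*q]`, a hand-written coefficient matrix `XI` standing in for the result of offline sparse regression, and forward-Euler time integration. The point is only that, once the dynamics are written as library terms times coefficients, the Jacobian needed by the EKF follows from differentiating the library terms.

```python
import numpy as np

# Hypothetical SINDy coefficients (assumed, not from the paper).
# Rows: library terms [q, v, k*q]; columns: [dq/dt, dv/dt].
C_DAMP = 0.1
XI = np.array([[0.0, 0.0],
               [1.0, -C_DAMP],
               [0.0, -1.0]])

def library(z):
    """Library terms evaluated on the augmented state z = [q, v, k]."""
    q, v, k = z
    return np.array([q, v, k * q])

def library_jacobian(z):
    """Analytic derivative of each library term with respect to z."""
    q, v, k = z
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [k,   0.0, q]])

def f_aug(z):
    """Augmented dynamics: SINDy model for [q, v], zero drift for k
    (the random-walk noise on k enters through Q in the filter)."""
    return np.append(library(z) @ XI, 0.0)

def jac_aug(z):
    """Jacobian of f_aug, obtained directly from the library derivatives."""
    J = np.zeros((3, 3))
    J[:2, :] = XI.T @ library_jacobian(z)
    return J

def ekf_step(z, P, y, dt, H, Q, R):
    """One EKF cycle with forward-Euler discretisation (an assumption)."""
    n = len(z)
    # Prediction: propagate state and covariance with the SINDy model.
    A = np.eye(n) + dt * jac_aug(z)
    z_prior = z + dt * f_aug(z)
    P_prior = A @ P @ A.T + Q
    # Correction: assimilate the observation y.
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    z_post = z_prior + K @ (y - H @ z_prior)
    P_post = (np.eye(n) - K @ H) @ P_prior
    return z_post, P_post

# Usage: observe the displacement q only and run a single step.
H = np.array([[1.0, 0.0, 0.0]])
z, P = np.array([0.3, -0.2, 2.0]), np.eye(3)
z, P = ekf_step(z, P, y=np.array([0.29]), dt=0.01, H=H,
                Q=1e-6 * np.eye(3), R=np.array([[1e-2]]))
```

Treating the parameter as an extra state with zero drift is what lets a single filter estimate state and stiffness jointly. A higher-order integrator could replace the Euler step, at the cost of a more involved discrete Jacobian.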

Paper Structure (21 sections, 30 equations, 12 figures, 4 tables, 2 algorithms)

Figures (12)

  • Figure 1: EKF-SINDy methodology. In the prediction phase, the dynamical system identified by SINDy, together with a random walk equation modelling the parameter evolution, advances the augmented state $\boldsymbol{\varkappa}=[\mathbf{x}^{\top},\boldsymbol{\phi}^{\top}]^{\top}$, comprising the system state $\mathbf{x}$ and the parameters $\boldsymbol{\phi}$, to a prior estimate $\boldsymbol{\varkappa}^{-}$; in the correction phase, the EKF assimilates the system observations $\bar{\mathbf{y}}$, resulting in the updated augmented state $\boldsymbol{\varkappa}^{+}$.
  • Figure 2: Time-delay embedding procedure. During the offline phase, time-series observations of the system are time-delay embedded to form the Hankel matrix $\mathbf{A}$. The time-delay coordinates are then obtained by projecting the delayed signals onto the dominant $\eta$ left-singular vectors of the SVD (i.e., the columns of $\tilde{\mathbf{U}}$) and rescaling them by dividing by the corresponding singular values, stored in $\tilde{\mathbf{S}}$. Finally, SINDy is trained offline on these time-delay variables, which approximate the full state-space variables of the system under consideration (we stress this aspect by indicating $\eta=n$, even though in general $\eta$ could differ from $n$). Once trained, SINDy can be employed in the online prediction phase to forecast the evolution of the time-delay variables, from which the observations can be reconstructed by back-projection (i.e., by multiplication by $\tilde{\mathbf{U}}\tilde{\mathbf{S}}$). The reconstructed observations are then extracted from the first column, which contains the unshifted reconstructed signal, and compared to the actual testing observations (assimilated online) to perform the final correction phase. It follows that the state-to-observation map, $\mathbf{h}$, is straightforwardly obtained as $\mathbf{h}=\mathbf{e}_1 ^\top\tilde{\mathbf{U}}\tilde{\mathbf{S}}$, where $\mathbf{e}_1$ is the basis vector extracting the first column of $\tilde{\mathbf{U}}\tilde{\mathbf{S}}$.
  • Figure 3: Shear building model. Horizontal displacements, velocities, and accelerations at the two floors are recorded.
  • Figure 4: Shear building. In the top graph, the estimated parameter $\hat{k}$ is plotted in red against the target $\bar{k}$ in black. The red shaded area represents the $95\%$ confidence interval of the estimates, determined using the posterior covariance. In the other graphs, the evolution of the system response tracked by the filter is plotted in red against the real system dynamics in black. For the sake of presentation, the outcomes of the $60$ s analyses are truncated after $40$ s.
  • Figure 5: Shear building. Zoom-in of Fig. 4: comparison between the time evolution of the system response tracked by the filter (red dotted lines) and the target response (black lines) for the incorrect (left) and updated (right) values of $k$. Red shaded areas represent the $95\%$ confidence intervals determined using the posterior covariance. Black solid lines represent the noise-corrupted signals, while black dotted lines represent their uncorrupted versions.
  • ...and 7 more figures
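
The time-delay embedding described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names `hankel_matrix` and `delay_coordinates` are hypothetical, the Hankel matrix is laid out with delays along the rows (so the unshifted signal is its first row, which may differ from the paper's layout), and the test signal is a plain sinusoid, for which two delay coordinates are enough.

```python
import numpy as np

def hankel_matrix(y, n_delays):
    """Stack delayed copies of y; row i holds the signal shifted by i samples."""
    n_cols = len(y) - n_delays + 1
    return np.stack([y[i:i + n_cols] for i in range(n_delays)])

def delay_coordinates(A, eta):
    """Project onto the eta dominant left-singular vectors and rescale
    by the singular values; also return the observation map h."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    U_t, S_t = U[:, :eta], np.diag(s[:eta])
    X = np.linalg.solve(S_t, U_t.T @ A)   # S^{-1} U^T A, shape (eta, n_cols)
    h = (U_t @ S_t)[0]                    # e_1^T U S, shape (eta,)
    return X, h

# Usage: embed a single observed channel and reconstruct it from X.
t = np.arange(0.0, 20.0, 0.01)
y = np.sin(t)
A = hankel_matrix(y, n_delays=50)
X, h = delay_coordinates(A, eta=2)
y_rec = h @ X   # back-projected, unshifted signal
```

In a filter, `X` would play the role of the state on which SINDy is trained, and `h` would serve as the linear state-to-observation map in the correction phase. For noisy or richer signals, `eta` would have to be chosen from the decay of the singular values rather than fixed in advance.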