Topological Approach for Data Assimilation

Max M. Chumley, Firas A. Khasawneh

TL;DR

This paper introduces TADA, a noise-agnostic data assimilation framework that blends random feature map forecasting with topological data analysis. By differentiating through persistence diagrams and minimizing Wasserstein-based topological differences between forecast and observations, TADA updates the model without requiring measurement noise statistics. The method is validated on chaotic Lorenz 63 and extended to Lorenz 96, showing robustness to white, pink, and Brownian noise and scalability to higher dimensions, albeit with careful hyperparameter tuning. The work provides code for reproducibility and highlights a potentially impactful approach for topology-informed data assimilation in complex dynamical systems.

Abstract

Many dynamical systems are difficult or impossible to model using high-fidelity, physics-based models. Consequently, researchers are relying more on data-driven models to make predictions and forecasts. Because they are trained on limited data, machine learning models often deviate from the true system states over time and need to be continually updated through data assimilation as new measurements are taken. Classical data assimilation algorithms typically require knowledge of the measurement noise statistics, which may be unknown. In this paper, we introduce a new data assimilation algorithm with a foundation in topological data analysis. By leveraging the differentiability of functions of persistence, gradient descent optimization is used to minimize topological differences between measurements and forecast predictions by tuning the data-driven model coefficients, without using noise information from the measurements. We describe the method and focus on its capabilities and performance using the chaotic Lorenz 63 system as an example, and we also show that the method works on a higher-dimensional example with the Lorenz 96 system.

Paper Structure

This paper contains 21 sections, 7 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Data Assimilation Concept: improving model results by considering observations to obtain an estimate closer to the ground truth.
  • Figure 2: Persistence for point clouds. Each snapshot in (a)--(e) shows the Rips complex for an increasing disc radius $\epsilon$. One prominent loop forms (is born) at $\epsilon=b_6$ in (e) and fills in (dies) in (f) at $\epsilon=d_6$. The other 5 loops are small and result from noise, so they are born and die at nearly equal values of $\epsilon$. These loops are represented in the 1D persistence diagram (g) as (birth, death) pairs; the non-prominent loops form and die quickly, as shown by the points near the diagonal.
  • Figure 3: Mapping a point cloud $\theta$ to a real-valued persistence feature using the map composition $V\circ B$.
  • Figure 4: Persistence diagram differentiation process. The top row shows the process of tracking the birth and death of the loop from the original point cloud along with using the map $B$ to obtain its persistence diagram. The bottom row performs the same process on a perturbed point cloud and demonstrates how the change in the persistence pair forms the derivative $d_{\theta,\tilde{B}}B$.
  • Figure 5: Persistence differentiation for point clouds. The point cloud $P$ is perturbed to $P'$ and the 0D persistence diagram is differentiated with respect to this perturbation.
  • ...and 10 more figures
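
The captions above refer to persistence diagrams of point clouds, including the 0D diagram in Figure 5. As a minimal sketch of what a 0D diagram contains, and not the paper's implementation, the snippet below computes the finite 0D Rips persistence pairs of a small point cloud with a union-find pass over the sorted pairwise distances. The function name `zero_dim_persistence`, the example cloud, and the choice to report deaths as full pairwise distances (rather than half distances, as a disc-radius convention would give) are assumptions made for illustration.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Finite 0D Rips persistence pairs (birth, death) of a point cloud.

    Every point is its own connected component at scale 0. A component
    dies at the length of the edge that first merges it into another one.
    Assumption: deaths are reported as pairwise distances.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (the Rips filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # This edge merges two components, so one of them dies here.
            parent[root_i] = root_j
            pairs.append((0.0, length))
    return pairs


# Hypothetical four-point cloud used only for illustration.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 0.0)]
diagram = zero_dim_persistence(cloud)
print(diagram)
```

A cloud of n points yields n - 1 finite pairs; the one component that never dies is omitted here. Because each death value is the length of a specific edge, moving an endpoint of that edge changes the corresponding pair, which is the kind of dependence on the point cloud that the differentiation captions describe.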