Local Sequential MCMC for Data Assimilation with Applications in Geoscience

Hamza Ruzayqat, Omar Knio

TL;DR

This paper designs a localization approach within the SMCMC framework that focuses on regions where observations are located and restricts the transition densities included in the filtering distribution of the state to these regions. This greatly reduces the effective degrees of freedom and thus improves efficiency.

Abstract

This paper presents a new data assimilation (DA) scheme based on a sequential Markov Chain Monte Carlo (SMCMC) DA technique [Ruzayqat et al. 2024], which is provably convergent and has recently been used for filtering, particularly for high-dimensional, non-linear, and potentially non-Gaussian state-space models. Unlike particle filters, which can be considered exact methods and can be used for filtering non-linear, non-Gaussian models, SMCMC does not assign weights to the samples/particles; the method therefore does not suffer from weight degeneracy when a relatively small number of samples is used. We design a localization approach within the SMCMC framework that focuses on regions where observations are located and restricts the transition densities included in the filtering distribution of the state to these regions. This greatly reduces the effective degrees of freedom and thus improves efficiency. We test the new technique on a high-dimensional ($d \sim 10^4 - 10^5$) linear Gaussian model and on non-linear shallow water models with Gaussian noise, using real and synthetic observations. For two of the numerical examples, the observations mimic the data generated by the Surface Water and Ocean Topography (SWOT) mission led by NASA: a swath of ocean height observations that changes location at every assimilation time step. We also use a set of real ocean drifter observations in which the drifters move according to the ocean kinematics and are assumed to have uncertain locations at the time of assimilation. We show that when higher accuracy is required, the proposed algorithm is superior in both efficiency and accuracy to competing ensemble methods and the original SMCMC filter.
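The weight-degeneracy issue mentioned in the abstract can be illustrated with a small experiment. The sketch below is not taken from the paper: it assumes a toy setting (standard normal prior samples, an identity observation operator, unit-variance Gaussian noise, and an observation at zero) and the hypothetical helper names `effective_sample_size` and `ess_for_dimension`. It computes the effective sample size (ESS) of normalized importance weights for a fixed number of particles as the state dimension grows.

```python
import numpy as np

def effective_sample_size(log_w):
    """ESS = 1 / sum(w_i^2) for normalized importance weights w_i."""
    log_w = log_w - np.max(log_w)      # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def ess_for_dimension(d, n_particles=100, seed=0):
    """Toy setup (an assumption, not the paper's model):
    particles ~ N(0, I_d), observation y = 0, likelihood N(y; x, I_d)."""
    rng = np.random.default_rng(seed)
    particles = rng.standard_normal((n_particles, d))
    y = np.zeros(d)
    # Gaussian log-likelihood up to an additive constant
    log_w = -0.5 * np.sum((y - particles) ** 2, axis=1)
    return effective_sample_size(log_w)

ess_low = ess_for_dimension(1)       # low-dimensional state
ess_high = ess_for_dimension(1000)   # high-dimensional state
print(f"ESS with d=1:    {ess_low:.1f} of 100 particles")
print(f"ESS with d=1000: {ess_high:.1f} of 100 particles")
```

In this toy setting the ESS stays close to the particle count for `d = 1` and collapses toward a single particle for `d = 1000`. An unweighted sampler such as SMCMC has no importance weights, so this particular failure mode does not arise.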

Paper Structure

This paper contains 11 sections, 1 theorem, 41 equations, 8 figures, 3 tables, 5 algorithms.

Key Result

Proposition A.1

Under Assumption A, for any $k\geq 1$, data $\mathbf{Y}_{t_{1:k}}$, and $p\geq 1$, there exists a constant $C_{p,k}(\beta_k,\mathbf{Y}_{t_{1:k}}) <\infty$ such that for any bounded Borel function $\varphi_k$ on $A_k$:

Figures (8)

  • Figure 1: Schematic showing a $10 \times 10$ square grid with nine subdomains labeled from 0 to 8, with the convention that the edges of a subdomain that contain points are the west and south edges. The black dots represent the locations of the observations. The subdomains labeled 0, 3, 4, 7 and 8 are the ones that contain observational locations; together they contain 55 grid points, which are stored in $\overline{\mathbf{x}}_{t_k}$. Note that subdomains 3 and 7 contain only one observational location each, namely $x_{23}$ and $x_{56}$, respectively.
  • Figure 2: These images show the swaths of observations at different times for the linear Gaussian model. The shots show the observations at times 6, 11 and 14 (in blue), respectively, and the swaths of the observations at previous times (in red). In the assimilation step, only the blue swath of data is used.
  • Figure 3: Snapshots showing the filter mean at coordinate 50 for the different filtering schemes for the linear Gaussian model. The blue line is the true filter mean computed via the KF.
  • Figure 4: Snapshot of the different filter means for the linear Gaussian model at time 22, using the same parameters as in Table 1. The KF mean is on the far left, the second column shows the filter means, and the rightmost column shows the differences.
  • Figure 5: SWEs SSM with SWOT-like observations: Snapshot of the LSMCMC and SMCMC filter means at observational time 26.67 hrs (that is, $k=800$). The first row shows the signal parts $(u,v,\eta)$. The second row shows the mean of the LSMCMC filter for these three parts. The third row shows the mean of the SMCMC filter. Finally, the fourth and fifth rows show the difference between the reference and the LSMCMC and SMCMC filter means, respectively. The red dashed lines show the boundary of the observed region of the water's height at $k=800$.
  • ...and 3 more figures
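
The subdomain selection described in the Figure 1 caption can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the 10-point axes are split into blocks of sizes 4, 3, 3, the subdomains are labeled row-major, and the observation locations in `obs_points` are made up so that the same labels (0, 3, 4, 7, 8) end up selected. All variable and function names are hypothetical.

```python
import numpy as np

n = 10          # grid points per side (10 x 10 grid, as in the caption)
n_blocks = 3    # 3 x 3 = nine subdomains

# Assumed split of each axis into blocks of sizes [4, 3, 3].
blocks = np.array_split(np.arange(n), n_blocks)
block_of = np.empty(n, dtype=int)
for b, idx in enumerate(blocks):
    block_of[idx] = b

def subdomain_label(row, col):
    """Row-major subdomain label (an assumed convention)."""
    return n_blocks * block_of[row] + block_of[col]

# Hypothetical observation locations as (row, col) grid coordinates.
obs_points = [(0, 1), (2, 2), (5, 0), (4, 5), (6, 4), (8, 5), (9, 9), (7, 8)]
observed = sorted({int(subdomain_label(r, c)) for r, c in obs_points})

# Label every grid point, then keep the points in observed subdomains.
rows, cols = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
labels = n_blocks * block_of[rows] + block_of[cols]
local_indices = np.flatnonzero(np.isin(labels, observed))

print("observed subdomains:", observed)          # [0, 3, 4, 7, 8]
print("local state size:", local_indices.size)   # 55 of 100 grid points
```

Under these assumptions the local state holds 55 of the 100 grid points, which matches the count given in the caption. The filter would then act only on the entries indexed by `local_indices` rather than on the full state.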

Theorems & Definitions (1)

  • Proposition A.1