Prequential posteriors

Shreya Sinha-Roy, Richard G. Everitt, Christian P. Robert, Ritabrata Dutta

TL;DR

This work introduces prequential posteriors as a likelihood-free Bayesian framework for data assimilation in deep generative forecasting models, addressing the challenge of intractable likelihoods under temporal dependencies and model misspecification. By adopting a predictive-sequential loss and a Bernstein–von Mises-type analysis, the authors establish predictive consistency and posterior concentration around predictive-optimal parameters, even when the true data-generating process lies outside the model class. They implement a scalable wastefree Sequential Monte Carlo scheme with a preconditioned forward kernel to efficiently explore high-dimensional parameter spaces typical of DGFMs. The approach is validated on synthetic Lorenz-96 dynamics and real-world WeatherBench data, showing improved calibration, forecast accuracy, and reliability over misspecified baselines, with practical implications for data assimilation in complex dynamical systems.

Abstract

Data assimilation is a fundamental task in updating forecasting models upon observing new data, with applications ranging from weather prediction to online reinforcement learning. Deep generative forecasting models (DGFMs) have shown excellent performance in these areas, but assimilating data into such models is challenging due to their intractable likelihood functions. This limitation restricts the use of standard Bayesian data assimilation methodologies for DGFMs. To overcome this, we introduce prequential posteriors, based upon a predictive-sequential (prequential) loss function, an approach naturally suited to temporally dependent data, which are the focus of forecasting tasks. Since the true data-generating process often lies outside the assumed model class, we adopt an alternative notion of consistency and prove that, under mild conditions, both the prequential loss minimizer and the prequential posterior concentrate around parameters with optimal predictive performance. For scalable inference, we employ easily parallelizable wastefree sequential Monte Carlo (SMC) samplers with preconditioned gradient-based kernels, enabling efficient exploration of high-dimensional parameter spaces such as those in DGFMs. We validate our method on both a synthetic multi-dimensional time series and a real-world meteorological dataset, highlighting its practical utility for data assimilation in complex dynamical systems.
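
For orientation, a prequential loss and the loss-based posterior it induces are commonly written in the generic form below. This is only a sketch under stated assumptions: the scoring rule $S$, the learning rate $w > 0$, and the prior $\pi$ are placeholders introduced for this illustration, and the paper's exact definitions (for example, any normalisation by $T$) may differ. The symbols $\ell_T^{\theta}$ and $Q_t^{\theta}$ follow the notation of the Figure 1 caption.

```latex
% Generic form (assumed, not quoted from the paper):
% cumulative one-step-ahead forecast loss and the resulting loss-based posterior.
\ell_T^{\theta}(y_{1:T}) \;=\; \sum_{t=1}^{T} S\!\big(Q_t^{\theta}(\cdot \mid y_{1:t-1}),\, y_t\big),
\qquad
\pi_T(\theta \mid y_{1:T}) \;\propto\; \pi(\theta)\, \exp\!\big\{-w\, \ell_T^{\theta}(y_{1:T})\big\}.
```

Because each term only requires simulating from $Q_t^{\theta}$ and scoring the draws against $y_t$, a posterior of this form needs no likelihood evaluation.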

Paper Structure

This paper contains 34 sections, 6 theorems, 52 equations, 6 figures, 1 table, 1 algorithm.

Key Result

Lemma 3

Under Assumptions 1-3, a uniform law of large numbers (ULLN) for martingales holds, which implies, with probability one under $P$,

Figures (6)

  • Figure 1: Illustration of the loss calculation. The loss function $\ell_T^{\theta}$ scores forecasts generated by the conditional generative network $Q_t^{\theta}$, which predicts $Y_t$ given the past observations $Y_{1:t-1}$, against the observed $y_t$.
  • Figure 2: Posterior predictive performance across episodes for the Lorenz 96 model. The prequential posterior (orange curve) improves with additional data, while the misspecified EnKF (blue curve) shows little change, as measured by calibration error, normalised RMSE, and coefficient of determination $R^2$. The green dotted line represents the maximum achievable value for each metric. In the top panel, diagnostics are computed on the next episode (of length $\tau = 100$) of the same dataset, while the bottom panel uses a separate test dataset (of length $2000$).
  • Figure 3: Data assimilation for a DGFM on the European subset of the WeatherBench dataset. We infer the prequential posterior after each episode ($\sim 1$ year) and plot the diagnostics computed on a separate test dataset ($\sim 4$ years). The orange curve shows the value of each metric (calibration error, NRMSE, $R^2$) for the posterior predictive obtained after each assimilation episode; predictive performance improves steadily with each training episode. The green dotted line represents the maximum achievable value for each metric. The steady decay of the calibration error and NRMSE curves, along with the upward trend of the $R^2$ curve, indicates effective data assimilation via posterior parameter updates in the DGFM.
  • Figure 4: Comparison of target and simulations. The top left image shows an observation from the year $2000$, while the remaining five images are simulated from the final posterior predictive distribution. The simulations roughly capture the underlying spatial pattern.
  • Figure 5: Predictive performance of the posterior predictives for the Lorenz 96 model under different SMC prior distributions. Calibration error, NRMSE, and $R^2$ are shown for three priors: a standard Gaussian (orange curve), a Student's $t$ distribution with $3$ degrees of freedom (green curve), and a Student's $t$ distribution with $5$ degrees of freedom (blue curve). All priors exhibit similar overall trends, but the Gaussian prior converges noticeably faster.
  • ...and 1 more figure
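
The loss calculation described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the energy score is an assumed choice of scoring rule, and `simulate_forecast` is a hypothetical stand-in for the conditional generative network $Q_t^{\theta}$ (here a toy Gaussian AR(1) forecaster rather than a deep network).

```python
import numpy as np

def energy_score(samples, y):
    """Sample-based energy score estimate (lower is better).

    samples: (m, d) draws from the forecast distribution; y: (d,) observation.
    The energy score is an assumed scoring rule for this sketch.
    """
    m = samples.shape[0]
    term_obs = np.linalg.norm(samples - y, axis=1).mean()
    pairwise = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    # The diagonal is zero, so dividing by m * (m - 1) averages over i != j.
    term_pair = pairwise.sum() / (m * (m - 1))
    return term_obs - 0.5 * term_pair

def simulate_forecast(theta, y_past, m, rng):
    """Hypothetical stand-in for Q_t^theta: draws m forecasts of Y_t.

    Toy Gaussian AR(1) model that only uses the latest observation.
    """
    return theta * y_past[-1] + rng.standard_normal((m, y_past.shape[1]))

def prequential_loss(theta, y, m=200, seed=0):
    """Sum of one-step-ahead forecast scores over the observed series."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for t in range(1, len(y)):
        samples = simulate_forecast(theta, y[:t], m, rng)  # forecast Y_t from the past
        total += energy_score(samples, y[t])               # score against observed y_t
    return total

# Toy data from an AR(1) process with coefficient 0.8 (illustrative only).
data_rng = np.random.default_rng(1)
y = np.zeros((300, 2))
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + data_rng.standard_normal(2)

loss_good = prequential_loss(0.8, y)   # parameter matching the data-generating value
loss_bad = prequential_loss(-0.5, y)   # poorly predicting parameter
print(loss_good < loss_bad)
```

Only forward simulation from the forecaster is needed, which is why a loss of this kind remains usable when the model's likelihood is intractable.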

Theorems & Definitions (10)

  • Remark 1
  • Remark 2
  • Lemma 3
  • Corollary 4
  • Remark 5
  • Lemma 6
  • Remark 7
  • Theorem 8
  • Theorem 9
  • Theorem 10