An Adaptive Online Smoother with Closed-Form Solutions and Information-Theoretic Lag Selection for Conditional Gaussian Nonlinear Systems
Marios Andreou, Nan Chen, Yingda Li
TL;DR
The paper tackles state estimation for complex turbulent systems by developing an adaptive online smoother within the conditional Gaussian nonlinear system (CGNS) framework, with closed-form forward–backward smoothing updates that require no empirical tuning. An information-theoretic criterion determines the adaptive lag, balancing posterior uncertainty reduction against storage and computation and letting the lag respond to intermittency and extreme events. The methodology yields exact Gaussian posteriors for both filtering and smoothing, along with discrete-time online updates and a fixed-lag baseline for comparison, and is supported by theoretical analysis of the update matrices and their spectral properties. The adaptive-lag approach is demonstrated through three applications: online causality detection in a nonlinear dyad, high-dimensional Lagrangian data assimilation, and online parameter estimation with partial observations, including an online EM algorithm that benefits from the smoother’s analytic formulas. Collectively, the work offers a principled, scalable, and storage-efficient framework for real-time state estimation, causal inference, and online learning in CGNS-driven problems with strong nonlinear and non-Gaussian features.
Abstract
Data assimilation (DA) combines partial observations with dynamical models to improve state estimation. Filter-based DA uses only past and present data and is the prerequisite for real-time forecasts. Smoother-based DA exploits both past and future observations. It aims to fill in missing data, provide more accurate estimates, and develop high-quality datasets. However, the standard smoothing procedure requires all historical state estimates, which is storage-demanding, especially for high-dimensional systems. This paper develops an adaptive-lag online smoother for a large class of complex dynamical systems with strong nonlinear and non-Gaussian features, which has important applications to many real-world problems. The adaptive lag restricts the smoother to observations within a nearby window, thus reducing computational complexity and storage needs. Online lag adjustment is essential for tackling turbulent systems, where temporal autocorrelation varies significantly over time due to intermittency, extreme events, and nonlinearity. An information criterion based on the uncertainty reduction in the estimated state is developed to systematically determine the adaptive lag. Notably, the mathematical structure of these systems allows the online smoother and the adaptive lag to be computed with closed analytic formulae, avoiding the empirical tuning required by ensemble-based DA methods. The adaptive online smoother is applied to three important scientific problems. First, it helps detect causal relationships between state variables online. Second, the advantage of reduced storage and computational cost is illustrated via Lagrangian DA, a high-dimensional nonlinear problem. Finally, the adaptive smoother advances online parameter estimation with partial observations, emphasizing the role of observed extreme events in accelerating convergence.
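To make the idea of an information-based lag concrete, the following is a minimal sketch and not the criterion derived in the paper. It assumes that, for a fixed state, a sequence of Gaussian posteriors is available, where entry j uses j future observations (j = 0 being the filter estimate). It measures the information gained from each extra observation with the standard relative entropy between Gaussians and stops once that gain drops below a tolerance. The names `gaussian_relative_entropy`, `select_lag`, and `tol`, and the stopping rule itself, are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_relative_entropy(mu_p, cov_p, mu_q, cov_q):
    """Relative entropy KL(p || q) between Gaussians p = N(mu_p, cov_p), q = N(mu_q, cov_q)."""
    mu_p = np.atleast_1d(np.asarray(mu_p, dtype=float))
    mu_q = np.atleast_1d(np.asarray(mu_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    k = mu_p.size
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_p - mu_q
    # "Signal" part: information carried by the shift in the mean.
    signal = 0.5 * diff @ cov_q_inv @ diff
    # "Dispersion" part: information carried by the change in covariance.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    dispersion = 0.5 * (np.trace(cov_q_inv @ cov_p) - k + logdet_q - logdet_p)
    return float(signal + dispersion)


def select_lag(means, covs, tol=1e-3):
    """Smallest lag beyond which one more future observation adds less than `tol` information.

    means[j], covs[j] describe the posterior of the same state using j future
    observations. The threshold rule is an illustrative assumption.
    """
    for j in range(1, len(means)):
        gain = gaussian_relative_entropy(means[j], covs[j], means[j - 1], covs[j - 1])
        if gain < tol:
            return j - 1
    return len(means) - 1
```

In this sketch, a rapidly decorrelating regime would produce small gains after only a few observations and hence a short lag, while a strongly autocorrelated regime would keep the gain above `tol` for longer, which mirrors the qualitative behaviour the abstract describes for adaptive lags.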
