Data Assimilation in Chaotic Systems Using Deep Reinforcement Learning

Mohamad Abed El Rahman Hammoud; Naila Raboudi; Edriss S. Titi; Omar Knio; Ibrahim Hoteit

TL;DR

This work introduces a reinforcement-learning–driven data assimilation (RL-DA) framework to apply nonlinear, state-adaptive corrections to chaotic system forecasts using observations. Framed as an MDP, the agent employs a stochastic policy via Proximal Policy Optimization to generate an ensemble of assimilated states for the Lorenz '63 system, without assuming Gaussian observation or model errors. Across extensive experiments, RL-DA outperforms the ensemble Kalman filter (EnKF) in scenarios with non-Gaussian noise and partial observability, and achieves competitive or superior accuracy when averaging over stochastic policy realizations. The approach offers a flexible, data-driven alternative to traditional DA methods with potential applicability to complex, nonlinear, and non-Gaussian environments in geosciences and engineering, while raising questions about reward design and physical consistency.

Abstract

Data assimilation (DA) plays a pivotal role in diverse applications, ranging from climate predictions and weather forecasts to trajectory planning for autonomous vehicles. A prime example of a DA method is the widely used ensemble Kalman filter (EnKF), which relies on linear updates to minimize variance among the ensemble of forecast states. Recent advancements have seen the emergence of deep learning approaches in this domain, primarily within a supervised learning framework. However, the adaptability of such models to untrained scenarios remains a challenge. In this study, we introduce a novel DA strategy that utilizes reinforcement learning (RL) to apply state corrections using full or partial observations of the state variables. Our investigation focuses on demonstrating this approach on the chaotic Lorenz '63 system, where the agent's objective is to minimize the root-mean-squared error between the observations and the corresponding forecast states. Consequently, the agent develops a correction strategy that enhances model forecasts based on the available system state observations. Our strategy employs a stochastic action policy, enabling a Monte Carlo-based DA framework that relies on randomly sampling the policy to generate an ensemble of assimilated realizations. Results demonstrate that the developed RL algorithm performs favorably when compared to the EnKF. Additionally, we illustrate the agent's capability to assimilate non-Gaussian data, addressing a significant limitation of the EnKF.
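To make the abstract's setup concrete, the following is a minimal sketch of the Lorenz '63 dynamics and of an RMSE-based reward of the kind the abstract describes. Several details are assumptions that this page does not state: the classical parameter values (sigma = 10, rho = 28, beta = 8/3), the fourth-order Runge-Kutta integrator, the sign convention (reward taken as negative RMSE), and the helper names `lorenz63`, `rk4_step`, `rmse`, and `reward`, which are hypothetical.

```python
import math

# Classical Lorenz '63 parameters (assumed; not stated on this page).
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz63(x):
    """Time derivative of the Lorenz '63 state x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return (SIGMA * (x2 - x1), x1 * (RHO - x3) - x2, x1 * x2 - BETA * x3)


def rk4_step(x, dt):
    """Advance the state by one model time step dt (RK4 is an assumed scheme)."""
    def shifted(a, k):
        # Returns x + a * k, component by component.
        return tuple(xi + a * ki for xi, ki in zip(x, k))

    k1 = lorenz63(x)
    k2 = lorenz63(shifted(dt / 2, k1))
    k3 = lorenz63(shifted(dt / 2, k2))
    k4 = lorenz63(shifted(dt, k3))
    return tuple(
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


def rmse(obs, forecast_obs):
    """Root-mean-squared error between two equally sized vectors."""
    return math.sqrt(
        sum((o - f) ** 2 for o, f in zip(obs, forecast_obs)) / len(obs)
    )


def reward(obs, forecast, observe=lambda x: x):
    """Negative RMSE between the observation and the observed part of the forecast."""
    return -rmse(obs, observe(forecast))


# Usage: integrate a short forecast, then score it against an observation
# of the first variable only (partial observability).
state = (1.0, 1.0, 1.0)
for _ in range(100):
    state = rk4_step(state, 0.01)
partial_reward = reward((state[0] + 0.5,), state, observe=lambda x: x[:1])
```

Passing a different `observe` function is one simple way to switch between full and partial observations without changing the reward itself.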

Paper Structure (17 sections, 11 equations, 4 figures)

Introduction
Reinforcement Learning For Data Assimilation
Methods
Reinforcement Learning
Proximal Policy Optimization
Lorenz '63
Data assimilation using Reinforcement Learning
Training the DA agent
Ensemble Kalman Filter
Tracking Reference Solutions
Assimilating Noisy Observations
Noise Level
Assimilation Frequency
Noise Distribution
Partial Observability
...and 2 more sections

Figures (4)

Figure 1: Schematic of the proposed reinforcement learning-based data assimilation framework, using the Lorenz '63 system as the main example. The plot illustrates the Lorenz '63 solution trajectory (black curve) with an arbitrary assimilation window start time $t$ (red triangle) and the corresponding end time $t+\delta t^o$ (green square), when a new observation becomes available and is assimilated. The three-dimensional state variables ($\bm{x}$) of the model are shown at every model time step $\delta t$ (blue circles). At the last time step, the noisy observational data point ($\bm{y}$) is shown (inverted purple triangle) alongside the different evolution trajectories (orange curves) following several corrections ($\mathcal{F}(s_t)$) sampled from the policy function $\pi_{\theta}(a_t | s_t)$. The policy $\pi_{\theta}(a_t | s_t)$ takes as input an extended state vector formed by concatenating the forecast state variables ($\bm{x}$) and their time derivatives ($\dot{\bm{x}}$) at each time step $\delta t$ between $t$ and $t + \delta t^o$, together with the innovation term, defined as the difference between the observation and its corresponding forecast. The concatenation operation is denoted by $\oplus$, and for conciseness, the concatenation of $\bm{x}$ and $\dot{\bm{x}}$ at each $\delta t$ is represented by the sub- and superscripts of $\left[ \bm{x},\dot{\bm{x}} \right]$. Since a stochastic policy is considered in the DA framework, an ensemble of $\mathcal{F}(s_t)$ correction terms is sampled from $\pi_{\theta}(a_t | s_t)$ when a noisy observation is available. Note that the state variables might not be fully observed, hence $\mathcal{H}$ projects the forecast onto the observation space. Moreover, the observation $\bm{y}$ is considered to be a noisy estimate of the forecast with no restriction on the distribution of the additive noise.
Figure 2: Evolution of the mean RMSE (solid lines) and its $\pm \sigma$ band (shaded) based on 50 experiment repetitions. Plotted are results for different experiments: (a)-(c) tracking a noise-free reference solution, and assimilating noisy observations with (d)-(f) varying noise levels using normally distributed noise, (g)-(i) different assimilation window lengths, (j)-(l) different noise distributions, and (m)-(o) partial observability. The captions beneath each subplot describe the experimental condition in the following order: the noise distribution, the observation frequency $\delta t^o/\delta t$, and the observation operator $\mathcal{H}$.
Figure 3: Evolution of the $z$-variable for a sample RL solution (solid blue lines) and the corresponding reference (dashed red line). Plotted are results for different experiments: (a)-(c) tracking a noise-free reference solution, and assimilating noisy observations with (d)-(f) varying noise levels using normally distributed noise, (g)-(i) different assimilation window lengths, (j)-(l) different noise distributions, and (m)-(o) partial observability. The captions beneath each subplot describe the experimental condition in the following order: the noise distribution, the observation frequency $\mathcal{T}$, and the observation operator $\mathcal{H}$.
Figure 4: PDFs of the $z$-variable before (top) and after (middle) the correction step at time $t=45$ alongside the PDF of the correction (bottom) for the EnKF and RL solutions. The plots are presented for the experiment analyzing the sensitivity of the data assimilation algorithms to noise level.
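The policy input described in the Figure 1 caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the innovation is taken at the last step of the assimilation window, the correction is assumed to be additive, and a fixed Gaussian with hypothetical `mean` and `std` stands in for the trained policy $\pi_{\theta}(a_t | s_t)$. All function names are hypothetical, and the Lorenz '63 parameters are the classical values, which this page does not state.

```python
import random


def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz '63 state (classical parameters assumed)."""
    x1, x2, x3 = x
    return (sigma * (x2 - x1), x1 * (rho - x3) - x2, x1 * x2 - beta * x3)


def build_policy_input(forecast_window, obs, observe):
    """Concatenate [x, xdot] at every model step, then append the innovation."""
    features = []
    for x in forecast_window:
        features.extend(x)            # forecast state at this model step
        features.extend(lorenz63(x))  # its time derivative
    # Innovation: observation minus the observed part of the final forecast.
    forecast_obs = observe(forecast_window[-1])
    innovation = [y - h for y, h in zip(obs, forecast_obs)]
    return features + innovation


def sample_corrections(mean, std, n_members, rng):
    """Draw an ensemble of corrections from a Gaussian stand-in for the policy."""
    return [[rng.gauss(m, std) for m in mean] for _ in range(n_members)]


def apply_corrections(forecast, corrections):
    """Add each correction to the forecast (additive form is an assumption)."""
    return [[f + c for f, c in zip(forecast, corr)] for corr in corrections]


# Usage: two model steps in the window, only the first variable observed.
window = [(1.0, 1.0, 1.0), (1.1, 1.2, 0.9)]
obs = [1.0]
policy_input = build_policy_input(window, obs, observe=lambda x: x[:1])

rng = random.Random(0)
corrections = sample_corrections([0.0, 0.0, 0.0], 0.1, n_members=5, rng=rng)
ensemble = apply_corrections(window[-1], corrections)
# Ensemble mean of the assimilated states, one value per state variable.
ensemble_mean = [sum(col) / len(col) for col in zip(*ensemble)]
```

With two model steps and one observed variable, the input has 2 x 6 + 1 = 13 entries. Averaging over the sampled members mirrors the Monte Carlo use of the stochastic policy described in the abstract.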
