
Interpretable and Efficient Data-driven Discovery and Control of Distributed Systems

Florian Wolf, Nicolò Botteghi, Urban Fasel, Andrea Manzoni

TL;DR

This work proposes a data-efficient, interpretable, and scalable Dyna-style model-based RL framework specifically tailored for PDE control, and integrates Sparse Identification of Nonlinear Dynamics with Control within an Autoencoder-based dimensionality reduction scheme for PDE states and actions (AE+SINDy-C).

Abstract

Effectively controlling systems governed by Partial Differential Equations (PDEs) is crucial in several fields of Applied Sciences and Engineering. These systems usually pose significant challenges to conventional control schemes due to their nonlinear dynamics, partial observability, high dimensionality once discretized, distributed nature, and the requirement for low-latency feedback control. Reinforcement Learning (RL), particularly Deep RL (DRL), has recently emerged as a promising control paradigm for such systems, demonstrating exceptional capabilities in managing high-dimensional, nonlinear dynamics. However, DRL faces challenges including sample inefficiency, robustness issues, and an overall lack of interpretability. To address these issues, we propose a data-efficient, interpretable, and scalable Dyna-style Model-Based RL framework for PDE control, combining the Sparse Identification of Nonlinear Dynamics with Control (SINDy-C) algorithm with an autoencoder (AE) framework that reduces the dimensionality of PDE states and actions. This novel approach enables fast rollouts, reducing the need for extensive environment interactions, and provides an interpretable latent space representation of the PDE forward dynamics. We validate our method on two PDE problems describing fluid flows, namely the 1D Burgers equation and the 2D Navier-Stokes equations, comparing it against a model-free baseline and carrying out an extensive analysis of the learned dynamics.

Paper Structure

This paper contains 36 sections, 15 equations, 12 figures, 10 tables, 1 algorithm.

Figures (12)

  • Figure 1: A general overview of the RL training loop. In Dyna-style algorithms, we choose whether the agent interacts with the full-order model, which requires (expensive) environment rollouts, or with the learned surrogate (i.e., reduced-order) model, which provides fast approximate rollouts. In this work, we focus on the setting where the full-order reward is (analytically) known and only the dynamics are approximated. In general, the observed state is computed as ${\mathbb{R}}^{{N_x^{\mathsf{Obs}}}} \ni \boldsymbol{\mathrm{x}}^{\mathsf{Obs}}_{t+1} = C \cdot \boldsymbol{\mathrm{x}}_{t+1}$. In the partially observable (PO) case, the projection matrix $C \in \{0,1\}^{{N_x^{\mathsf{Obs}}} \times {N_x}}$ has a single 1 per row and zeros elsewhere, with ${N_x^{\mathsf{Obs}}} \ll {N_x}$. In the fully observable case, $C \equiv \mathrm{Id}_{{\mathbb{R}}^{N_x}}$, i.e. ${N_x^{\mathsf{Obs}}} = {N_x}$.
  • Figure 2: AE architecture and loss function used during the training stage. Trainable parameters are highlighted in red. The training scheme proceeds in the following stages. (1) The current state $\boldsymbol{\mathrm{x}}_t$, the applied control $\boldsymbol{\mathrm{u}}_t$, and the next state $\boldsymbol{\mathrm{x}}_{t+1}$ are provided as input data. (2) After compressing both the current state and the control vector, the SINDy-C algorithm is applied in the latent space, yielding a low-dimensional representation of the prediction for the next state. (3) The latent space representations of the current state, the control, and the next state prediction are decoded. (4) The classical AE loss and a regularization term to promote sparsity are computed. (5) The SINDy-C loss is computed. The figure is inspired by conti_reduced_2023.
  • Figure 3: Sample efficiency of the Dyna-style AE+SINDy-C method for the Burgers' equation. We test ${k_{\text{dyn}}} = 5, 10$ against the full-order baseline in the fully observable (FO, solid line) and partially observable (PO, dashed line) cases. The dashed vertical lines indicate the point of early stopping for each of the model classes (FO + PO) after 100 epochs and mark the models that are evaluated in detail in \ref{sec:RandomInitialConditionStateControlBurgers}. The evaluation uses the performance over five fixed random seeds.
  • Figure 4: State and control trajectories for the Burgers' equation in the partially observable (PO) case. The initial condition is a bell-shaped hyperbolic cosine (\ref{eq:BellShapeInitialCondition} with $\alpha=0.5$ fixed); we use $\nu = 0.01$ (two orders of magnitude smaller than in the training phase), and the black solid line indicates the timestep $t$ at which the controller is activated.
  • Figure 5: State and control trajectories for the Burgers' equation in the fully observable (FO) case. The initial condition is a bell-shaped hyperbolic cosine (\ref{eq:BellShapeInitialCondition} with $\alpha=0.5$ fixed); we use $\nu = 0.01$ (two orders of magnitude smaller than in the training phase), and the black solid line indicates the timestep $t$ at which the controller is activated.
  • ...and 7 more figures
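
The observation step described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It is not taken from the paper's code: the helper name `selection_matrix`, the state size, and the sensor indices are hypothetical, and the sketch assumes $C$ has shape $N_x^{\mathsf{Obs}} \times N_x$ so that the product $C \cdot \boldsymbol{\mathrm{x}}_{t+1}$ is well defined.

```python
import numpy as np

def selection_matrix(n_x, observed_idx):
    """Build C in {0,1}^(n_obs x n_x) with exactly one 1 per row.

    Row i picks the state entry observed_idx[i]; names are illustrative only.
    """
    observed_idx = np.asarray(observed_idx, dtype=int)
    C = np.zeros((observed_idx.size, n_x), dtype=int)
    C[np.arange(observed_idx.size), observed_idx] = 1
    return C

n_x = 8                                # hypothetical full-order state size
x_next = np.linspace(0.0, 1.0, n_x)    # stand-in for the state x_{t+1}
sensor_idx = [1, 4, 6]                 # hypothetical sensor locations

# Partially observable (PO) case: only a few entries are visible.
C_po = selection_matrix(n_x, sensor_idx)
x_obs_po = C_po @ x_next

# Fully observable (FO) case: C is the identity, so nothing is hidden.
C_fo = np.eye(n_x, dtype=int)
x_obs_fo = C_fo @ x_next
```

In practice one would index the state directly (`x_next[sensor_idx]`) rather than form $C$ explicitly; the matrix form is shown only to mirror the notation of the caption.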