Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies

Nicolò Botteghi; Urban Fasel

TL;DR

This work leverages dictionary learning and differentiable L$_0$ regularization to learn sparse, robust, and interpretable control policies for parametric PDEs and shows that this method outperforms baseline DNN-based DRL policies, allows for the derivation of interpretable equations of the learned optimal control laws, and generalizes to unseen parameters of the PDE without retraining the policies.

Abstract

Optimal control of parametric partial differential equations (PDEs) is crucial in many applications in engineering and science. In recent years, the progress in scientific machine learning has opened up new frontiers for the control of parametric PDEs. In particular, deep reinforcement learning (DRL) has the potential to solve high-dimensional and complex control problems in a large variety of applications. Most DRL methods rely on deep neural network (DNN) control policies. However, for many dynamical systems, DNN-based control policies tend to be over-parametrized, which means they need large amounts of training data, show limited robustness, and lack interpretability. In this work, we leverage dictionary learning and differentiable L$_0$ regularization to learn sparse, robust, and interpretable control policies for parametric PDEs. Our sparse policy architecture is agnostic to the DRL method and can be used in different policy-gradient and actor-critic DRL algorithms without changing their policy-optimization procedure. We test our approach on the challenging tasks of controlling parametric Kuramoto-Sivashinsky and convection-diffusion-reaction PDEs. We show that our method (1) outperforms baseline DNN-based DRL policies, (2) allows for the derivation of interpretable equations of the learned optimal control laws, and (3) generalizes to unseen parameters of the PDE without retraining the policies.
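To make the idea of a sparse polynomial policy concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the dictionary is the set of monomials of the observations up to a fixed degree, that each dictionary coefficient is masked by a hard-concrete gate (one common way to make an L$_0$ penalty differentiable), and that a scalar action is squashed with `tanh`. The function names, the hyperparameters `beta`, `gamma`, `zeta`, and the scalar-action setup are all assumptions made for illustration.

```python
import itertools
import math
import random


def polynomial_features(x, degree):
    """Return all monomials of the entries of x up to the given total degree."""
    feats = [1.0]  # constant term
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(range(len(x)), d):
            feats.append(math.prod(x[i] for i in combo))
    return feats


def hard_concrete_gate(log_alpha, rng=None, beta=2 / 3, gamma=-0.1, zeta=1.1):
    """Gate in [0, 1]: sampled if rng is given, deterministic otherwise.

    Assumed gate form (hard-concrete); the stretch to (gamma, zeta) followed
    by clipping lets the gate be exactly 0 or exactly 1.
    """
    if rng is not None:
        # Keep u away from 0 and 1 so the logarithms stay finite.
        u = min(max(rng.random(), 1e-6), 1 - 1e-6)
        logits = (math.log(u) - math.log(1 - u) + log_alpha) / beta
        s = 1 / (1 + math.exp(-logits))
    else:
        s = 1 / (1 + math.exp(-log_alpha))
    return min(1.0, max(0.0, s * (zeta - gamma) + gamma))


def expected_l0(log_alphas, beta=2 / 3, gamma=-0.1, zeta=1.1):
    """Differentiable surrogate for the number of active (non-zero) gates."""
    shift = beta * math.log(-gamma / zeta)
    return sum(1 / (1 + math.exp(-(la - shift))) for la in log_alphas)


def policy(x, coeffs, log_alphas, degree=2, rng=None):
    """Scalar action: tanh of the gated linear combination of monomials."""
    feats = polynomial_features(x, degree)
    gates = [hard_concrete_gate(la, rng) for la in log_alphas]
    return math.tanh(sum(c * z * f for c, z, f in zip(coeffs, gates, feats)))


if __name__ == "__main__":
    obs = [2.0, 3.0]                       # hypothetical two-sensor observation
    coeffs = [0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
    log_alphas = [-10.0, 10.0, -10.0, -10.0, -10.0, -10.0]
    print(policy(obs, coeffs, log_alphas))                     # deterministic gates
    print(policy(obs, coeffs, log_alphas, rng=random.Random(0)))  # sampled gates
    print(expected_l0(log_alphas))
```

In a sketch like this, the control law can be read off directly from the coefficients whose gates remain open, and the `expected_l0` term would be added to the policy loss with a weighting factor so that gradient-based training drives unused dictionary terms to exactly zero.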

Paper Structure (22 sections, 20 equations, 7 figures, 3 tables, 2 algorithms)

Introduction
Related Work
Preliminaries
Reinforcement Learning
Twin-Delayed Deep Deterministic Policy Gradient
Sparse Dictionary Learning
Sparsifying Neural Network Layers with L$_0$ Regularization
Methodology
Problem Settings
Deep Reinforcement Learning with L$_0$-Sparse Polynomial Policies
TD3 Gradient with L$_0$-Sparse Polynomial Policy
Results
Kuramoto-Sivashinsky PDE
Convection-Diffusion-Reaction PDE
Discussion and Conclusion
...and 7 more sections

Figures (7)

Figure 1: Deep reinforcement learning with L$_0$-sparse polynomial policies.
Figure 2: Example of solutions of the Kuramoto-Sivashinsky PDE for different values of the parameter $\mu$.
Figure 3: Training and evaluation (with and without noise on the measurements) results of the Kuramoto-Sivashinsky PDE control problem. The plots show mean (solid line) and standard deviation (shaded area) of four different random seeds ($1, 7, 92, 256$).
Figure 4: Optimal control policy performance on unseen instances of the parameter $\mu$, i.e., a) $\mu=0.121$ (interpolation), and b) $\mu=0.225$ (extrapolation). Additionally, for each method we report the state cost $c_1$ and the scaled action cost $\alpha c_2$.
Figure 5: Example of solutions of the convection-diffusion-reaction PDE for different values of the parameter vector $\mu=[\nu, c, r]$.
...and 2 more figures
