Beacon, a lightweight deep reinforcement learning benchmark library for flow control

Jonathan Viquerat, Philippe Meliga, Pablo Jeken, Elie Hachem

TL;DR

This paper proposes Beacon, an open-source benchmark library composed of seven lightweight one-dimensional and two-dimensional flow control problems that vary in their characteristics, their action and observation spaces, and their CPU requirements.

Abstract

Recently, the increasing use of deep reinforcement learning for flow control problems has led to a new area of research, focused on the coupling and adaptation of existing algorithms to the control of numerical fluid dynamics environments. Although still in its infancy, the field has seen multiple successes in a short time span, and its fast development pace can certainly be partly attributed to the open-source effort that drives the expansion of the community. Yet, this emerging domain still lacks a common ground to (i) ensure the reproducibility of the results, and (ii) offer a proper ad-hoc benchmarking basis. To this end, we propose Beacon, an open-source benchmark library composed of seven lightweight 1D and 2D flow control problems that vary in their characteristics, their action and observation spaces, and their CPU requirements. In this contribution, the seven considered problems are described, and reference control solutions are provided. The sources for this work are available at https://github.com/jviquerat/beacon.

Paper Structure (40 sections, 36 equations, 25 figures, 9 tables)

Figures (25)

  • Figure 1: Example of developed flow for the Shkadov equations with $\delta = 0.1$. Three regions can be identified: a first region where the instability grows from a white noise (blue), a second region with pseudo-periodic waves (orange), and a third region with non-periodic, pulse-like waves (green).
  • Figure 2: Score curves for the environment in different configurations. (Left) Comparison of score curves for the ppo and td3 algorithms in the default configuration, using 5 jets. (Right) Comparison of different numbers of jets using the ppo algorithm. For each curve, we plot the average (solid color) and the standard deviation (shaded color) obtained from $n_\text{training} = 5$ different runs. The dashed line indicates the reward obtained for the uncontrolled environment.
  • Figure 3: Evolution of the flow under control of the agent, using 5 jets. The jet strengths are represented in the bottom rectangle (red means positive amplitude, blue means negative amplitude). The horizontal and vertical axes are the same as in figure 1.
  • Figure 4: Temperature and velocity profiles for the uncontrolled Rayleigh convection cell with $\text{Ra}=10^4$, $\text{Pr}=0.71$, $H=1$ and $L=1$.
  • Figure 5: Observation probes and imposition of actions for the environment. The observations are collected at probes regularly positioned in the domain, while the actions are imposed as piecewise-constant temperature boundary conditions on the bottom plate, with an average value equal to $\theta_H$.
  • ...and 20 more figures
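
The caption of Figure 2 describes score curves drawn as an average (solid line) with a standard deviation band (shaded area) over $n_\text{training} = 5$ runs. A minimal sketch of that aggregation is given below. It is not taken from the Beacon sources: the function name `aggregate_score_curves`, the sample scores, and the choice of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def aggregate_score_curves(runs):
    """Return per-episode (means, stds) across several training runs.

    `runs` is a list of score curves, one list of per-episode scores per run.
    Hypothetical helper for illustration, not part of the Beacon library.
    """
    if not runs:
        raise ValueError("at least one run is required")
    n_episodes = len(runs[0])
    if any(len(run) != n_episodes for run in runs):
        raise ValueError("all runs must have the same number of episodes")

    # Transpose so that each tuple holds the scores of one episode across runs.
    per_episode = list(zip(*runs))
    means = [mean(scores) for scores in per_episode]
    # Assumption: population standard deviation; the source does not say
    # whether the population or the sample estimator is used.
    stds = [pstdev(scores) for scores in per_episode]
    return means, stds


# Made-up scores for 5 runs of 3 episodes each (not results from the paper).
runs = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0],
]
means, stds = aggregate_score_curves(runs)
```

The solid curve would then follow `means`, and the shaded band would span `means[i] - stds[i]` to `means[i] + stds[i]` at each episode.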