Physics-Guided Actor-Critic Reinforcement Learning for Swimming in Turbulence

Christopher Koh; Laurent Pagnier; Michael Chertkov

Abstract

Turbulent diffusion causes particles that start close together to separate. We investigate the swimming effort required to keep an active particle close to its passively advected counterpart. We explore how to balance this effort optimally by developing a novel physics-informed reinforcement learning strategy and comparing it with prescribed control and physics-agnostic reinforcement learning strategies. Our scheme, coined the actor-physicist, is an adaptation of the actor-critic algorithm in which the neural-network-parameterized critic is replaced with an analytically derived physical heuristic function, the physicist. We validate the proposed physics-informed reinforcement learning approach through extensive numerical experiments in both synthetic Batchelor-Kraichnan (BK) and more realistic Arnold-Beltrami-Childress (ABC) flow environments, demonstrating its superiority over standard reinforcement learning methods in controlling particle dynamics.
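To make the actor-physicist idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical one-dimensional toy model in which the separation obeys ds = (lam*s - a) dt + sqrt(2*kappa) dW, the actor is a Gaussian policy a ~ N(phi*s, sigma^2), and the reward rate is -(s^2 + beta*a^2). Under these assumptions the discounted value of the linear policy a = phi*s has a closed form, which plays the role of the physicist in the temporal-difference error. The names `physicist_value` and `train_actor`, and all parameter values, are illustrative assumptions.

```python
import numpy as np


def physicist_value(s, phi, lam=1.0, kappa=0.1, beta=0.1, rho=0.5):
    """Closed-form discounted value of the toy policy a = phi * s.

    Toy assumption: ds = (lam - phi) s dt + sqrt(2 kappa) dW with phi > lam,
    reward rate -(1 + beta phi^2) s^2, continuous discount rate rho.
    Exploration noise in the action is ignored here (an approximation).
    """
    m = phi - lam  # net contraction rate of the separation; must be > 0
    # Discounted time integral of E[s^2(t)] starting from separation s.
    integral = s**2 / (rho + 2 * m) + (kappa / m) * (1 / rho - 1 / (rho + 2 * m))
    return -(1 + beta * phi**2) * integral


def train_actor(episodes=50, steps=200, dt=0.01, lr=1e-3, sigma=0.3,
                lam=1.0, kappa=0.1, beta=0.1, rho=0.5, seed=0):
    """Policy-gradient updates of phi, using the analytic value as the critic."""
    rng = np.random.default_rng(seed)
    phi = 2.0                  # initial control gain (hypothetical)
    gamma = np.exp(-rho * dt)  # per-step discount factor
    for _ in range(episodes):
        s = 0.1                # initial separation (hypothetical)
        for _ in range(steps):
            a = phi * s + sigma * rng.standard_normal()  # sample an action
            r = -(s**2 + beta * a**2) * dt               # reward over one step
            # Euler-Maruyama step of the toy separation dynamics.
            s_next = (s + (lam * s - a) * dt
                      + np.sqrt(2 * kappa * dt) * rng.standard_normal())
            # TD error: the physicist replaces a learned neural-network critic.
            delta = (r
                     + gamma * physicist_value(s_next, phi, lam, kappa, beta, rho)
                     - physicist_value(s, phi, lam, kappa, beta, rho))
            grad_logp = (a - phi * s) * s / sigma**2     # d log pi / d phi
            # Clip phi so the closed form stays valid (requires phi > lam).
            phi = float(np.clip(phi + lr * delta * grad_logp, lam + 0.1, 10.0))
            s = s_next
    return phi
```

Calling `train_actor()` returns the learned gain for this toy problem. Only the actor parameter `phi` is trained; no critic parameters are fitted, which is the structural point of replacing the critic with an analytic expression.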

Paper Structure (15 sections, 28 equations, 10 figures, 2 tables)

Introduction
Reinforcement Learning
Actor-Critic Reinforcement Learning with Physics Informed Critic
Preliminaries: Prescribed and Optimal Control of Swimming in Turbulence
Two Particles in a Chaotic Flow
Batchelor Flow under Proportional Control
BK Model under Proportional Control
Optimal Stationary Control of BK Flow
State Value Function in the Batchelor-Kraichnan Flow
Swimming in Arnold–Beltrami–Childress Flow
Numerical Experiments
Validation of the Physicist (Baseline)
Comparison with Standard Actor-Critic Methods
Comparison with Prescribed Control
Conclusion

Figures (10)

Figure 1: Actor-Physicist (AP) diagram. The main idea of the present work is to replace the standard neural-network critic with an expression derived from the system's dynamics.
Figure 2: Flowchart explaining the relations between RL, environment, theory, and the components of the different control schemes.
Figure 3: Sample trajectories of an active particle (gray) and its passive target (red) in an ABC flow. (a) No control: trajectories starting from the same point diverge chaotically. (b) PC control: the active particle closely follows the trajectory of the target particle. Although the trajectories are not identical, their divergence is far smaller than in the uncontrolled case.
Figure 4: Finite time statistics of the leading Lyapunov exponent $\lambda_1(t)\approx\log(\bm W^\top(t;0)\bm W(t;0))/t$ in the ABC flow, evaluated at $t$ equal to the episode's duration.
Figure 5: Distribution of separation $\bm{s}(t)$ for prescribed control with fixed values of $\phi$: $0.6$ (blue), $1.1$ (red), and $1.6$ (green) for BK flows (left column) and ABC flows (right column). The lines represent the predictions of Eq. \ref{eq:s_dist}. For the ABC flow, $\bar{\lambda}_1$ was obtained from simulations, as shown in Fig. \ref{fig:ABC}.
...and 5 more figures
