Physics-Guided Actor-Critic Reinforcement Learning for Swimming in Turbulence

Christopher Koh, Laurent Pagnier, Michael Chertkov

Abstract

Turbulent diffusion causes particles placed in proximity to separate. We investigate the swimming effort required to keep an active particle close to its passively advected counterpart. We explore optimally balancing these efforts by developing a novel physics-informed reinforcement learning strategy and comparing it with prescribed control and physics-agnostic reinforcement learning strategies. Our scheme, coined the actor-physicist, is an adaptation of the actor-critic algorithm in which the neural-network-parameterized critic is replaced with an analytically derived physical heuristic function, the physicist. We validate the proposed physics-informed reinforcement learning approach through extensive numerical experiments in both synthetic BK and more realistic Arnold-Beltrami-Childress flow environments, demonstrating its superiority in controlling particle dynamics when compared to standard reinforcement learning methods.

Paper Structure (15 sections, 28 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Actor-Physicist (AP) diagram. The main idea of the present work is to substitute an expression obtained from the system's dynamics for the standard neural-network critic.
  • Figure 2: Flowchart explaining the relations between RL, environment, theory, and the components of the different control schemes.
  • Figure 3: Sample trajectories of an active particle (gray) and its passive target (red) in an ABC flow. (a) No control: trajectories starting from the same point diverge chaotically. (b) With PC control: the active particle closely follows the trajectory of the target particle. Although the trajectories are not identical, the divergence from the passive target is far smaller than in the uncontrolled case.
  • Figure 4: Finite time statistics of the leading Lyapunov exponent $\lambda_1(t)\approx\log(\bm W^\top(t;0)\bm W(t;0))/t$ in the ABC flow, evaluated at $t$ equal to the episode's duration.
  • Figure 5: Distribution of separation $\bm{s}(t)$ for prescribed control with fixed values of $\phi$: $0.6$ (blue), $1.1$ (red), and $1.6$ (green) for B-K flows (left column) and ABC flows (right column). The lines represent the predictions of Eq. \ref{eq:s_dist}. For the ABC flow, $\bar{\lambda}_1$ was obtained from simulations, as shown in Fig. \ref{fig:ABC}.
  • ...and 5 more figures
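
The uncontrolled divergence described in the Figure 3(a) caption can be illustrated with a minimal sketch, assuming the textbook form of the Arnold-Beltrami-Childress velocity field. The coefficients A, B, C, the time step, the integrator (classical RK4), and the helper names `abc_velocity`, `rk4_step`, and `separation_history` are all illustrative assumptions and are not taken from the paper. No control is applied, so both particles are purely passive here.

```python
import numpy as np

# Assumed coefficients (a commonly used chaotic choice); not the paper's values.
A, B, C = np.sqrt(3.0), np.sqrt(2.0), 1.0


def abc_velocity(pos):
    """Textbook ABC flow velocity at position(s) pos with shape (..., 3)."""
    x, y, z = pos[..., 0], pos[..., 1], pos[..., 2]
    return np.stack(
        [
            A * np.sin(z) + C * np.cos(y),
            B * np.sin(x) + A * np.cos(z),
            C * np.sin(y) + B * np.cos(x),
        ],
        axis=-1,
    )


def rk4_step(pos, dt):
    """Advance a passive particle by one classical Runge-Kutta (RK4) step."""
    k1 = abc_velocity(pos)
    k2 = abc_velocity(pos + 0.5 * dt * k1)
    k3 = abc_velocity(pos + 0.5 * dt * k2)
    k4 = abc_velocity(pos + dt * k3)
    return pos + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def separation_history(start, offset, dt=0.01, n_steps=5000):
    """Distance between two passive particles started `offset` apart."""
    a = np.asarray(start, dtype=float)
    b = a + np.asarray(offset, dtype=float)
    seps = np.empty(n_steps + 1)
    seps[0] = np.linalg.norm(b - a)
    for i in range(n_steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        seps[i + 1] = np.linalg.norm(b - a)
    return seps
```

Calling `separation_history([0.1, 0.2, 0.3], [1e-6, 0.0, 0.0])` returns the separation at every step. Plotting it on a logarithmic axis gives a rough visual sense of how quickly nearby trajectories move apart for that particular starting point.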