Asynchronous Parallel Reinforcement Learning for Optimizing Propulsive Performance in Fin Ray Control

Xin-Yang Liu; Dariush Bodaghi; Qian Xue; Xudong Zheng; Jian-Xun Wang

Abstract

Fish fin rays constitute a sophisticated control system for ray-finned fish, facilitating versatile locomotion within complex fluid environments. Despite extensive research on the kinematics and hydrodynamics of fish locomotion, the intricate control strategies in fin-ray actuation remain largely unexplored. While deep reinforcement learning (DRL) has demonstrated potential in managing complex nonlinear dynamics, its trial-and-error nature limits its application to problems involving computationally demanding environmental interactions. This study introduces a cutting-edge off-policy DRL algorithm that interacts with a fluid-structure interaction (FSI) environment to acquire intricate fin-ray control strategies tailored to various propulsive performance objectives. To enhance training efficiency and enable scalable parallelism, an asynchronous parallel training (APT) strategy is proposed, which fully decouples FSI environment interactions from policy/value network optimization. The results demonstrate that the proposed method discovers optimal complex policies for fin-ray actuation control, yielding superior propulsive performance compared with the optimal sinusoidal actuation function identified through a parametric grid search. The merit and effectiveness of the APT approach are also showcased through a comprehensive comparison with conventional DRL training strategies in numerical experiments on controlling nonlinear dynamics.

Paper Structure (22 sections, 22 equations, 12 figures, 2 algorithms)

Introduction
Methodology
The simulated FSI environment
Deep reinforcement learning
Enhancing RL Training Efficiency through Asynchronous Parallel Training
Numerical Experiments and Results
Problem formulation and DRL setting
Observation Space
Action space
Neural Network Architecture
Episode and control step
Baseline control method for comparative analysis
Maximize thrust
Maximize efficiency
Discussion
...and 7 more sections

Figures (12)

Figure 1: Schematics of (a) the fin-ray deformation under muscle actuation, produced by applying an offset of $\varepsilon$; (b) the fin-ray root motions of pitching, plunging, and muscle actuation.
Figure 2: Boundary conditions of the flow solver and near body computational grids.
Figure 3: Time consumption schematics of three different RL training strategies.
Figure 4: Illustration of the control parameter and the observation space for the DRL agent. (a) depicts the observation vector of the surrounding flow $\bm{o}_{flow}$, which includes the stream-wise velocity $u_x$ probed at the locations indicated by black dots; (b) visualizes the observation vector of the fin-ray deformation $\bm{o}_{fin}$, comprising the $y$-coordinates of eight equidistant points along the fish fin ray.
Figure 5: Left panel: the actions $a_i$ taken by the DRL agent, the corresponding root displacement $\varepsilon$, and the accumulated thrust $F_T$ generated by the DRL-controlled fin ray. Right panel: the time series of thrust generated by the max-thrust DRL agent compared with that obtained by the baseline method during one episode.
...and 7 more figures
