Deep reinforcement learning based navigation of a jellyfish-like swimmer in flows with obstacles

Yihao Chen, Yue Yang

TL;DR

This work tackles obstacle-rich, wall-influenced fluid navigation by equipping a jellyfish-inspired swimmer with explicit real-time force and torque feedback within a soft actor-critic (SAC) deep reinforcement learning (DRL) framework. By augmenting the state with hydrodynamic interactions and training offline on CFD data from an immersed boundary method, the approach enables the agent to perceive boundary proximity through mechanical cues and to exploit wall effects for efficient turning. The results show faster, smoother maneuvers and improved obstacle avoidance compared to a baseline without force feedback, including a substantial gain in single-obstacle tasks and effective cave-exploration behavior with real-time path re-planning via A*. The study highlights force/torque feedback as a critical sensory modality for physics-aware autonomous underwater navigation, with potential applications in cave mapping and robust near-wall operation.

Abstract

We develop a deep reinforcement learning framework for controlling a bio-inspired jellyfish swimmer to navigate complex fluid environments with obstacles. While existing methods often rely on kinematic and geometric states, a key challenge remains in achieving efficient obstacle avoidance under strong fluid-structure interactions and near-wall effects. We augment the agent's state representation within a soft actor-critic algorithm to include the real-time forces and torque experienced by the swimmer, providing direct mechanical feedback from vortex-wall interactions. This augmented state space enables the swimmer to perceive and interpret wall proximity and orientation through distinct hydrodynamic force signatures. We analyze how these force and torque patterns, generated by walls at different positions, influence the swimmer's decision-making policy. Comparative experiments against a baseline model without force feedback demonstrate that the model with force feedback achieves higher navigation efficiency in two-dimensional obstacle-avoidance tasks. The results show that explicit force feedback facilitates earlier, smoother maneuvers and enables the exploitation of wall effects for efficient turning. With an application to autonomous cave mapping, this work underscores the critical role of direct mechanical feedback in fluid environments and presents a physics-aware machine learning framework for advancing robust underwater exploration systems.
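As a rough illustration of the state augmentation described in the abstract, the sketch below appends a two-dimensional force and a scalar torque to an otherwise opaque vector of kinematic and geometric features, and then draws one of the four discrete actions listed in Figure 1(b) from a policy distribution. The function names `build_state` and `sample_action`, the feature ordering, and the absence of any normalization are assumptions made for this example; the paper's actual state layout is not given here.

```python
import numpy as np


def build_state(kinematic, force, torque, use_force_feedback=True):
    """Assemble the agent's state vector (hypothetical layout).

    kinematic: 1-D sequence of kinematic/geometric features (contents assumed).
    force:     (F_x, F_y) acting on the swimmer in the 2-D plane.
    torque:    scalar torque about the swimmer's mass center.
    """
    kinematic = np.asarray(kinematic, dtype=float).ravel()
    if not use_force_feedback:
        # Baseline configuration: kinematic/geometric features only.
        return kinematic
    force = np.asarray(force, dtype=float).ravel()
    # Augmented configuration: append force components and torque.
    return np.concatenate([kinematic, force, [float(torque)]])


def sample_action(probs, rng):
    """Draw a discrete action index from the policy's probability vector."""
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()  # guard against small rounding drift
    return int(rng.choice(len(probs), p=probs))
```

With three kinematic features, the augmented state has six entries and the baseline state has three, so the two agents differ only in the input width of their networks under this assumed layout.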

Paper Structure

This paper contains 15 sections, 10 equations, 14 figures, 1 table, 2 algorithms.

Figures (14)

  • Figure 1: Diagram of the overall workflow. (a) Data obtained from multiple simulations are used for offline training. (b) Action space with four actions $A_i$, $i=0,1,2,3$, representing typical jellyfish actions (from left to right): symmetric forces on the two sides, larger force on the right side, larger force on the left side, and no force. (c) Geometry and state of the jellyfish-like swimmer. The red parts indicate where the forces are applied. (d) A swimmer that chooses the forward-moving action deviates in the presence of a side wall. (e) The swimmer, controlled by the SAC agent, on a simple obstacle-avoidance task. (f) A pathfinding algorithm is run on a discretized domain based on the detected boundaries. (g) The SAC module receives the state vector and outputs a probability distribution over the actions. The action is sampled from this distribution. (h) The swimmer senses the environment and uses the detected boundaries (in blue) to find a path in a map like (f). A pilot point is then calculated to guide the swimmer toward the target.
  • Figure 2: (a) Trajectory of the swimmer's forward motion near a side wall, with the surrounding vorticity, when the action $A_0$ is always chosen; $\varphi$ and $d_r$ are the deflection angle of the swimmer's symmetry axis and the distance from the swimmer's mass center to the wall, respectively. (b) Angle deviation of the swimmer swimming forward over a duration of $t/T=10$ for different Reynolds numbers and initial distances to the side wall.
  • Figure 3: Trajectory and vorticity field of the swimmer's forward motion with a side wall at $d_w/d_0=2.5$ over a duration of $t/T=10$ for $Re=10$ (blue dashed line) and $100$ (red solid line).
  • Figure 4: Trajectory (red line) of the swimmer's forward motion over a duration of $t/T=10$ at $Re=1000$, along with the surrounding vorticity. (a) $d_w/d_0=1.5$. (b) $d_w/d_0=2$. Depending on the initial wall distance, the wall interacts differently with the vortex shed by the swimmer, which changes the deflection angle. Supplementary movie 1 illustrates the comparison between the two scenarios.
  • Figure 5: Time evolution of (a) the swimmer's force magnitude, (b) force direction angle, and (c) torque, which are normalized by their wall-free values. (d) Corresponding displacement in the $x$-direction. All simulations are conducted at $Re=100$ with various $d_w/d_0$.
  • ...and 9 more figures
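
The pathfinding and pilot-point steps in Figure 1(f) and (h) can be illustrated with a minimal sketch. It assumes a 4-connected occupancy grid (1 marks a detected boundary cell), unit step cost, a Manhattan-distance heuristic, and a pilot point taken a fixed number of cells ahead along the path. None of these choices are stated in the captions above, and the names `astar` and `pilot_point` and the `lookahead` parameter are hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid; returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # entries: (f, g, cell)
    g_cost = {start: 0}
    came_from = {}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = new_g
                    came_from[(nr, nc)] = cell
                    heapq.heappush(
                        open_heap,
                        (new_g + heuristic((nr, nc)), new_g, (nr, nc)),
                    )
    return None  # goal unreachable with the currently detected boundaries


def pilot_point(path, lookahead=3):
    """Pick the path cell `lookahead` steps ahead, clamped to the goal."""
    return path[min(lookahead, len(path) - 1)]
```

In a re-planning loop, `astar` would be called again whenever newly detected boundary cells are written into `grid`, and the swimmer would steer toward the cell returned by `pilot_point` rather than toward the final target.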