Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning

Davide Corsi, Davide Camponogara, Alessandro Farinelli

TL;DR

The paper tackles the problem of training deep reinforcement learning (DRL) agents for aquatic navigation in non-stationary water environments, proposing a Unity3D-based simulator that supports both underwater and surface scenarios. It presents a PPO-based training pipeline augmented with curriculum learning, learnable hyperparameters, and safety-oriented reward shaping, and provides an extensive set of ablations to establish baselines. Key contributions include a realistic, open-source aquatic benchmark; a configurable training pipeline that combines curriculum learning, dense rewards, and safety considerations; and validation on a photogrammetry-derived cave model of Porth yr Ogof. Findings indicate that curriculum learning can improve safety and generalization, that dense rewards are essential for convergence, and that the combined method yields promising policies while leaving generalization and safety as open challenges. Overall, the work delivers a practical, reproducible platform for advancing safe DRL in aquatic robotics.

Abstract

An exciting and promising frontier for Deep Reinforcement Learning (DRL) is its application to real-world robotic systems. While modern DRL approaches have achieved remarkable successes in many robotic scenarios (including mobile robotics, surgical assistance, and autonomous driving), unpredictable and non-stationary environments can pose critical challenges to such methods. These features can significantly undermine fundamental requirements for a successful training process, such as the Markovian properties of the transition model. To address this challenge, we propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and DRL. In more detail, we show that our benchmarking environment is challenging even for state-of-the-art DRL approaches, which may struggle to generate reliable policies in terms of generalization power and safety. Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques (such as curriculum learning and learnable hyperparameters). Our extensive empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results. Our simulation environment and training baselines are freely available to facilitate further research on this open problem and encourage collaboration in the field.
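
As a rough illustration of two ingredients named above, dense rewards and curriculum learning, the following minimal sketch may help. It is not the authors' implementation: the function names, the progress-based reward, the penalty and bonus values, and the promotion threshold are all hypothetical choices made for this example.

```python
def dense_reward(prev_dist, curr_dist, collided, reached,
                 progress_scale=1.0, collision_penalty=-1.0, goal_bonus=1.0):
    """Hypothetical dense reward: a signal at every step, not only at episode end.

    All constants are illustrative assumptions, not values from the paper.
    """
    if collided:
        # Safety term: a collision is penalized regardless of progress.
        return collision_penalty
    if reached:
        # Terminal bonus for reaching the target.
        return goal_bonus
    # Dense term: positive when the agent moved closer to the target.
    return progress_scale * (prev_dist - curr_dist)


def curriculum_level(success_rate, level, max_level, promote_at=0.8):
    """Hypothetical curriculum rule: move to a harder level once the agent
    succeeds often enough on the current one."""
    if success_rate >= promote_at and level < max_level:
        return level + 1
    return level


if __name__ == "__main__":
    # The agent moved from 5.0 to 4.0 units from the target without colliding.
    print(dense_reward(5.0, 4.0, collided=False, reached=False))
    # A 90% success rate on level 0 promotes the agent to level 1.
    print(curriculum_level(0.9, level=0, max_level=3))
```

In a sketch like this, the curriculum level would map to environment difficulty (for example, stronger water disturbances), while the reward stays the same across levels.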

Paper Structure

This paper contains 10 sections, 4 equations, and 13 figures.

Figures (13)

  • Figure 1: The figures depict two environments within our simulator. The first figure shows our Autonomous Underwater Vehicle (AUV) navigating a 3D model of Porth yr Ogof marine cave, while the second figure shows our surface drone in one of the scenarios from our marine benchmark. Although the two environments differ in their objective, they share the main challenges introduced by the aquatic environment.
  • Figure 2: Viscous liquid.
  • Figure 3: Runny liquid.
  • Figure 4: Illustration of the influence that water exerts on the AUV as it attempts to follow an ideal path.
  • Figure 5: Comparison between curriculum learning and E2E.
  • ...and 8 more figures