Demonstrating CavePI: Autonomous Exploration of Underwater Caves by Semantic Guidance

Alankrit Gupta, Adnan Abdullah, Xianyao Li, Vaishnav Ramesh, Ioannis Rekleitis, Md Jahidul Islam

TL;DR

This work tackles GPS-denied autonomous underwater cave exploration by introducing CavePI, a compact AUV that navigates using semantic guidance from cavelines detected by a real-time, edge-optimized segmentation pipeline. The system combines lightweight perception on a Jetson Nano, a PID-based visual servoing controller, and a ROS2-based digital twin for pre-mission planning and testing, validated through extensive field trials in springs and caves. Key contributions include the CavePI hardware design, the semantically guided navigation pipeline, and a comprehensive evaluation framework spanning lab, simulation, and real-world deployments, with frank discussion of limitations and practical insights. The approach demonstrates robust autonomous navigation in challenging, feature-deprived environments and highlights paths toward more capable, scalable underwater cave exploration, including future compute upgrades and advanced SLAM-based perception and planning.

Abstract

Enabling autonomous robots to safely and efficiently navigate, explore, and map underwater caves is of significant importance to water resource management, hydrogeology, archaeology, and marine robotics. In this work, we demonstrate the system design and algorithmic integration of a visual servoing framework for semantically guided autonomous underwater cave exploration. We present the hardware and edge-AI design considerations to deploy this framework on a novel AUV (Autonomous Underwater Vehicle) named CavePI. The guided navigation is driven by a computationally light yet robust deep visual perception module, delivering a rich semantic understanding of the environment. Subsequently, a robust control mechanism enables CavePI to track the semantic guides and navigate within complex cave structures. We evaluate the system through field experiments in natural underwater caves and spring-water sites and further validate its ROS (Robot Operating System)-based digital twin in a simulation environment. Our results highlight how these integrated design choices facilitate reliable navigation under feature-deprived, GPS-denied, and low-visibility conditions.
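The idea of tracking a semantic guide with a PID-based visual servoing loop can be illustrated with a minimal sketch. This is not the CavePI implementation: the error signal (horizontal offset of the caveline centroid in a binary segmentation mask), the gains, the sign convention, and the names `PID`, `lateral_offset`, and `yaw_command` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PID:
    """Textbook PID controller; the gains here are illustrative, not tuned values."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # No derivative term on the first sample, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def lateral_offset(mask: List[List[int]]) -> Optional[float]:
    """Horizontal offset of caveline pixels from the image center, scaled to [-1, 1].

    Returns None when the mask contains no caveline pixels.
    Assumes a rectangular mask at least two pixels wide.
    """
    width = len(mask[0])
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not cols:
        return None
    centroid_x = sum(cols) / len(cols)
    center = (width - 1) / 2
    return (centroid_x - center) / center


def yaw_command(mask: List[List[int]], pid: PID, dt: float, limit: float = 1.0) -> float:
    """Map a segmentation mask to a clamped yaw command (assumed: positive = turn right)."""
    offset = lateral_offset(mask)
    if offset is None:
        # Assumed fallback: issue no yaw correction when the caveline is not visible.
        return 0.0
    return max(-limit, min(limit, pid.step(offset, dt)))


if __name__ == "__main__":
    # Hypothetical 3x5 mask with the caveline along the right edge of the frame.
    mask = [[0, 0, 0, 0, 1] for _ in range(3)]
    controller = PID(kp=0.5, ki=0.0, kd=0.0)
    print(yaw_command(mask, controller, dt=0.1))
```

In a real vehicle the same error signal would typically be combined with heading and depth terms; the sketch isolates the lateral correction to keep the control loop easy to follow.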

Paper Structure

This paper contains 26 sections, 9 equations, 17 figures, 4 tables, 1 algorithm.

Figures (17)

  • Figure 1: The CavePI AUV navigates by leveraging the semantic guidance of a caveline from its down-facing camera. A deep visual perception module extracts the semantic cues, which are processed by an onboard planner to make visual servoing decisions.
  • Figure 2: The proposed CavePI system design is shown; (a) isometric 3D view of the robot; (b) side-view and top-view displaying the outer shell, sonar, and thrusters' positions; (c) cross-sectional view presenting the assembly of the electronic components inside the computational enclosure; (d) the fully assembled system. CavePI is one-person deployable, weighs $8.8$ kg, and has a depth rating of $65$ meters ($213$ ft).
  • Figure 3: Major electronics and sensor-actuator connections of CavePI.
  • Figure 4: Data flow among major computational modules of CavePI is shown in the form of ROS Topics: red and blue arrows represent subscribed and published topics in the ROS graph, respectively.
  • Figure 5: Simplified model architecture for caveline segmentation is shown; we use a DeepLabV3 [chen2017deeplab] head with two choices for backbone network: MobileNetV3 [howard2019searching] and ResNet101 [he2016deep].
  • ...and 12 more figures