Demonstrating CavePI: Autonomous Exploration of Underwater Caves by Semantic Guidance
Alankrit Gupta, Adnan Abdullah, Xianyao Li, Vaishnav Ramesh, Ioannis Rekleitis, Md Jahidul Islam
TL;DR
This work tackles GPS-denied autonomous underwater cave exploration by introducing CavePI, a compact AUV that navigates using semantic guidance from cavelines detected by a real-time, edge-optimized segmentation pipeline. The system combines lightweight perception on a Jetson Nano, a PID-based visual servoing controller, and a ROS2-based digital twin for pre-mission planning and testing, validated through extensive field trials in springs and caves. Key contributions include the CavePI hardware design, the semantically guided navigation pipeline, and a comprehensive evaluation framework spanning lab, simulation, and real-world deployments, with frank discussion of limitations and practical insights. The approach demonstrates robust autonomous navigation in challenging, feature-deprived environments and highlights paths toward more capable, scalable underwater cave exploration, including future compute upgrades and advanced SLAM-based perception and planning.
Abstract
Enabling autonomous robots to safely and efficiently navigate, explore, and map underwater caves is important for water resource management, hydrogeology, archaeology, and marine robotics. In this work, we demonstrate the system design and algorithmic integration of a visual servoing framework for semantically guided autonomous underwater cave exploration. We present the hardware and edge-AI design considerations for deploying this framework on a novel AUV (Autonomous Underwater Vehicle) named CavePI. The guided navigation is driven by a computationally light yet robust deep visual perception module that delivers a rich semantic understanding of the environment. A robust control mechanism then enables CavePI to track the semantic guides and navigate within complex cave structures. We evaluate the system through field experiments in natural underwater caves and spring-water sites and further validate its ROS (Robot Operating System)-based digital twin in a simulation environment. Our results highlight how these integrated design choices facilitate reliable navigation under feature-deprived, GPS-denied, and low-visibility conditions.
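To make the idea of PID-based visual servoing on a segmented caveline concrete, the following is a minimal sketch under stated assumptions, not the CavePI implementation. It assumes the perception module yields a binary mask in which caveline pixels are nonzero, reduces that mask to a normalized horizontal offset from the image center, and feeds the offset to a textbook PID loop to produce a yaw command. The names `PID` and `lateral_offset`, the gain values, and the 5-pixel-wide mask are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PID:
    """Textbook PID controller (hypothetical gains, not the paper's values)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        """Return the control output for the current error sample."""
        self.integral += error * dt
        # No derivative term on the first sample, since there is no history yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def lateral_offset(mask: List[List[int]]) -> Optional[float]:
    """Normalized horizontal offset of caveline pixels from the image center.

    Returns a value in [-1, 1] (negative means the line is left of center),
    or None when the mask contains no caveline pixels.
    """
    width = len(mask[0])
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not cols:
        return None
    half = (width - 1) / 2
    if half == 0:
        return 0.0  # a one-pixel-wide image has no lateral information
    centroid = sum(cols) / len(cols)
    return (centroid - half) / half


if __name__ == "__main__":
    # Assumed toy mask: the caveline runs down column 3 of a 5-pixel-wide image.
    mask = [
        [0, 0, 0, 1, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 1, 0],
    ]
    yaw_pid = PID(kp=2.0, ki=0.0, kd=0.1)
    error = lateral_offset(mask)
    if error is not None:
        print("yaw command:", yaw_pid.step(error, dt=0.1))
```

In this sketch a `None` offset signals that no caveline is visible. A real system would need a recovery or search behavior for that case, which is outside the scope of this illustration.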
