EgoExo++: Integrating On-demand Exocentric Visuals with 2.5D Ground Surface Estimation for Interactive Teleoperation of Subsea ROVs
Adnan Abdullah, Ruo Chen, Ioannis Rekleitis, Md Jahidul Islam
TL;DR
The paper addresses the limited situational awareness of egocentric underwater ROV teleoperation by introducing EgoExo++, a geometry-driven, real-time pipeline that synthesizes on-demand exocentric views and 2.5D ground textures from monocular SLAM data. It integrates exocentric view synthesis and ROV rendering into a SLAM backbone, enabling 2D and 2.5D perspectives (EgoExo and EgoExo++, respectively) that give the operator additional peripheral context. The approach is validated through 2-DOF indoor navigation experiments and 6-DOF underwater cave field trials, complemented by a user study indicating improved operator awareness, safety, and efficiency. The authors also point to its potential for shared autonomy and training applications. The work highlights practical benefits for deep subsea missions while outlining future directions toward multi-sensor SLAM backbones and richer interactive and simulation capabilities.
Abstract
Underwater ROVs (Remotely Operated Vehicles) are indispensable for subsea exploration and task execution, yet typical teleoperation engines based on egocentric (first-person) video feeds restrict human operators' field-of-view and limit precise maneuvering in complex, unstructured underwater environments. To address this, we propose EgoExo, a geometry-driven solution integrated into a visual SLAM pipeline that synthesizes on-demand exocentric (third-person) views from egocentric camera feeds. Our proposed framework, EgoExo++, extends beyond 2D exocentric view synthesis (EgoExo) by adding dense 2.5D ground surface estimation on the fly. It simultaneously renders the ROV model onto this reconstructed surface, enhancing semantic perception and depth comprehension. The computations involved are closed-form and rely solely on egocentric views and monocular SLAM estimates, which makes the framework portable across existing teleoperation engines and robust to varying waterbody characteristics. We validate the geometric accuracy of our approach through extensive experiments on 2-DOF indoor navigation and 6-DOF underwater cave exploration in challenging low-light conditions. Quantitative metrics confirm the reliability of the rendered Exo views, while a user study involving 15 operators demonstrates improved situational awareness, navigation safety, and task efficiency during teleoperation. Furthermore, we highlight the role of visuals augmented by EgoExo++ in supporting shared autonomy, operator training, and embodied teleoperation. This new interactive approach to ROV teleoperation presents promising opportunities for future research in subsea telerobotics.
