EgoExo++: Integrating On-demand Exocentric Visuals with 2.5D Ground Surface Estimation for Interactive Teleoperation of Subsea ROVs

Adnan Abdullah, Ruo Chen, Ioannis Rekleitis, Md Jahidul Islam

TL;DR

The paper addresses the limited situational awareness of egocentric underwater ROV teleoperation by introducing EgoExo++, a geometry-driven, real-time pipeline that synthesizes on-demand exocentric views and a textured 2.5D ground surface from egocentric camera feeds and monocular SLAM estimates. It integrates exocentric view synthesis and ROV rendering into a SLAM backbone, offering 2D (EgoExo) and 2.5D (EgoExo++) perspectives that give the operator additional peripheral context. The approach is validated through 2D indoor navigation experiments and 3D underwater cave field trials, and a user study with 15 operators indicates improved situational awareness, navigation safety, and task efficiency. The authors also point to applications in shared autonomy and operator training, and outline future directions toward multi-sensor SLAM backbones and richer interactive and simulation capabilities.

Abstract

Underwater ROVs (Remotely Operated Vehicles) are indispensable for subsea exploration and task execution, yet typical teleoperation engines based on egocentric (first-person) video feeds restrict human operators' field-of-view and limit precise maneuvering in complex, unstructured underwater environments. To address this, we propose EgoExo, a geometry-driven solution integrated into a visual SLAM pipeline that synthesizes on-demand exocentric (third-person) views from egocentric camera feeds. Our proposed framework, EgoExo++, extends beyond 2D exocentric view synthesis (EgoExo) by adding a dense 2.5D ground surface estimate on the fly. It simultaneously renders the ROV model onto this reconstructed surface, enhancing semantic perception and depth comprehension. The computations involved are closed-form and rely solely on egocentric views and monocular SLAM estimates, which makes the framework portable across existing teleoperation engines and robust to varying waterbody characteristics. We validate the geometric accuracy of our approach through extensive experiments in 2-DOF indoor navigation and 6-DOF underwater cave exploration under challenging low-light conditions. Quantitative metrics confirm the reliability of the rendered Exo views, while a user study involving 15 operators demonstrates improved situational awareness, navigation safety, and task efficiency during teleoperation. Furthermore, we highlight the role of EgoExo++ augmented visuals in supporting shared autonomy, operator training, and embodied teleoperation. This new interactive approach to ROV teleoperation presents promising opportunities for future research in subsea telerobotics.

Paper Structure (16 sections, 11 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: The proposed teleoperation interface is demonstrated for an underwater cave exploration scenario with an ROV. The traditional console interfaces are based on egocentric views (top left), which are limiting and disorienting to a surface operator in noisy low-light conditions. Our EgoExo solution [abdullah2024ego] offers on-demand exocentric views from a fixed EOB (eye on the back) viewpoint, i.e., third-person views from behind the ROV (bottom left). In EgoExo++, we further integrate dynamic 2.5D exocentric views, with the ROV rendered above a textured ground surface. These interactive view options are integrated into a standard BlueROV2 console (by Blue Robotics Inc.) for a significantly improved teleoperation experience.
  • Figure 2: The computational pipeline is shown. From historical egocentric views and SLAM-derived poses, EgoExo computes a 2D exocentric image by applying pose geometry to project the ROV model; a sparse map of the environment is also constructed using SLAM-derived feature points. EgoExo++ reuses the feature points to fit a ground plane via RANSAC, then generates a textured 2.5D ground surface, and augments the ROV mesh to produce interactive exocentric views.
  • Figure 3: We conduct 2D indoor navigation experiments with a TurtleBot4 to validate the geometric accuracy of our algorithm; here, results are visualized for ground plane estimation and reprojection errors of known reference points in the scene.
  • Figure 4: EgoExo and EgoExo++ views are shown for field trials conducted in the Peacock Springs cave system, Florida. The EgoExo pipeline generates 2D exocentric imagery from directly behind the ROV, along with a sparse 3D map of the environment. EgoExo++ extends this by reconstructing the ground surface and offering full $360^\circ$ exocentric viewpoints.
  • Figure 5: A snapshot from our cave exploration scenario: (a) Egocentric view with detected reference points; and (b) Synthesized EgoExo view with projected ROV point cloud. We use a sample logo for homographic projection on the reference surface to demonstrate the accuracy in pose estimation.
  • ...and 3 more figures
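
The Figure 2 caption says that EgoExo++ fits a ground plane to SLAM-derived feature points via RANSAC. The following is a minimal sketch of that generic technique, not the authors' implementation. It runs on synthetic points that stand in for SLAM features. The function name `fit_plane_ransac`, the iteration count, the inlier tolerance, and the final least-squares refinement on the inliers are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def fit_plane_ransac(points, n_iters=200, inlier_tol=0.02, seed=0):
    """Fit a plane n.x + d = 0 to an (N, 3) point array with RANSAC.

    Hypothetical helper for illustration; parameters are assumed values.
    Returns (unit_normal, d, inlier_mask).
    """
    rng = np.random.default_rng(seed)
    n_pts = points.shape[0]
    best_mask, best_count = None, 0

    for _ in range(n_iters):
        # Minimal sample: three distinct points define a candidate plane.
        idx = rng.choice(n_pts, size=3, replace=False)
        p0, p1, p2 = points[idx]
        normal = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(normal)
        if length < 1e-12:
            continue  # (near-)collinear sample, no unique plane
        normal = normal / length
        d = -normal @ p0

        # Score the candidate by counting points within the tolerance.
        mask = np.abs(points @ normal + d) < inlier_tol
        count = int(mask.sum())
        if count > best_count:
            best_count, best_mask = count, mask

    if best_mask is None:
        raise ValueError("no non-degenerate sample found")

    # Assumed refinement step: least-squares plane through the inliers.
    # The right singular vector of the smallest singular value is the normal.
    inliers = points[best_mask]
    centroid = inliers.mean(axis=0)
    _, _, vt = np.linalg.svd(inliers - centroid)
    normal = vt[-1]
    if normal[2] < 0:
        normal = -normal  # fix the sign so the normal points along +z
    d = -normal @ centroid
    return normal, d, best_mask


# Synthetic stand-in for SLAM feature points: a noisy plane z = 0.5
# plus uniformly scattered outliers.
demo_rng = np.random.default_rng(1)
ground = np.column_stack([
    demo_rng.uniform(-1.0, 1.0, 300),
    demo_rng.uniform(-1.0, 1.0, 300),
    0.5 + demo_rng.normal(0.0, 0.005, 300),
])
outliers = demo_rng.uniform(-1.0, 1.0, (60, 3))
cloud = np.vstack([ground, outliers])

normal, d, mask = fit_plane_ransac(cloud)
print("normal:", normal, "offset d:", d, "inliers:", int(mask.sum()))
```

In a real pipeline the input would be the sparse map points from the SLAM backbone, and the inlier tolerance would need to be chosen relative to the map scale, which is arbitrary for monocular SLAM.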