Sonar-MASt3R: Real-Time Opti-Acoustic Fusion in Turbid, Unstructured Environments

Amy Phung, Richard Camilli

Abstract

Underwater intervention is an important capability in several marine domains, with numerous industrial, scientific, and defense applications. However, existing perception systems used during intervention operations rely on data from optical cameras, which limits capabilities in poor visibility or lighting conditions. Prior work has examined opti-acoustic fusion methods, which use sonar data to resolve the depth ambiguity of the camera data while using camera data to resolve the elevation angle ambiguity of the sonar data. However, existing methods cannot achieve dense 3D reconstructions in real time, and few studies have reported results from applying these methods in a turbid environment. In this work, we propose the opti-acoustic fusion method Sonar-MASt3R, which uses MASt3R to extract dense correspondences from optical camera data in real time and pairs them with geometric cues from an acoustic 3D reconstruction to ensure robustness in turbid conditions. Experimental results using data recorded from an opti-acoustic eye-in-hand configuration across turbidity values ranging from <0.5 to >12 NTU highlight this method's improved robustness to turbidity relative to baseline methods.

Paper Structure (15 sections, 12 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: The optical camera intrinsics and extrinsics are used to render a depth image from the acoustic 3D reconstruction, which is used to correct the scale of the pointmap computed by MASt3R.
  • Figure 2: (left) Before and (right) after applying histogram equalization to an optical image of a granite boulder in $\sim$8 NTU.
  • Figure 3: An "opti-acoustic eye-in-hand" configuration is used to record the datasets. The sonar is mounted level relative to the manipulator's wrist, and the camera is mounted 30 degrees off vertical.
  • Figure 4: Illustration of different turbidity levels within each of the recorded datasets. NTU values: (a) $<0.5$, (b) 0.83, (c) 2.61, (d) 3.92, (e) 5.41, (f) 7.84, (g) 11.31, (h) 12.39.
  • Figure 5: Optical 3D reconstruction results from (a) Sonar-MASt3R, (b) Metashape, and (c) MASt3R-SLAM using datasets A (1), C (2), E (3), and F (4) with turbidity values from $<$0.5 to 8 NTU. A 1-meter grid is included in Sonar-MASt3R's results for scale. No scale reference is provided for Metashape or MASt3R-SLAM since these methods do not produce metric-scale reconstructions.
  • ...and 1 more figures
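
The operations named in the captions of Figures 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`render_depth`, `estimate_scale`, `equalize_histogram`), the pinhole camera model, the nearest-point z-buffer, and the median-ratio scale estimator are all assumptions made for illustration; this excerpt does not state how the paper renders the depth image or computes the scale correction.

```python
import numpy as np


def render_depth(points_world, K, T_cam_world, height, width):
    """Project a 3D point cloud into a pinhole camera and keep the nearest depth per pixel.

    points_world: (N, 3) points, e.g. from an acoustic reconstruction (assumed input).
    K: (3, 3) camera intrinsics. T_cam_world: (4, 4) world-to-camera extrinsics.
    Returns an (height, width) depth image with NaN where no point projects.
    """
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0]           # keep points in front of the camera
    uvw = (K @ pts_cam.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth = np.full((height, width), np.inf)
    # Unbuffered minimum: when several points hit one pixel, the closest one wins.
    np.minimum.at(depth, (v[inside], u[inside]), pts_cam[inside, 2])
    depth[np.isinf(depth)] = np.nan
    return depth


def estimate_scale(pointmap_z, sonar_depth, min_valid=10):
    """Median ratio of metric (sonar-rendered) depth to up-to-scale pointmap depth.

    The median-ratio estimator is an assumption chosen for robustness to outliers.
    """
    valid = (
        np.isfinite(pointmap_z) & np.isfinite(sonar_depth)
        & (pointmap_z > 0) & (sonar_depth > 0)
    )
    if valid.sum() < min_valid:
        raise ValueError("too few overlapping depth pixels to estimate a scale")
    return float(np.median(sonar_depth[valid] / pointmap_z[valid]))


def equalize_histogram(gray):
    """Global histogram equalization of an 8-bit grayscale image (dtype uint8)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                      # CDF at the darkest occupied bin
    denom = max(int(cdf[-1] - cdf_min), 1)         # guard against flat images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255).astype(np.uint8)
    return lut[gray]
```

Under these assumptions, multiplying an up-to-scale pointmap by the value returned from `estimate_scale` places it in the metric units of the rendered depth image, and `equalize_histogram` would be applied to each optical frame before correspondence extraction.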