A Geometrically Consistent Matching Framework for Side-Scan Sonar Mapping

Can Lei, Hayat Rajani, Nuno Gracias, Rafael Garcia, Huigang Wang

TL;DR

This work tackles the challenge of robust side-scan sonar image matching amid view-dependent backscatter, shadows, and geometric distortion. It introduces a physics-guided, geometrically consistent framework that decomposes raw SSS imagery into seabed reflectivity $\rho(x,y)$, terrain elevation $z(x,y)$, and acoustic path loss $L(x,y)$ using a Lambertian-based model, and performs training-free matching on the reflectivity map using SuperPoint+MINIMA-LightGlue. Geometry-aware outlier rejection leverages the predicted shadow map and elevation-derived priors to remove mismatches in occluded or topographically inconsistent regions, followed by RANSAC-based homography estimation for registration and fusion. Quantitative and qualitative evaluations demonstrate improved matching accuracy, higher geometric consistency, and robustness to viewpoint variations compared with traditional, CNN-based, and Transformer-based state-of-the-art methods, highlighting the approach as data-efficient and physically interpretable for high-precision SSS mapping in complex seafloor environments. The framework advances practical seafloor mapping by delivering stable, cross-view correspondences with reduced reliance on large labeled datasets.

Abstract

Robust matching of side-scan sonar imagery remains a fundamental challenge in seafloor mapping due to view-dependent backscatter, shadows, and geometric distortion. This paper proposes a novel matching framework that combines physical decoupling and geometric consistency to enhance correspondence accuracy and consistency across viewpoints. A multi-branch network, derived from the Lambertian reflection model, decomposes raw sonar images into seabed reflectivity, terrain elevation, and acoustic path loss. The reflectivity map, serving as a stable matching domain, is used in conjunction with a training-free matching pipeline combining SuperPoint and MINIMA-LightGlue. Geometry-aware outlier rejection leverages both terrain elevation and its physically derived shadow map to further remove mismatches in acoustically occluded and topographically inconsistent regions, thereby improving registration accuracy. Quantitative and visual evaluations against traditional, CNN-, and Transformer-based state-of-the-art methods demonstrate that our method achieves lower matching error, higher geometric consistency, and greater robustness to viewpoint variations. The proposed approach provides a data-efficient, physically interpretable solution for high-precision side-scan sonar image matching in complex seafloor environments.
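
The decomposition described above can be illustrated by running the physics in the forward direction. The following is a minimal sketch, not the paper's network or its exact equations, which this summary does not reproduce. It assumes the Lambertian image formation takes the form $I = \rho \cdot \cos\theta \cdot L$, that $L$ acts as a multiplicative attenuation factor, and that shadows follow from a simple line-of-sight test on a single across-track ping. The function name `simulate_ping` and all of its parameters are hypothetical.

```python
import math


def simulate_ping(x, z, rho, path_loss, sensor_height):
    """Render one across-track ping under an assumed Lambertian model.

    x: increasing across-track distances from the sonar (> 0), length >= 2.
    z: seabed elevation at each x; rho: reflectivity at each x.
    path_loss: multiplicative attenuation factor per sample (an assumption).
    Returns (intensity, shadow) lists with the same length as x.
    """
    n = len(x)
    intensity, shadow = [], []
    horizon = -math.inf  # highest line-of-sight slope seen so far
    for i in range(n):
        # Finite-difference terrain slope dz/dx (one-sided at the ends).
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        slope = (z[hi] - z[lo]) / (x[hi] - x[lo])
        # Unit surface normal and the vector from the seabed point to the sonar.
        norm = math.hypot(slope, 1.0)
        nx, nz = -slope / norm, 1.0 / norm
        dx, dz = -x[i], sensor_height - z[i]
        r = math.hypot(dx, dz)
        cos_theta = max(0.0, (nx * dx + nz * dz) / r)
        # Horizon test: the point is shadowed if nearer terrain blocks the ray.
        sight = (z[i] - sensor_height) / x[i]
        occluded = sight < horizon
        horizon = max(horizon, sight)
        shadow.append(occluded)
        intensity.append(0.0 if occluded else rho[i] * cos_theta * path_loss[i])
    return intensity, shadow
```

On a flat seabed the sketch reduces to $\cos\theta = h / \sqrt{x^2 + h^2}$ for a sonar at height $h$, so intensity falls off with range even when reflectivity is constant. This view dependence is one reason the abstract treats the reflectivity map, rather than the raw image, as the stable matching domain.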

Paper Structure

This paper contains 43 sections, 18 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Overview of the proposed side-scan sonar image matching framework. It includes four modules: physical decomposition via a multi-branch network, training-free feature matching on seabed reflectivity, geometric outlier rejection using elevation and shadow priors, and homography estimation with RANSAC.
  • Figure 2: Overview of the proposed PhysDNet framework and the geometric definitions used in the physics-aware model. PhysDNet employs a three-branch architecture to decouple SSS images into reflectivity ($\rho$), terrain elevation ($z$), and path loss ($L$), guided by the Lambertian reflection model. A threshold-based shadow map provides weak supervision, while predicted elevation supports physical computation of $\cos\theta$ and a physics-driven shadow map.
  • Figure 3: Overview of the proposed feature extraction and matching pipeline. (a) SuperPoint-based keypoint and descriptor extraction from reflectivity image $\rho$, using a shared encoder and dual decoders. (b) Feature matching via MINIMA-LightGlue, where projected descriptors are refined through cross-attentional Transformer blocks to produce reliable one-to-one correspondences.
  • Figure 4: Shadow- and terrain-guided outlier removal strategy. Matched keypoints are filtered by excluding those located in predicted shadow regions and terrain underestimation zones, improving the reliability of cross-view correspondences.
  • Figure 5: Visualization of matching and registration results from our method: (a) shows the final matched keypoint pairs for 8 test cases, with keypoints and match lines clearly visualized; (b) presents the corresponding registration results, where each moving image is aligned based on a homography matrix estimated from the matched points in (a). A weighted image blending strategy (not true image fusion) is used to overlay the registered image onto the fixed image, for visual assessment of alignment accuracy. The first row in (b) shows registration based on USBL data, the second row shows results after USBL correction, and the third row shows registration using our method, demonstrating more accurate alignment.
  • ...and 2 more figures
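
The last two stages named in Figures 1 and 4, shadow-guided outlier removal followed by RANSAC homography estimation, can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes keypoints are (x, y) pixel coordinates inside the image, that shadow maps are boolean masks, and it substitutes a plain unnormalized direct linear transform for whichever homography solver the paper uses. The terrain-based prior is omitted because this summary does not define it. All function names and the default threshold and iteration count are hypothetical.

```python
import numpy as np


def filter_matches(kpts_a, kpts_b, shadow_a, shadow_b):
    """Keep matches whose two endpoints both lie outside the shadow masks.

    kpts_a, kpts_b: (N, 2) arrays of (x, y) pixel coordinates.
    shadow_a, shadow_b: boolean (H, W) masks, True where shadowed.
    """
    xa, ya = np.round(kpts_a).astype(int).T
    xb, yb = np.round(kpts_b).astype(int).T
    keep = ~shadow_a[ya, xa] & ~shadow_b[yb, xb]
    return kpts_a[keep], kpts_b[keep]


def fit_homography(src, dst):
    """Direct linear transform from >= 4 point pairs; None if degenerate."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)  # null-space vector of the constraint matrix
    if abs(h[2, 2]) < 1e-12:
        return None
    return h / h[2, 2]


def project(h, pts):
    """Apply homography h to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return pts_h[:, :2] / pts_h[:, 2:3]


def ransac_homography(src, dst, thresh=3.0, iters=500, seed=0):
    """Return (H, inlier_mask) from the largest consensus set found."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[idx], dst[idx])
        if h is None:
            continue
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(h, src) - dst, axis=1)
        inliers = err < thresh  # NaN errors compare False and are rejected
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("too few inliers to estimate a homography")
    # Refit on the full consensus set for the final estimate.
    return fit_homography(src[best], dst[best]), best
```

In use, `filter_matches` would run first on the correspondences returned by the matcher, and the surviving pairs would then be passed to `ransac_homography`. Removing shadow-region matches beforehand raises the inlier ratio, so fewer RANSAC iterations are needed to draw an all-inlier sample.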