MARVO: Marine-Adaptive Radiance-aware Visual Odometry
Sacchin Sundar, Atman Kikani, Aaliya Alam, Sumukh Shrote, A. Nayeemulla Khan, A. Shahina
TL;DR
MARVO tackles underwater visual odometry by fusing physics-aware front-end perception with a probabilistic visual–inertial–barometric back-end and an offline RL-driven pose-graph optimizer (RL-PGO). The front-end extends LoFTR with a Physics-Aware Radiance Adapter (PARA) that compensates for wavelength-dependent attenuation, enabling stable semi-dense correspondences under turbidity. The back-end uses a fixed-lag GTSAM estimator with PARA-enhanced visual factors and barometric depth, followed by RL-PGO, which optimizes the trajectory for global consistency in SE(2) before restoring it to SE(3). Training relies on synthetic data generated via SyreaNet plus fine-tuning on real field data, and evaluations show improved AUC, ATE, and drift over baselines on both synthetic and real underwater sequences. The work highlights practical gains for underwater robotics, notes limitations due to data scarcity, and suggests future work on 3D mapping and acoustic depth integration.
Abstract
Underwater visual localization remains challenging due to wavelength-dependent attenuation, poor texture, and non-Gaussian sensor noise. We introduce MARVO, a physics-aware, learning-integrated odometry framework that fuses underwater image formation modeling, differentiable matching, and reinforcement-learning optimization. At the front-end, we extend a transformer-based feature matcher with a Physics-Aware Radiance Adapter that compensates for color-channel attenuation and contrast loss, yielding geometrically consistent feature correspondences under turbidity. These semi-dense matches are combined with inertial and pressure measurements inside a factor-graph back-end, where we formulate a keyframe-based visual-inertial-barometric estimator using the GTSAM library. Each keyframe introduces (i) pre-integrated IMU motion factors, (ii) MARVO-derived visual pose factors, and (iii) barometric depth priors, yielding a full-state MAP estimate in real time. Lastly, we introduce a Reinforcement-Learning-based Pose-Graph Optimizer that refines global trajectories beyond the local minima of classical least-squares solvers by learning optimal retraction actions on SE(2).
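To make the notion of wavelength-dependent attenuation concrete, the sketch below applies and then inverts the widely used simplified underwater image formation model, I_c = J_c t_c + B_c (1 - t_c) with t_c = exp(-beta_c d). It is only an illustration of the physical effect the adapter is meant to compensate, not the MARVO adapter itself. The function names, the attenuation coefficients, the background light values, and the assumption of a known, uniform scene distance are all hypothetical.

```python
import numpy as np

# Hypothetical per-channel attenuation coefficients (1/m) for R, G, B.
# Red is absorbed fastest in water; the values are illustrative only.
BETA = np.array([0.60, 0.12, 0.08])
# Hypothetical veiling (background) light per channel.
BACKGROUND = np.array([0.05, 0.35, 0.45])


def attenuate(scene, distance_m, beta=BETA, background=BACKGROUND):
    """Apply I = J * t + B * (1 - t), with transmission t = exp(-beta * d)."""
    t = np.exp(-beta * distance_m)  # per-channel transmission, shape (3,)
    return scene * t + background * (1.0 - t)


def compensate(observed, distance_m, beta=BETA, background=BACKGROUND):
    """Invert the model to recover scene radiance J from the observed image I."""
    t = np.exp(-beta * distance_m)
    return (observed - background * (1.0 - t)) / t


# Random RGB "scene" with values in [0, 1], viewed through 3 m of water.
scene = np.random.default_rng(0).uniform(0.0, 1.0, size=(4, 4, 3))
observed = attenuate(scene, distance_m=3.0)
restored = compensate(observed, distance_m=3.0)
```

In this toy setting the inversion is exact because the distance and coefficients are known. In real imagery they are not, which is why a learned compensation is attractive.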
