RomniStereo: Recurrent Omnidirectional Stereo Matching
Hualie Jiang, Rui Xu, Minglang Tan, Wenjie Jiang
TL;DR
RomniStereo addresses the challenge of effective $360^{\circ}$ depth sensing with a four-fisheye rig by introducing a RAFT-inspired recurrent update framework that bypasses costly 3D encoders. The method links spherical sweeping outputs to a 2D GRU through opposite adaptive weighting, grid embedding, and adaptive context generation, enabling end-to-end training and strong depth accuracy. Empirical results show an average MAE improvement of $40.7\%$ over the prior state of the art (SOTA) across five datasets, along with faster inference as the model capacity scales. This work advances practical omnidirectional depth sensing by combining geometry-aware feature fusion with efficient recurrent matching, making it suitable for robust robot navigation and related applications.
Abstract
Omnidirectional stereo matching (OSM) is an essential and reliable means for $360^{\circ}$ depth sensing. However, following earlier works on conventional stereo matching, prior state-of-the-art (SOTA) methods rely on a 3D encoder-decoder block to regularize the cost volume, which makes the whole system complicated and leads to sub-optimal results. Recently, approaches based on Recurrent All-pairs Field Transforms (RAFT) have employed recurrent updates in 2D and efficiently improved image-matching tasks, i.e., optical flow and stereo matching. To bridge the gap between OSM and RAFT, our main contribution is an opposite adaptive weighting scheme that seamlessly transforms the outputs of OSM's spherical sweeping into the inputs required by the recurrent update, thus creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm. Furthermore, we introduce two techniques, i.e., grid embedding and adaptive context feature generation, which also contribute to RomniStereo's performance. Our best model improves the average MAE metric by 40.7\% over the previous SOTA baseline across five datasets. In visualizations, our models demonstrate clear advantages on both synthetic and realistic examples. The code is available at \url{https://github.com/HalleyJiang/RomniStereo}.
