Range-Agnostic Multi-View Depth Estimation With Keyframe Selection

Andrea Conti; Matteo Poggi; Valerio Cambareri; Stefano Mattoccia

TL;DR

RAMDepth estimates depth from multiple posed views without relying on a prior scene depth range. It is a range-agnostic, purely 2D framework that reverses the usual order of matching and depth estimation: depth is refined iteratively along epipolar lines using deformable, correlation-guided sampling. The depth state $D^s$ is updated by a GRU, and the final depth is upsampled via convex upsampling. A key byproduct is a set of per-view matchability scores, which allow source views to be ranked and, potentially, pruned to save computation. Across diverse datasets, RAMDepth achieves accurate depth without depth-range priors and generalizes to monocular video and stereo setups, while offering a practical mechanism to select informative views for efficient inference.
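To make the idea of refining depth "along epipolar lines" concrete, here is a minimal NumPy sketch of the underlying multi-view geometry. It is not the RAMDepth implementation: the function name `project_to_source`, the pinhole intrinsics, and the toy relative pose are all assumptions chosen for illustration. It shows that changing the depth hypothesis of a target pixel moves its projection along a line in the source image, which is where a method of this kind would look up correlation features.

```python
import numpy as np

def project_to_source(uv, depth, K_tgt, K_src, T_src_from_tgt):
    """Project target pixel uv=(u, v), assumed to lie at `depth`, into the source image.

    K_tgt, K_src: 3x3 pinhole intrinsics (hypothetical values in the demo below).
    T_src_from_tgt: 4x4 rigid transform from target to source camera frame.
    """
    u, v = uv
    # Back-project the pixel to a 3D point in the target camera frame.
    ray = np.linalg.inv(K_tgt) @ np.array([u, v, 1.0])
    point_tgt = depth * ray
    # Express the point in the source camera frame.
    point_src = T_src_from_tgt[:3, :3] @ point_tgt + T_src_from_tgt[:3, 3]
    # Perspective projection with the source intrinsics.
    proj = K_src @ point_src
    return proj[:2] / proj[2]

# Toy setup (assumed): identical intrinsics, pure translation along x.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)
T[0, 3] = -0.1

# Sweep a few depth hypotheses for the same target pixel.
samples = [project_to_source((50.0, 50.0), d, K, K, T) for d in (1.0, 2.0, 4.0)]
for d, p in zip((1.0, 2.0, 4.0), samples):
    print(f"depth={d}: source pixel={p}")
```

In this toy configuration the projected points share the same row (v = 50) and only their column changes with depth, so the sweep traces a horizontal epipolar line. An iterative scheme can update the depth, re-project, and sample again without ever fixing a minimum or maximum depth in advance.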

Abstract

Methods for 3D reconstruction from posed frames require prior knowledge about the scene metric range, usually to recover matching cues along the epipolar lines and narrow the search range. However, such a prior might not be directly available, or might be estimated inaccurately, in real scenarios (e.g., outdoor 3D reconstruction from video sequences), heavily hampering performance. In this paper, we focus on multi-view depth estimation without requiring prior knowledge about the metric range of the scene by proposing RAMDepth, an efficient and purely 2D framework that reverses the order of the depth estimation and matching steps. Moreover, we demonstrate the capability of our framework to provide rich insights about the quality of the views used for prediction. Additional material can be found on our project page https://andreaconti.github.io/projects/range_agnostic_multi_view_depth.
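The view-quality information mentioned above lends itself to a simple selection step. The sketch below is an illustration only: it assumes that one scalar score per source view is already available and that a higher score means a more useful view. The function name `select_views` and the example scores are hypothetical and do not come from the paper.

```python
import numpy as np

def select_views(scores, keep):
    """Return the indices of the `keep` highest-scoring source views.

    scores: one scalar per source view (assumed: higher = more informative).
    keep:   number of source views to retain.
    """
    scores = np.asarray(scores, dtype=float)
    # argsort is ascending, so reverse it to rank the best views first.
    ranking = np.argsort(scores)[::-1]
    return ranking[:keep].tolist()

# Hypothetical scores for four source views.
example_scores = [0.2, 0.9, 0.5, 0.7]
kept = select_views(example_scores, keep=2)
print(kept)  # [1, 3]
```

Running inference only on the retained views trades a small amount of information for lower computation, which is the use case the abstract points to.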

Paper Structure (5 sections, 4 equations, 6 figures, 5 tables)

Introduction
Related Work
Proposed Framework
Experimental Results
Conclusion

Figures (6)

Figure 1: Depth estimation and 3D reconstruction with RAMDepth on Blended (Yao et al., 2020). Top: given five images of the same scene, our framework estimates accurate depth maps through multi-view geometry without requiring any knowledge about the depth range of the reference view. Bottom: the point cloud obtained from the network prediction and the corresponding ground truth.
Figure 2: RAMDepth Architecture Description. Our model instantiates an initial depth map and builds a pair-wise correlation table between the target view and each source image (in dark and light blue). Then, deformable sampling is iteratively performed over it, and the depth state is updated accordingly. Final depth prediction is upsampled through convex upsampling.
Figure 3: Qualitative results on Blended. Our approach produces consistent and visually pleasing depth maps, without the visible outliers that can be observed in competitor methods.
Figure 4: Keyframes Ranking. We plot the RMSE achieved by dropping input views in random order (red) or according to the ranking information provided by RAMDepth (black).
Figure 5: TartanAir Qualitatives. TartanAir provides a wide range of complex environments; we show a few examples along with the predictions by RAMDepth.
...and 1 more figure
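The caption of Figure 2 mentions convex upsampling as the last step. As a rough illustration of that general technique (popularized by optical-flow networks), the NumPy sketch below computes each high-resolution depth value as a convex combination of the 3x3 low-resolution neighbourhood around its parent pixel. The function name `convex_upsample`, the tensor layout, and the edge padding are assumptions made for this example; the weights would normally be predicted by the network, whereas here they are random.

```python
import numpy as np

def convex_upsample(depth, logits, factor):
    """Upsample a coarse depth map with per-pixel convex weights.

    depth:  (H, W) coarse depth map.
    logits: (9, factor, factor, H, W) unnormalized weights, one set of 9
            per fine pixel (assumed layout for this sketch).
    Returns a (H * factor, W * factor) depth map.
    """
    H, W = depth.shape
    # Softmax over the 9 neighbours so the weights are positive and sum to 1.
    weights = np.exp(logits - logits.max(axis=0, keepdims=True))
    weights /= weights.sum(axis=0, keepdims=True)
    # Gather the 3x3 neighbourhood of every coarse pixel (edge padding assumed).
    padded = np.pad(depth, 1, mode="edge")
    neighbours = np.stack([padded[dy:dy + H, dx:dx + W]
                           for dy in range(3) for dx in range(3)])
    # Weighted sum over neighbours gives shape (factor, factor, H, W).
    fine = (weights * neighbours[:, None, None]).sum(axis=0)
    # Interleave the sub-pixel axes with the spatial axes.
    return fine.transpose(2, 0, 3, 1).reshape(H * factor, W * factor)

rng = np.random.default_rng(0)
coarse = np.full((2, 3), 3.0)
upsampled = convex_upsample(coarse, rng.normal(size=(9, 2, 2, 2, 3)), factor=2)
print(upsampled.shape)
```

Because the weights are convex, every output value lies between the minimum and maximum of its 3x3 neighbourhood. In particular, a constant coarse map stays constant after upsampling regardless of the weights, which is a quick sanity check for an implementation like this one.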
