Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation
Mehdi Zayene, Jannik Endres, Albias Havolli, Charles Corbière, Salim Cherkaoui, Alexandre Kontouli, Alexandre Alahi
TL;DR
Helvipad is a real-world omnidirectional stereo dataset for 360° depth estimation, with pixel-wise labels obtained from LiDAR projections and a denser training set produced by depth completion. The authors adapt state-of-the-art stereo models to spherical geometry by adding a polar angle input and circular padding, and introduce 360-IGEV-Stereo, which achieves the best performance among the benchmarked models on Helvipad. Experiments show improved depth accuracy, boundary consistency, and cross-scene generalization, underscoring the dataset's value for navigation in indoor and outdoor human environments. The work positions Helvipad as a testbed for developing and evaluating omnidirectional stereo methods and depth-perception pipelines.
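To make the two adaptations concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an equirectangular image whose rows span the polar angle from 0 to π and whose columns span the full 360° azimuth, so the left and right borders are physically adjacent. It also assumes the padding is applied along the horizontal axis. The function names `polar_angle_map` and `circular_pad_width` are hypothetical.

```python
import numpy as np


def polar_angle_map(height, width):
    """Return an (height, width) array holding the polar angle of each pixel row.

    Assumption: row centers are spread uniformly over [0, pi], top to bottom.
    Such a map could be concatenated to the image as an extra input channel.
    """
    theta = (np.arange(height) + 0.5) / height * np.pi
    return np.repeat(theta[:, None], width, axis=1)


def circular_pad_width(image, pad):
    """Pad the horizontal axis by wrapping, so the seam at 0/360 degrees is continuous.

    Works for (H, W) and (H, W, C) arrays; only axis 1 is padded.
    """
    pad_spec = ((0, 0), (pad, pad)) + ((0, 0),) * (image.ndim - 2)
    return np.pad(image, pad_spec, mode="wrap")


if __name__ == "__main__":
    img = np.arange(12).reshape(3, 4)
    print(circular_pad_width(img, 1))
    print(polar_angle_map(4, 2))
```

Wrapping rather than zero-padding means a convolution near the left border sees the pixels from the right border, which matches the geometry of a full panorama.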
Abstract
Despite progress in stereo depth estimation, omnidirectional imaging remains underexplored, mainly due to the lack of appropriate data. We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation, featuring 40K frames from video sequences across diverse environments, including crowded indoor and outdoor scenes with various lighting conditions. Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images. Additionally, we provide an augmented training set with increased label density obtained through depth completion. We benchmark leading stereo depth estimation models for both standard and omnidirectional images. The results show that while recent stereo methods perform decently, accurately estimating depth in omnidirectional imaging remains a challenge. To address this, we introduce necessary adaptations to stereo models, leading to improved performance.
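As an illustration of how 3D points can be projected onto an equirectangular image to form sparse depth labels, here is a minimal sketch under stated assumptions, not the dataset's actual pipeline. It assumes the points are already expressed in the camera frame (no LiDAR-to-camera calibration is applied), with x forward, y left, and z up. It takes depth as the Euclidean distance to the camera center and marks pixels without a point as 0. The function name `project_to_equirectangular` and the axis convention are hypothetical.

```python
import numpy as np


def project_to_equirectangular(points, height, width):
    """Project (N, 3) camera-frame points to a sparse (height, width) depth map.

    Assumed convention: x forward, y left, z up. Rows span the polar angle
    [0, pi] (top to bottom); columns span the azimuth [0, 2*pi).
    When several points fall in one pixel, the nearest one is kept.
    """
    points = np.asarray(points, dtype=float)
    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0  # drop points at the camera center
    points, depth = points[valid], depth[valid]

    azimuth = np.arctan2(points[:, 1], points[:, 0]) % (2 * np.pi)
    polar = np.arccos(np.clip(points[:, 2] / depth, -1.0, 1.0))

    cols = np.floor(azimuth / (2 * np.pi) * width).astype(int) % width
    rows = np.minimum(np.floor(polar / np.pi * height).astype(int), height - 1)

    depth_map = np.full((height, width), np.inf)
    np.minimum.at(depth_map, (rows, cols), depth)  # keep nearest point per pixel
    depth_map[np.isinf(depth_map)] = 0.0  # 0 marks "no label"
    return depth_map


if __name__ == "__main__":
    cloud = np.array([[2.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 0.0, 3.0]])
    print(project_to_equirectangular(cloud, height=5, width=8))
```

Because a LiDAR sweep yields far fewer points than image pixels, a map built this way is mostly empty, which is consistent with the motivation for the depth-completed training set described above.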
