RoadBEV: Road Surface Reconstruction in Bird's Eye View
Tong Zhao, Lei Yang, Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Yintao Wei
TL;DR
This paper addresses road surface elevation reconstruction for autonomous driving by proposing BEV-based methods that estimate vertical road profiles directly in Bird's Eye View. It introduces two models, RoadBEV-mono and RoadBEV-stereo, which query image features through a voxelized BEV representation and treat elevation estimation as classification over elevation bins, with a soft-argmin converting the bin scores into continuous elevations. On the RSRD dataset, RoadBEV-mono achieves an absolute elevation error of about $1.83\,\text{cm}$ and RoadBEV-stereo about $0.50\,\text{cm}$; the stereo model is substantially more accurate but costs more computation. The approach relies on a voxel-centric BEV volume and correlation-based cost volumes to suppress perspective distortion and tightly constrain the elevation estimate. The results indicate practical viability for road preview in autonomous systems and open avenues for sequence-based reconstruction and joint texture-geometry reconstruction.
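The bin-based classification with soft-argmin described above can be illustrated with a minimal NumPy sketch. It assumes that each BEV cell carries one score per elevation bin and that the continuous elevation is the softmax-weighted mean of the bin centers. The function name, the bin range, and the sign convention (higher score means more likely) are illustrative assumptions, not details taken from the paper or its released code.

```python
import numpy as np

def soft_argmin_elevation(logits, bin_centers):
    """Turn per-bin scores into a continuous elevation per BEV cell.

    logits:      array of shape (..., K), one score per elevation bin
    bin_centers: array of shape (K,), elevation of each bin center (cm)
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Subtract the per-cell maximum for a numerically stable softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Expected elevation under the predicted bin distribution.
    return (probs * bin_centers).sum(axis=-1)

# Hypothetical discretization: 5 bins centered at -4, -2, 0, 2, 4 cm.
bin_centers = np.linspace(-4.0, 4.0, 5)
logits = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],   # no preference: estimate sits mid-range
    [0.0, 0.0, 10.0, 0.0, 0.0],  # confident in the 0 cm bin
    [0.0, 0.0, 0.0, 5.0, 5.0],   # split between the 2 cm and 4 cm bins
])
elev = soft_argmin_elevation(logits, bin_centers)
print(elev)
```

Because the output is an expectation rather than a hard argmax, a cell whose probability mass is split between two neighboring bins yields an elevation between their centers, which is how a coarse discretization can still produce sub-bin estimates.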
Abstract
Road surface conditions, especially geometry profiles, strongly affect the driving performance of autonomous vehicles. Vision-based online road reconstruction is a promising way to capture road information in advance. Existing solutions such as monocular depth estimation and stereo matching deliver only modest performance. The recent technique of Bird's-Eye-View (BEV) perception offers great potential for more reliable and accurate reconstruction. This paper proposes two simple yet effective models for road elevation reconstruction in BEV, named RoadBEV-mono and RoadBEV-stereo, which estimate road elevation from monocular and stereo images, respectively. The former directly fits elevation values based on voxel features queried from the image view, while the latter efficiently recognizes road elevation patterns based on a BEV volume representing the correlation between left and right voxel features. Further analyses reveal where the models are consistent with the perspective view and where they differ from it. Experiments on a real-world dataset verify the models' effectiveness and superiority. RoadBEV-mono and RoadBEV-stereo achieve elevation errors of 1.83 cm and 0.50 cm, respectively. Our models are promising for practical road preview, providing essential information for promoting the safety and comfort of autonomous vehicles. The code is released at https://github.com/ztsrxh/RoadBEV
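To make the idea of a BEV volume built from left-right voxel correlation concrete, the following is a minimal sketch under stated assumptions. It supposes that features from the left and right images have already been sampled at the same 3D voxel positions and measures their agreement with a per-voxel cosine similarity over the channel axis. The tensor layout `(C, Z, Y, X)`, the function name, and the choice of cosine similarity are illustrative assumptions; the released RoadBEV code may build its volume differently.

```python
import numpy as np

def correlation_volume(left_feat, right_feat, eps=1e-8):
    """Per-voxel cosine similarity between left and right voxel features.

    left_feat, right_feat: arrays of shape (C, Z, Y, X), where Z indexes
    candidate elevations and (Y, X) index the BEV grid (assumed layout).
    Returns an array of shape (Z, Y, X).
    """
    left_unit = left_feat / (np.linalg.norm(left_feat, axis=0, keepdims=True) + eps)
    right_unit = right_feat / (np.linalg.norm(right_feat, axis=0, keepdims=True) + eps)
    # Dot product over channels: high where the two views agree.
    return (left_unit * right_unit).sum(axis=0)

# Synthetic check: the two views agree only at one elevation bin.
rng = np.random.default_rng(0)
C, Z, Y, X = 8, 4, 6, 5
left = rng.normal(size=(C, Z, Y, X))
right = rng.normal(size=(C, Z, Y, X))
true_bin = 2
right[:, true_bin] = left[:, true_bin]  # matching features at the true elevation

volume = correlation_volume(left, right)
best_bin = volume.argmax(axis=0)  # most consistent elevation bin per BEV cell
print(best_bin)
```

The intuition is that a voxel lying on the actual road surface projects to the same physical point in both images, so its left and right features agree, while voxels above or below the surface project to different points and correlate poorly. The elevation axis of the volume therefore carries the signal that the classification stage can read out.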
