FastRSR: Efficient and Accurate Road Surface Reconstruction from Bird's Eye View
Yuting Zhao, Yuheng Ji, Xiaoshuai Hao, Shuxiao Li
TL;DR
The paper tackles accurate and efficient road surface reconstruction (RSR) from Bird's Eye View (BEV) with two BEV-based models, FastRSR-mono and FastRSR-stereo. It introduces Depth-Aware Projection (DAP), a fast, depth-guided 3D-to-2D projection that uses a pre-computed look-up table to mitigate information loss in view transformation. For monocular input, DAP is coupled with Shuttle-shape Discretization (SD), a nonuniform elevation discretization strategy, to generate dense, elevation-aware BEV features. For stereo input, two attention-based refinements, Spatial Attention Enhancement (SAE) and Confidence Attention Generation (CAG), boost accuracy while preserving speed. The framework is end-to-end trainable with LiDAR-based supervision. On the RSRD dataset, FastRSR-mono surpasses monocular baselines by over 6.0% in elevation absolute error, and FastRSR-stereo achieves at least a 3× speedup over existing stereo methods while attaining the lowest elevation error among BEV stereo models. These results establish a strong baseline for BEV-based road surface analysis and support real-time autonomous driving, where reliable road surface assessment in dynamic environments benefits safety and comfort.
Abstract
Road Surface Reconstruction (RSR) is crucial for autonomous driving, enabling the understanding of road surface conditions. Recently, RSR from the Bird's Eye View (BEV) has gained attention for its potential to enhance performance. However, existing methods for transforming perspective views to BEV face challenges such as information loss and representation sparsity. Moreover, stereo matching in BEV is limited by the need to balance accuracy with inference speed. To address these challenges, we propose two efficient and accurate BEV-based RSR models: FastRSR-mono and FastRSR-stereo. Specifically, we first introduce Depth-Aware Projection (DAP), an efficient view transformation strategy designed to mitigate information loss and sparsity by querying depth and image features to aggregate BEV data within specific road surface regions using a pre-computed look-up table. To optimize accuracy and speed in stereo matching, we design the Spatial Attention Enhancement (SAE) and Confidence Attention Generation (CAG) modules. SAE adaptively highlights important regions, while CAG focuses on high-confidence predictions and filters out irrelevant information. FastRSR achieves state-of-the-art performance, exceeding monocular competitors by over 6.0% in elevation absolute error and providing at least a 3.0x speedup over stereo methods on the RSRD dataset. The source code will be released.
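The look-up-table idea behind DAP can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `build_lut` and `project`, the pinhole camera convention (x right, y down, z forward), nearest-pixel sampling, and the hard assignment of each voxel to a single depth bin are all assumptions made for illustration. The sketch precomputes, once per camera, which pixel and depth bin every road-surface voxel maps to, then fills the voxel grid at run time by querying image features and a per-pixel depth distribution.

```python
import numpy as np

def build_lut(xs, ys, zs, K, img_hw, depth_edges):
    """Precompute pixel and depth-bin indices for every voxel center.

    xs, ys, zs: 1-D voxel-center coordinates in the camera frame
                (assumed: x right, y down, z forward, all zs > 0).
    K: 3x3 pinhole intrinsics. depth_edges: D+1 ascending bin edges.
    """
    H, W = img_hw
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    pts = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    uvw = pts @ K.T                                  # pinhole projection
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)  # nearest pixel column
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)  # nearest pixel row
    d = np.searchsorted(depth_edges, pts[:, 2], side="right") - 1
    n_bins = len(depth_edges) - 1
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (d >= 0) & (d < n_bins)
    return u, v, d, valid

def project(feat, depth_prob, lut, grid_shape):
    """Fill the voxel grid by querying image features and depth.

    feat: (C, H, W) image features. depth_prob: (D, H, W) per-pixel
    depth distribution. Returns an array of shape grid_shape + (C,).
    """
    u, v, d, valid = lut
    out = np.zeros((valid.size, feat.shape[0]), dtype=feat.dtype)
    # Weight each sampled feature by the probability that the pixel's
    # depth falls in the bin containing the voxel.
    w = depth_prob[d[valid], v[valid], u[valid]]
    out[valid] = feat[:, v[valid], u[valid]].T * w[:, None]
    return out.reshape(*grid_shape, feat.shape[0])
```

Because the table depends only on the camera intrinsics and the voxel grid, it would be built once and reused for every frame; only the gather in `project` runs per image. Restricting `xs`, `ys`, and `zs` to a band around the expected road surface is one way to read "specific road surface regions" in the abstract, though the paper's exact region definition is not given here.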
