Online Temporal Fusion for Vectorized Map Construction in Mapless Autonomous Driving
Jiagang Chen, Liangliang Pan, Shunping Ji, Ji Zhao, Zichao Zhang
TL;DR
This work tackles the challenge of mapless autonomous driving by building temporally consistent vectorized maps online from onboard detections. It introduces a semantic voxel hashing framework that incrementally fuses road-marking detections into a sparse 3D voxel map, extracts reliable voxels, and clusters them into polyline road markings, which are then converted into lane boundaries and lane linkages using domain knowledge. The approach yields lane-level, vectorized road layouts suitable for planning and control (PnC), and shows greater stability and geometric accuracy than single-frame methods across urban scenarios. It is validated on in-house and Argoverse2 datasets and runs in real time on embedded hardware. The results suggest a practical path toward reducing HD-map dependence in mapless autonomous driving while supporting robust PnC integration; future work includes reducing reliance on SD maps and incorporating uncertainty into fusion.
Abstract
To reduce the reliance on high-definition (HD) maps, a growing trend in autonomous driving is to leverage onboard sensors to generate vectorized maps online. However, current methods mostly process only single-frame inputs, which limits their robustness and effectiveness in complex scenarios. To overcome this problem, we propose an online map construction system that exploits long-term temporal information to build a consistent vectorized map. First, the system efficiently fuses all historical road marking detections from an off-the-shelf network into a semantic voxel map, which is implemented using a hashing-based strategy to exploit the sparsity of road elements. Then, reliable voxels are identified by examining the fused information and are incrementally clustered into an instance-level representation of road markings. Finally, the system incorporates domain knowledge to estimate the geometric and topological structures of roads, which can be directly consumed by the planning and control (PnC) module. Experiments conducted in complex urban environments demonstrate that the output of our system is more consistent and accurate than the network output by a large margin and can be effectively used in a closed-loop autonomous driving system.
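
The fuse, filter, and cluster pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes detections arrive as labeled 3D points already expressed in a common world frame, substitutes simple per-voxel vote counting for the fusion rule, and uses batch connected-component grouping in place of the incremental clustering the paper describes. The names `SemanticVoxelMap`, `voxel_key`, and `cluster`, the 0.2 m voxel size, and the `min_hits` and `min_ratio` thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict, deque
from itertools import product

VOXEL_SIZE = 0.2  # metres; assumed value, not taken from the paper


def voxel_key(point, voxel_size=VOXEL_SIZE):
    """Quantize a 3D point to the integer voxel index used as the hash key."""
    x, y, z = point
    return (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))


class SemanticVoxelMap:
    """Sparse voxel map: only voxels that received a detection are stored."""

    def __init__(self, voxel_size=VOXEL_SIZE):
        self.voxel_size = voxel_size
        self.votes = defaultdict(Counter)  # voxel key -> per-label vote counts

    def fuse(self, points, labels):
        """Accumulate one frame of labeled road-marking points (assumed fusion rule)."""
        for point, label in zip(points, labels):
            self.votes[voxel_key(point, self.voxel_size)][label] += 1

    def reliable_voxels(self, min_hits=3, min_ratio=0.6):
        """Keep voxels whose dominant label was seen often and consistently enough."""
        reliable = {}
        for key, counter in self.votes.items():
            label, hits = counter.most_common(1)[0]
            if hits >= min_hits and hits / sum(counter.values()) >= min_ratio:
                reliable[key] = label
        return reliable


def cluster(reliable):
    """Group adjacent same-label voxels (26-neighbourhood) with breadth-first search."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    seen, clusters = set(), []
    for start in reliable:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            key = queue.popleft()
            members.append(key)
            for dx, dy, dz in offsets:
                neighbour = (key[0] + dx, key[1] + dy, key[2] + dz)
                if neighbour not in seen and reliable.get(neighbour) == reliable[start]:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append((reliable[start], sorted(members)))
    return clusters


# Usage: three frames observe the same lane marking; one spurious detection appears once.
voxel_map = SemanticVoxelMap()
for _ in range(3):
    voxel_map.fuse([(0.1, 0.1, 0.0), (0.3, 0.1, 0.0)], ["lane", "lane"])
voxel_map.fuse([(5.1, 5.1, 0.0)], ["stopline"])
instances = cluster(voxel_map.reliable_voxels())
```

In this sketch the single-frame "stopline" detection never reaches `min_hits` and is dropped, while the two repeatedly observed lane voxels are merged into one instance. This mirrors, in a simplified way, how temporal fusion can suppress per-frame noise; turning each cluster into a polyline and deriving lane boundaries and linkages would require the domain-specific steps of the full system.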
