Rethinking Lanes and Points in Complex Scenarios for Monocular 3D Lane Detection
Yifan Chang, Junjie Huang, Xiaofeng Wang, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du, Xingang Wang
TL;DR
This work identifies fundamental flaws in sparse-point monocular 3D lane detection, showing that endpoint truncation in training ground truth can introduce substantial errors. It introduces an endpoint patching strategy and an EndPoint head (EP-head) to predict patching distances, enabling more complete lane representations with fewer preset points. To further leverage lane geometry, the authors propose PointLane attention (PL-attention), which integrates within-lane, cross-lane, and same-y interactions as priors in the attention mechanism. Across multiple state-of-the-art baselines on OpenLane, EP-head and PL-attention yield consistent improvements in F1-score (e.g., +4.4 on Persformer, +3.2 on Anchor3DLane, +2.8 on LATR), demonstrating enhanced robustness in complex scenarios and potential applicability to 2D lane detection and HD map construction.
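One way to picture how within-lane and same-y relations could act as attention priors is as index masks over point queries. The sketch below is illustrative only and is not the authors' PL-attention: the function names `relation_masks` and `masked_softmax` are hypothetical, the query layout (lane-major, one query per preset point) is an assumption, and the cross-lane relation is left out because the text above does not define it.

```python
import math

def relation_masks(num_lanes, num_points):
    """Build boolean masks over point queries laid out lane by lane.

    Assumption: query i belongs to lane i // num_points and to preset
    point i % num_points, so equal point indices share the same y.
    """
    ids = [(lane, pt) for lane in range(num_lanes) for pt in range(num_points)]
    # True where two queries lie on the same lane.
    within_lane = [[a[0] == b[0] for b in ids] for a in ids]
    # True where two queries sit at the same preset y position.
    same_y = [[a[1] == b[1] for b in ids] for a in ids]
    return within_lane, same_y

def masked_softmax(scores, mask):
    """Softmax over one row of scores, ignoring positions where mask is False.

    Requires at least one True entry in mask.
    """
    top = max(s for s, m in zip(scores, mask) if m)
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

Applying `masked_softmax` row by row with `within_lane` restricts each point query to points on its own lane, while `same_y` restricts it to points at the same longitudinal position on other lanes.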
Abstract
Monocular 3D lane detection is a fundamental task in autonomous driving. Although sparse-point methods lower computational load and maintain high accuracy in complex lane geometries, current methods fail to fully leverage the geometric structure of lanes in both lane geometry representations and model design. In lane geometry representations, we show through theoretical analysis and experimental validation that current sparse lane representation methods contain inherent flaws, resulting in potential errors of up to 20 m, which raise significant safety concerns for driving. To address this issue, we propose a novel patching strategy that represents the full lane structure. To enable existing models to match this strategy, we introduce the EndPoint head (EP-head), which adds a patching distance to endpoints. The EP-head enables the model to predict more complete lane representations even with fewer preset points, addressing existing limitations and paving the way for faster models with fewer parameters. In model design, to enhance the model's perception of lane structures, we propose PointLane attention (PL-attention), which incorporates prior geometric knowledge into the attention mechanism. Extensive experiments demonstrate the effectiveness of the proposed methods on various state-of-the-art models. For instance, in terms of the overall F1-score, our methods improve Persformer by 4.4 points, Anchor3DLane by 3.2 points, and LATR by 2.8 points. The code will be available soon.
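The truncation issue can be illustrated with a small sketch. It assumes that a sparse representation keeps only the points at a fixed set of preset longitudinal positions, so any part of a lane that extends past the nearest preset position is dropped. The preset values in `PRESET_Y`, the function names, and the choice to measure the patching distance along y only are assumptions made for illustration; they are not taken from the paper.

```python
from bisect import bisect_left, bisect_right

# Assumed preset longitudinal positions in metres (illustrative values only).
PRESET_Y = [5, 10, 15, 20, 30, 40, 50, 60, 80, 100]

def truncate_to_presets(y_start, y_end, preset_y=PRESET_Y):
    """Return the preset y positions that fall inside [y_start, y_end]."""
    lo = bisect_left(preset_y, y_start)
    hi = bisect_right(preset_y, y_end)
    return preset_y[lo:hi]

def patching_distances(y_start, y_end, preset_y=PRESET_Y):
    """Return (near, far) gaps between the true endpoints and the kept presets.

    These gaps are what truncation discards; in this sketch they play the
    role of the patching distances an endpoint head would have to predict.
    Returns None when no preset position lies on the lane.
    """
    kept = truncate_to_presets(y_start, y_end, preset_y)
    if not kept:
        return None
    return (kept[0] - y_start, y_end - kept[-1])
```

For a lane running from y = 7 m to y = 79 m, the kept presets span 10 m to 60 m, so `patching_distances(7.0, 79.0)` returns `(3.0, 19.0)`: almost 20 m of the far end is lost under these assumed presets. Coarser preset spacing makes the loss larger, which is why recovering the endpoints matters more as the number of preset points shrinks.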
