GLane3D : Detecting Lanes with Graph of 3D Keypoints

Halil İbrahim Öztürk, Muhammet Esat Kalfaoğlu, Ozsel Kilinc

TL;DR

GLane3D tackles 3D lane detection with a camera-only system by modeling lanes as a directed graph of 3D keypoints and learning robust keypoint connections. It introduces multiple proposals per keypoint, a PointNMS stage to balance recall and efficiency, and a directed adjacency learning mechanism that enables lane extraction from BEV graphs, aided by a PV-to-BEV projection with a customized BEV geometry. The method balances recall and speed through a total proposal budget $N = S \times n$ and an efficient matching/loss scheme, showing state-of-the-art F1 scores on OpenLane and Apollo with strong cross-dataset generalization, including camera-only and camera+LiDAR fusion variants. The IPM-based BEV sampling densifies near-ego regions, improving localization, while cross-dataset tests demonstrate GLane3D's robust generalization beyond the training domain. Overall, GLane3D offers a practical, scalable solution for real-time 3D lane detection with strong generalization capabilities across diverse driving scenarios.

Abstract

Accurate and efficient lane detection in 3D space is essential for autonomous driving systems, where robust generalization is the foremost requirement for 3D lane detection algorithms. Given the extensive variation in lane structures worldwide, achieving high generalization capacity is particularly challenging, as algorithms must accurately identify a wide variety of lane patterns. Traditional top-down approaches rely heavily on learning lane characteristics from training datasets and often struggle with lanes exhibiting previously unseen attributes. To address this generalization limitation, we propose a method that detects keypoints of lanes and subsequently predicts sequential connections between them to construct complete 3D lanes. Each keypoint is essential for maintaining lane continuity, and we predict multiple proposals per keypoint by allowing adjacent grids to predict the same keypoint using an offset mechanism. PointNMS is employed to eliminate overlapping proposal keypoints, reducing redundancy in the estimated BEV graph and minimizing computational overhead from connection estimations. Our model surpasses previous state-of-the-art methods on both the Apollo and OpenLane datasets, achieving higher F1 scores and generalizing better than prior approaches when models trained on OpenLane are evaluated on the Apollo dataset.
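
The proposal-then-suppression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `decode_proposals` and `point_nms`, the purely lateral offset, the cell size, the greedy distance-based suppression rule, and the 0.5 m radius are all assumptions chosen for illustration. Several adjacent grid cells each propose the same keypoint through a regressed offset, and a greedy pass keeps only the strongest proposal in each neighbourhood.

```python
import math


def decode_proposals(cells, offsets, cell_size=1.0):
    """Turn grid cells plus regressed lateral offsets into BEV points.

    cells: list of (col, row) grid indices (hypothetical layout).
    offsets: lateral offset in meters predicted by each cell.
    """
    return [
        ((col + 0.5) * cell_size + dx, (row + 0.5) * cell_size)
        for (col, row), dx in zip(cells, offsets)
    ]


def point_nms(points, scores, radius=0.5):
    """Greedy distance-based suppression (an assumed form of PointNMS).

    Returns the indices of kept proposals, strongest first.
    """
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        xi, yi = points[i]
        # Keep a proposal only if every stronger kept proposal is farther than radius.
        if all(math.hypot(xi - points[j][0], yi - points[j][1]) > radius for j in kept):
            kept.append(i)
    return kept


# Three adjacent cells propose one keypoint near x = 1.4 m,
# two more cells propose a keypoint on another lane near x = 6.7 m.
cells = [(0, 4), (1, 4), (2, 4), (6, 4), (7, 4)]
offsets = [0.9, -0.1, -1.05, 0.2, -0.7]
scores = [0.6, 0.9, 0.5, 0.8, 0.7]

proposals = decode_proposals(cells, offsets)
kept = point_nms(proposals, scores, radius=0.5)
print(kept)  # indices of the surviving proposals
```

In this toy input, five proposals collapse to two keypoints, one per lane, so the later connection stage has fewer candidate pairs to score.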

Paper Structure

This paper contains 29 sections, 12 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: The overall architecture. GLane3D is a keypoint-based 3D lane detection method. A directed graph of 3D keypoints is used to generate lane instances. In part (a), multiple proposals improve the recall of keypoint detection, while in part (b), PointNMS selects the strongest proposals to reduce ambiguity in the directed graph. The estimated adjacency matrix, shown in part (c), enables extraction of directed connections.
  • Figure 2: Keypoints over BEV space. Multiple keypoint proposals $K_P$ for a target point at lane (a), regressed offsets $\Delta{x}$ for keypoint proposals (b), strongest proposals $K_S$ after PointNMS (c), predicted directed connections $C$ between strongest keypoints (d). Colors are for visualization and do not indicate class labels.
  • Figure 3: Uniformly distributed points projected to the front view (a) and represented in the bird's-eye view (b). Customized points projected to the front view (c) and represented in the bird's-eye view (d).
  • Figure 4: Qualitative evaluation on OpenLane val set. The rows illustrate (a) ground truth 3D lanes, prediction from (b) LATR [luo2023latr] and (c) GLane3D with 2D projection, respectively. Here, different colors indicate specific categories. Row (d) demonstrates the ground truth (red) and prediction of GLane3D (green) in 3D space. Best viewed in color (zoom in for details).
  • Figure 5: Qualitative results of cross-dataset evaluation on the Apollo validation set of Balanced Scenes. The rows illustrate prediction from (a) PersFormer [chen2022persformer], (b) LATR [luo2023latr] and (c) GLane3D with 2D projection, respectively. Best viewed in color (zoom in for details).
  • ...and 4 more figures
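
The captions of Figures 1 and 2 describe an estimated adjacency matrix from which directed connections between the strongest keypoints are extracted. One way this step could work is sketched below. It is an illustrative assumption, not the paper's extraction procedure: the function name `extract_lanes`, the 0.5 threshold, and the rule of keeping only the single most confident outgoing edge per keypoint are all hypothetical. Entry `adjacency[i][j]` is taken to be the predicted probability that keypoint `j` directly follows keypoint `i`.

```python
def extract_lanes(adjacency, threshold=0.5):
    """Follow directed edges from start keypoints to build lane index chains.

    adjacency: square list of lists of edge probabilities (assumed format).
    Returns a list of lanes, each a list of keypoint indices in order.
    """
    n = len(adjacency)

    # Keep, for each keypoint, only its most confident outgoing edge.
    successor = {}
    for i in range(n):
        j = max(range(n), key=lambda k: adjacency[i][k])
        if j != i and adjacency[i][j] >= threshold:
            successor[i] = j

    has_predecessor = set(successor.values())
    lanes = []
    for start in range(n):
        # A lane starts at a keypoint with an outgoing edge and no incoming edge.
        if start in has_predecessor or start not in successor:
            continue
        lane, visited, node = [start], {start}, start
        # The visited set guards against cycles in a noisy prediction.
        while node in successor and successor[node] not in visited:
            node = successor[node]
            lane.append(node)
            visited.add(node)
        lanes.append(lane)
    return lanes


# Five keypoints forming two lanes: 0 -> 1 -> 2 and 3 -> 4.
adjacency = [
    [0.0, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.2, 0.0, 0.0],
]
lanes = extract_lanes(adjacency)
print(lanes)
```

This sketch ignores isolated keypoints and pure cycles, and when two chains point at the same successor each resulting lane includes the shared tail. A real system would need an explicit policy for merges and splits.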