Learning Lane Graphs from Aerial Imagery Using Transformers

Martin Büchner, Simon Dorer, Abhinav Valada

TL;DR

This work addresses the challenge of generating high-fidelity lane topologies for autonomous navigation from aerial imagery by predicting successor lane graphs. It introduces the Aerial Lane Graph Transformer (ALGT), a DETR-based framework that predicts a set of path proposals, each representing a maximal-length traversal of the lane graph, and aggregates them into a final directed graph. Ablation studies show that polyline path representations outperform Bézier parametrizations, that PSPNet-based backbones yield better features, and that increased encoder/decoder capacity improves path predictions. Compared to the LaneGNN baseline, ALGT achieves superior path-level accuracy (APLS) while remaining competitive on topology metrics. The approach reduces node-position errors common in onboard-sensor methods and holds promise for robust planning in complex urban environments.
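The aggregation step is only named here, not specified, so the following is a minimal sketch of one plausible way to turn scored path proposals into a directed graph. The function name `aggregate_paths`, the score threshold, and the grid-snapping merge rule (`cell_size`) are all assumptions made for illustration and are not taken from the paper.

```python
def aggregate_paths(paths, scores, score_threshold=0.5, cell_size=1.0):
    """Merge confident polylines into a set of directed edges.

    Hypothetical merge rule: points are snapped to a grid of size `cell_size`,
    so nearby points from different proposals collapse into one graph node.
    """
    def snap(point):
        return (round(point[0] / cell_size), round(point[1] / cell_size))

    edges = set()
    for path, score in zip(paths, scores):
        if score < score_threshold:
            continue  # drop low-confidence proposals
        nodes = [snap(p) for p in path]
        for u, v in zip(nodes, nodes[1:]):
            if u != v:  # skip consecutive points that fall into the same cell
                edges.add((u, v))
    return edges


# Two confident proposals share a stem and then split; the third is discarded.
proposals = [
    [(0, 0), (1, 0), (2, 0)],
    [(0.1, 0), (1.1, 0.1), (2, 1)],
    [(5, 5), (6, 6)],
]
graph_edges = aggregate_paths(proposals, scores=[0.9, 0.8, 0.2])
```

In this toy input the shared stem becomes a single edge and the divergence shows up as a split node with two outgoing edges, which is the behavior a successor lane graph needs at intersections.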

Abstract

The robust and safe operation of automated vehicles underscores the critical need for detailed and accurate topological maps. At the heart of this requirement is the construction of lane graphs, which provide essential information on lane connectivity, vital for navigating complex urban environments autonomously. While transformer-based models have been effective in creating map topologies from vehicle-mounted sensor data, their potential for generating such graphs from aerial imagery remains untapped. This work introduces a novel approach to generating successor lane graphs from aerial imagery, utilizing the advanced capabilities of transformer models. We frame successor lane graphs as a collection of maximal length paths and predict them using a Detection Transformer (DETR) architecture. We demonstrate the efficacy of our method through extensive experiments on the diverse and large-scale UrbanLaneGraph dataset, illustrating its accuracy in generating successor lane graphs and highlighting its potential for enhancing autonomous vehicle navigation in complex environments.
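
To make the framing of a successor lane graph as a collection of maximal-length paths concrete, here is a small sketch that enumerates every path from a node without predecessors to a node without successors. It assumes the graph is acyclic and given as a list of directed edges; the function name `maximal_paths` and the toy graph are hypothetical, not the authors' implementation.

```python
from collections import defaultdict


def maximal_paths(edges):
    """Enumerate all root-to-leaf paths of a directed acyclic graph."""
    successors = defaultdict(list)
    has_predecessor = set()
    nodes = set()
    for u, v in edges:
        successors[u].append(v)
        has_predecessor.add(v)
        nodes.update((u, v))

    # Roots are nodes that no edge points to (e.g. where the ego lane starts).
    roots = sorted(n for n in nodes if n not in has_predecessor)

    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        next_nodes = successors.get(node, [])
        if not next_nodes:  # leaf reached: the path cannot be extended
            paths.append(prefix)
            return
        for nxt in next_nodes:
            walk(nxt, prefix)

    for root in roots:
        walk(root, [])
    return paths


# A lane that splits at node "b" yields one path per branch.
toy_edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
toy_paths = maximal_paths(toy_edges)
```

Each split in the graph multiplies the number of paths, so a set-prediction model such as DETR, which outputs a fixed number of proposals, is a natural fit for this representation.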

Paper Structure (13 sections, 2 equations, 5 figures, 2 tables)

This paper contains 13 sections, 2 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: We present aerial lane graph transformers (ALGT) for learning feasible high-fidelity traversals of successor lane graphs. The left image shows raw predicted traversals while their opacity represents the predicted probability score. The right image shows the thresholded and aggregated traversals forming a successor lane graph.
  • Figure 2: Visualization of decomposed successor lane graphs of the UrbanLaneGraph dataset [lane-gnn]. We use Bézier curves of degree 10 and polylines built from 20 points sampled on each curve. All paths, along with their Bézier control points, are depicted in varying colors. The context part of the samples is darkened.
  • Figure 3: Overview of our ALGT model for successor lane graph prediction. An image backbone extracts relevant features from the context image that are passed through a transformer-based path predictor to produce successor lane graph proposals.
  • Figure 4: Qualitative results obtained by our ALGT model in comparison to the ground truth. The top row shows the ground truth, while the bottom row shows our model's predictions. Our proposed architecture predicts highly accurate lane graphs that, unlike those of LaneGNN, do not suffer from sampled node positions, and it shows high split detection accuracy. In general, the obtained graphs are smooth.
  • Figure 5: Failure cases of the ALGT model compared to the ground truth (GT).
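
The two path representations mentioned in the Figure 2 caption (a degree-10 Bézier curve and a 20-point polyline sampled from it) can be related with a short sketch. It evaluates the curve with the standard Bernstein basis and samples uniformly in the curve parameter, which is an assumption; the paper may sample differently, for example by arc length. The function names are hypothetical.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a 2D Bézier curve at parameter t in [0, 1] via the Bernstein basis."""
    degree = len(control_points) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(degree, i) * t**i * (1 - t) ** (degree - i)
        x += weight * px
        y += weight * py
    return (x, y)


def bezier_to_polyline(control_points, num_samples=20):
    """Sample the curve at evenly spaced parameter values (not arc length)."""
    return [
        bezier_point(control_points, s / (num_samples - 1))
        for s in range(num_samples)
    ]


# A degree-10 curve needs 11 control points; here they lie on a straight line.
control = [(float(i), 0.0) for i in range(11)]
polyline = bezier_to_polyline(control)
```

The Bézier form needs 11 control points (22 numbers) per path and the polyline 20 points (40 numbers), so the two regression targets differ in size as well as in how directly their outputs correspond to positions in the image.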