Graph-based Topology Reasoning for Driving Scenes

Tianyu Li, Li Chen, Huijie Wang, Yang Li, Jiazhi Yang, Xiangwei Geng, Shengyin Jiang, Yuting Wang, Hang Xu, Chunjing Xu, Junchi Yan, Ping Luo, Hongyang Li

TL;DR

This work introduces TopoNet, an end-to-end framework that reasons about driving scene topology by jointly predicting lane connectivity and lane-to-traffic-element associations. It integrates a Scene Graph Neural Network for inter-entity message passing and a Scene Knowledge Graph to inject semantic priors, enabling explicit topology reasoning beyond traditional perception. Evaluations on OpenLane-V2 show substantial improvements in both centerline perception and topology reasoning, with ablations validating the benefits of the KG-enhanced information flow. The approach provides a scalable pathway toward richer topological understanding for downstream planning and motion prediction in autonomous driving, with code released for reproducibility.

Abstract

Understanding the road genome is essential to realizing autonomous driving. The problem has two aspects: the connectivity between lanes, and the assignment of traffic elements to lanes. A comprehensive method for reasoning about both is still missing. On one hand, previous map learning techniques struggle to derive lane connectivity from segmentation or laneline paradigms, while prior lane-topology approaches focus on centerline detection and neglect interaction modeling. On the other hand, the traffic-element-to-lane assignment problem has been studied only in the image domain, leaving the construction of correspondences between the two views an unexplored challenge. To address these issues, we present TopoNet, the first end-to-end framework capable of abstracting traffic knowledge beyond conventional perception tasks. To capture the driving scene topology, we introduce three key designs: (1) an embedding module that incorporates semantic knowledge from 2D elements into a unified feature space; (2) a curated scene graph neural network that models relationships and enables feature interaction inside the network; (3) a scene knowledge graph that, instead of transmitting messages arbitrarily, differentiates prior knowledge from various types of the road genome. We evaluate TopoNet on the challenging scene understanding benchmark OpenLane-V2, where our approach outperforms all previous works by a large margin on all perceptual and topological metrics. The code is released at https://github.com/OpenDriveLab/TopoNet
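
To make the third design more concrete, here is a minimal sketch of relation-typed message passing: each neighbour's feature is transformed by a weight matrix chosen according to its relation to the centerline, rather than by one shared matrix. This illustrates the general idea only and is not the paper's implementation. The relation names ("successor", "predecessor", "red_light"), the mean aggregation, the residual update, and the hand-set weights are all assumptions; in a trained network the matrices would be learned.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors of equal length."""
    return [ai + bi for ai, bi in zip(a, b)]


def update_centerline(
    lane_feat: Vector,
    neighbours: List[Tuple[str, Vector]],  # (relation name, neighbour feature)
    weights: Dict[str, Matrix],            # one matrix per relation type (hypothetical)
) -> Vector:
    """One message-passing step for a single centerline node.

    Each neighbour feature is transformed by the matrix of its relation
    type, the messages are averaged, and the mean is added to the node's
    own feature (a residual update).
    """
    if not neighbours:
        return list(lane_feat)
    total = [0.0] * len(lane_feat)
    for relation, feat in neighbours:
        total = add(total, matvec(weights[relation], feat))
    mean_msg = [t / len(neighbours) for t in total]
    return add(lane_feat, mean_msg)


# Hand-set weights for illustration only.
weights = {
    "successor": [[1.0, 0.0], [0.0, 1.0]],
    "predecessor": [[0.5, 0.0], [0.0, 0.5]],
    "red_light": [[0.0, 1.0], [1.0, 0.0]],
}
lane = [1.0, 0.0]
neighbours = [("successor", [2.0, 0.0]), ("red_light", [0.0, 4.0])]
updated = update_centerline(lane, neighbours, weights)
```

Because the matrix depends on the relation type, a successor lane and a red light with identical feature vectors would still contribute different messages to the same centerline.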

Paper Structure

This paper contains 26 sections, 18 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Topology relationship of driving scenes. While driving into an intersection, the self-driving vehicle has to reason about the correct lane and traffic information for downstream navigation. We advocate, and present TopoNet, to directly achieve topology understanding on the heterogeneous graph. "Topology LL" and "Topology LT" represent the relationship among lane centerlines and the relationship between lane centerlines and traffic elements respectively.
  • Figure 2: Systematic diagram of TopoNet. TopoNet addresses the crucial problem of topology reasoning for driving scenes in an end-to-end fashion. It consists of four stages, with the last three being compacted in a Transformer decoder architecture. TopoNet handles traffic elements and centerlines as two parallel branches at the Deformable decoder stage. Various types of instance queries (red, blue) then interact, exchange messages, acquire and aggregate prominent knowledge in the proposed Scene Graph Neural Network stage. The explicit relationship modeling inside the network serves as a favorable scheme for feature learning and topology prediction. We abbreviate traffic elements and lane centerlines as "TE" and "LC" in this paper, respectively.
  • Figure 3: Scene knowledge graph illustration. For the centerline colored blue in the left case, related weight matrices in the graph are categorically independent. Different traffic elements and lane-directed connections bring different information to the centerline, which is encoded as a scene knowledge graph on the right.
  • Figure 4: Qualitative results of TopoNet and other algorithms. While driving in complex scenarios, TopoNet achieves superior lane graph prediction performance compared to other SOTA methods. It also successfully builds all connections between traffic elements and lanes (top right, and correspondingly colored lines in BEV). Colors denote categories of traffic elements.
  • Figure 5: Failure case under large-area occlusion. TopoNet fails to predict centerlines and the lane graph in the intersection when a large bus occludes the view in front. Note that the annotated relationship between the left lane and the red light is incorrect; our algorithm reasons about the direction of the left lane and avoids the false positive prediction.
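
The two relationships named in Figure 1 can be pictured as a pair of binary matrices: "Topology LL" as a directed lane-to-lane adjacency matrix and "Topology LT" as a lane-to-traffic-element assignment matrix. The sketch below assumes that representation; the function names and the toy scene are hypothetical and are not taken from the TopoNet code.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def build_topology(
    num_lanes: int,
    num_elements: int,
    lane_edges: List[Tuple[int, int]],          # (from_lane, to_lane), directed
    lane_element_pairs: List[Tuple[int, int]],  # (lane, traffic_element)
) -> Tuple[Matrix, Matrix]:
    """Build the LL (lanes x lanes) and LT (lanes x elements) matrices."""
    ll = [[0] * num_lanes for _ in range(num_lanes)]
    lt = [[0] * num_elements for _ in range(num_lanes)]
    for src, dst in lane_edges:
        ll[src][dst] = 1  # lane `src` leads into lane `dst`
    for lane, element in lane_element_pairs:
        lt[lane][element] = 1  # traffic element applies to this lane
    return ll, lt


def successors(ll: Matrix, lane: int) -> List[int]:
    """Indices of lanes directly reachable from `lane`."""
    return [j for j, connected in enumerate(ll[lane]) if connected]


def elements_for_lane(lt: Matrix, lane: int) -> List[int]:
    """Indices of traffic elements assigned to `lane`."""
    return [k for k, assigned in enumerate(lt[lane]) if assigned]


# Toy scene: lane 0 splits into lanes 1 and 2 at an intersection;
# element 0 governs lane 0 and element 1 governs lane 1.
ll, lt = build_topology(
    num_lanes=3,
    num_elements=2,
    lane_edges=[(0, 1), (0, 2)],
    lane_element_pairs=[(0, 0), (1, 1)],
)
```

Here `successors(ll, 0)` gives the lanes a vehicle on lane 0 may continue into, and `elements_for_lane(lt, 0)` gives the traffic elements it must obey there.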