TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li, Yike Ma, Yucheng Zhang, Feng Dai

TL;DR

Topology reasoning for autonomous driving often suffers from endpoint deviation due to misalignment between lane endpoints and lane queries. TopoPoint addresses this by explicitly detecting endpoints and jointly reasoning over endpoints, lanes, and traffic cues using Point-Lane Merge Self-Attention and a Point-Lane Graph Convolutional Network, with a Geometry Matching refinement during inference. The framework yields state-of-the-art OpenLane-V2 performance (OLS) and substantially improves endpoint detection (DET$_p$), demonstrating that explicit endpoint modeling and geometry-aware interaction enhance both perception and topology reasoning. This approach can improve downstream tasks such as HD-map learning and planning by producing more consistent lane topology and more reliable endpoint localization.

Abstract

Topology reasoning, which unifies perception and structured reasoning, plays a vital role in understanding intersections for autonomous driving. However, its performance heavily relies on the accuracy of lane detection, particularly at connected lane endpoints. Existing methods often suffer from lane endpoint deviation, leading to incorrect topology construction. To address this issue, we propose TopoPoint, a novel framework that explicitly detects lane endpoints and jointly reasons over endpoints and lanes for robust topology reasoning. During training, we independently initialize point and lane queries and propose Point-Lane Merge Self-Attention, which enhances global context sharing by incorporating geometric distances between points and lanes as an attention mask. We further design a Point-Lane Graph Convolutional Network to enable mutual feature aggregation between point and lane queries. During inference, we introduce the Point-Lane Geometry Matching algorithm, which computes distances between detected points and lanes to refine lane endpoints, effectively mitigating endpoint deviation. Extensive experiments on the OpenLane-V2 benchmark demonstrate that TopoPoint achieves state-of-the-art performance in topology reasoning (48.8 on OLS). Additionally, we propose DET$_p$ to evaluate endpoint detection, under which our method significantly outperforms existing approaches (52.6 vs. 45.2 on DET$_p$). The code is released at https://github.com/Franpin/TopoPoint.
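The inference-time idea in the abstract, matching detected endpoints to lanes by distance and using the match to correct lane endpoints, can be illustrated with a small sketch. This is not the authors' Point-Lane Geometry Matching algorithm. It is a simplified nearest-neighbor variant under stated assumptions: lanes are 2D polylines, detected endpoints are 2D points, and a lane endpoint is snapped to the closest detected point if one lies within a distance threshold. The function names, the `max_dist` parameter, and its default value are hypothetical.

```python
import math


def nearest_point(endpoint, points, max_dist):
    """Return the detected point closest to `endpoint`.

    Returns None when no detected point lies within `max_dist`.
    """
    best, best_d = None, max_dist
    for p in points:
        d = math.dist(endpoint, p)  # Euclidean distance (Python 3.8+)
        if d <= best_d:
            best, best_d = p, d
    return best


def refine_lane_endpoints(lanes, points, max_dist=1.5):
    """Snap the first and last vertex of each lane to a nearby detected point.

    Assumption (not from the paper): a simple nearest-neighbor rule with a
    fixed distance threshold. Inputs are left unmodified.
    """
    refined = []
    for lane in lanes:
        new_lane = list(lane)
        for idx in (0, -1):  # start and end of the polyline
            match = nearest_point(lane[idx], points, max_dist)
            if match is not None:
                new_lane[idx] = match
        refined.append(new_lane)
    return refined


# Two lanes that should connect, but whose predicted endpoints deviate.
lanes = [
    [(0.0, 0.0), (5.0, 0.0), (9.8, 0.3)],
    [(10.3, -0.2), (15.0, 0.0), (20.0, 0.0)],
]
points = [(10.0, 0.0)]  # one detected endpoint near the junction

refined = refine_lane_endpoints(lanes, points)
print(refined[0][-1], refined[1][0])
```

In this toy case both deviating endpoints lie within the threshold of the single detected point, so after refinement the end of the first lane and the start of the second coincide, which is the kind of overlapping-endpoint output the pipeline aims for. The far endpoints of both lanes have no detected point nearby and are left unchanged.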

Paper Structure

This paper contains 19 sections, 23 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: Pipeline Comparison. (a) In the previous pipeline, lanes are predicted independently, which leads to obvious endpoint deviation. (b) In our proposed pipeline, lane endpoints are explicitly modeled, and lanes with overlapping endpoints are obtained through point-lane geometry matching.
  • Figure 2: TopoPoint framework. (a) In addition to the traffic elements and lanes, lane endpoints are also explicitly perceived in the detector. (b) The geometric attention bias is also incorporated into the point-lane merge self attention module to exchange information. (c) On this basis, the queries are used for topology reasoning, and the topology is also used for query enhancement in scene graph network. (d) During inference, point-lane result fusion is applied to eliminate endpoint deviation.
  • Figure 3: Module details. (a) Based on geometric attention bias and reasoned topology, lane & point queries are enhanced from the associated traffic elements & lanes & points by the unified scene graph network, (b) where the PLGCN is designed for better interaction between lanes and points.
  • Figure 4: Qualitative comparison of TopoLogic and our TopoPoint. The first row shows the multi-view inputs, and the second row shows the lane detection results together with the lane topology results. In the graph form of the lane topology, nodes indicate lanes and edges indicate lane topology; green, red, and blue indicate correct, wrong, and missed predictions, respectively.
  • Figure 5: Additional qualitative comparison of TopoLogic and TopoPoint. The first row shows the multi-view inputs. The second row shows the endpoint detection and lane detection results, with lane endpoints indicated by red dots. The third row shows the lane-lane topology results, and the last row shows the traffic element detection and lane-traffic topology results in the front view.
  • ...and 1 more figure