Fine-Grained Representation for Lane Topology Reasoning

Guoqing Xu, Yiheng Li, Yang Yang

TL;DR

TopoFG targets reliable lane topology reasoning in autonomous driving by replacing the single holistic query per lane with fine-grained, spatially aware queries guided by hierarchical priors. The method combines a Hierarchical Prior Extractor, a Region-Focused Decoder, and Robust Boundary-Point Topology Reasoning, together with a topological denoising strategy that reduces matching ambiguity during training. It reports state-of-the-art results on OpenLane-V2 (OLS of 48.0 on subset_A and 45.4 on subset_B), which the authors attribute to better modeling of local lane geometry and boundary-point connectivity. The resulting topology predictions are more robust in complex lane configurations, which matters for downstream path planning and control.

Abstract

Precise modeling of lane topology is essential for autonomous driving, as it directly impacts navigation and control decisions. Existing methods typically represent each lane with a single query and infer topological connectivity based on the similarity between lane queries. However, this design struggles to accurately model complex lane structures, leading to unreliable topology prediction. To this end, we propose a Fine-Grained lane topology reasoning framework (TopoFG). It divides the procedure from bird's-eye-view (BEV) features to topology prediction via fine-grained queries into three phases, i.e., Hierarchical Prior Extractor (HPE), Region-Focused Decoder (RFD), and Robust Boundary-Point Topology Reasoning (RBTR). Specifically, HPE extracts global spatial priors from the BEV mask and local sequential priors from in-lane keypoint sequences to guide subsequent fine-grained query modeling. RFD constructs fine-grained queries by integrating the spatial and sequential priors. It then samples reference points in RoI regions of the mask and applies cross-attention with BEV features to refine the query representations of each lane. RBTR models lane connectivity based on boundary-point query features and further employs a topological denoising strategy to reduce matching ambiguity. By integrating spatial and sequential priors into fine-grained queries and applying a denoising strategy to boundary-point topology reasoning, our method precisely models complex lane structures and delivers trustworthy topology predictions. Extensive experiments on the OpenLane-V2 benchmark demonstrate that TopoFG achieves new state-of-the-art performance, with an OLS of 48.0 on subset_A and 45.4 on subset_B.
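To make the boundary-point idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes each lane carries K point-level query features ordered from start to end, and it scores the connection "lane i leads into lane j" from the end-point feature of lane i and the start-point feature of lane j. The dot-product-plus-sigmoid scorer, the function name `boundary_point_topology`, and the toy feature values are all illustrative assumptions; the abstract does not specify the scoring function, and the denoising strategy is not shown here.

```python
import math


def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    """Squash a raw similarity into a (0, 1) connection score."""
    return 1.0 / (1.0 + math.exp(-x))


def boundary_point_topology(lane_queries):
    """Score directed lane-to-lane connections from boundary-point features.

    lane_queries[i] is a list of K feature vectors for lane i, ordered from
    the lane's start point to its end point (a hypothetical layout).
    Returns an N x N matrix where scores[i][j] estimates whether the end of
    lane i connects to the start of lane j. Self-connections stay at 0.
    """
    n = len(lane_queries)
    scores = [[0.0] * n for _ in range(n)]
    for i, lane_i in enumerate(lane_queries):
        end_feat = lane_i[-1]            # boundary point: end of lane i
        for j, lane_j in enumerate(lane_queries):
            if i == j:
                continue
            start_feat = lane_j[0]       # boundary point: start of lane j
            # Assumed scorer: plain dot product; a learned head would go here.
            scores[i][j] = sigmoid(dot(end_feat, start_feat))
    return scores


# Toy features (made up for illustration): lane 0 ends where lane 1 starts.
lanes = [
    [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]],    # lane 0
    [[2.0, 0.0], [1.0, -1.0], [0.0, -2.0]],  # lane 1
    [[-2.0, 0.0], [0.0, 0.0], [0.0, 2.0]],   # lane 2
]
scores = boundary_point_topology(lanes)
for row in scores:
    print([round(s, 3) for s in row])
```

Because the score uses only the end of one lane and the start of another, the matrix is directed: `scores[0][1]` and `scores[1][0]` generally differ. A single holistic query per lane does not expose this distinction unless the scoring head learns it implicitly.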

Paper Structure

This paper contains 19 sections, 7 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Comparison between existing methods and our method for lane topology reasoning. (a) Existing Methods: Using instance-level queries with coarse lane modeling and holistic topology reasoning, which may lead to incorrect predictions in complex scenes. (b) Our Method: Adopting fine-grained queries and boundary-point based topology reasoning for improved lane detection and topology prediction. The blue arrow on the right represents the lane topology connection, and the green dashed regions highlight the more reliable topology reasoning of our method.
  • Figure 2: Overview of the TopoFG framework. The framework consists of three modules. First, the Hierarchical Prior Extractor, which extracts both global spatial priors and local sequential priors. Second, the Region-Focused Decoder, which enhances local geometric modeling by focusing on key lane regions. Third, the Robust Boundary-Point Topology Reasoning module, which constructs lane connectivity based on boundary points and incorporates denoising training to improve structural stability.
  • Figure 3: Qualitative comparison of different methods on the lane topology reasoning task. From left to right: multi-view input images, TopoNet, TopoMLP, TopoLogic, our proposed TopoFG, and the ground truth. The figure shows the predicted lane centerlines in the bird’s-eye view, where orange lines represent the predicted lane centerlines and blue arrows indicate the topological connections between lanes.