TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes

Yanping Fu, Wenbin Liao, Xinyuan Liu, Hang Xu, Yike Ma, Feng Dai, Yucheng Zhang

TL;DR

This work targets lane topology reasoning in autonomous driving. It observes that prior approaches rely too heavily on perception improvements and on MLP-based connectivity prediction, which is brittle to endpoint shifts in lane detection. It introduces TopoLogic, an interpretable pipeline that reasons about topology from two signals: geometric distances between lane endpoints and semantic similarity of lane queries. The two signals are fused with learnable weights, and a GNN propagates topology-aware features. Results on OpenLane-V2 show substantial improvements in lane topology metrics (TOP$_{ll}$ and OLS) over prior methods, and show that the geometric-distance cue can boost already-trained models when used as post-processing. The explicit geometric and semantic channels make the reasoning interpretable and robust in complex driving scenes. The main limitation is that the method does not, on its own, substantially improve lane detection.

Abstract

As an emerging task that integrates perception and reasoning, topology reasoning in autonomous driving scenes has recently garnered widespread attention. However, existing work often emphasizes "perception over reasoning": it typically boosts reasoning performance by enhancing the perception of lanes and directly adopts an MLP to learn lane topology from lane queries. This paradigm overlooks the geometric features intrinsic to the lanes themselves and is prone to being influenced by inherent endpoint shifts in lane detection. To tackle this issue, we propose an interpretable method for lane topology reasoning based on lane geometric distance and lane query similarity, named TopoLogic. This method mitigates the impact of endpoint shifts in geometric space and introduces explicit similarity calculation in semantic space as a complement. By integrating results from both spaces, our method provides more comprehensive information for lane topology. Ultimately, our approach significantly outperforms the existing state-of-the-art methods on the mainstream benchmark OpenLane-V2 (23.9 vs. 10.9 in TOP$_{ll}$ and 44.1 vs. 39.8 in OLS on subset_A). Additionally, our proposed geometric distance topology reasoning method can be incorporated into well-trained models without re-training, significantly boosting the performance of lane topology reasoning. The code is released at https://github.com/Franpin/TopoLogic.
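
To make the two-signal idea concrete, here is a minimal NumPy sketch of scoring lane connectivity from endpoint distances and query similarity. It is an illustration under stated assumptions, not the released implementation: the function names, the Gaussian-shaped distance mapping (a stand-in for the paper's own mapping function), the cosine similarity, and the fixed fusion weights are all hypothetical choices made for this example.

```python
import numpy as np

def endpoint_distance(lanes):
    """lanes: (N, P, D) array of N lane centerlines, each with P ordered points.
    Entry [i, j] is the distance from the END of lane i to the START of lane j."""
    ends = lanes[:, -1, :]    # (N, D)
    starts = lanes[:, 0, :]   # (N, D)
    return np.linalg.norm(ends[:, None, :] - starts[None, :, :], axis=-1)

def distance_to_score(dist, sigma=2.0):
    # Assumption: a Gaussian-shaped decay stands in for the paper's mapping
    # function. A small endpoint shift still yields a score close to 1.
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

def query_similarity(queries):
    # Assumption: cosine similarity rescaled to [0, 1] stands in for the
    # learned similarity between lane queries.
    q = queries / np.linalg.norm(queries, axis=-1, keepdims=True)
    return 0.5 * (q @ q.T + 1.0)

def fuse(geo, sim, w_geo=0.5, w_sim=0.5):
    # Assumption: fixed weights here; in the paper the weights are learnable.
    return w_geo * geo + w_sim * sim

# Three toy lanes in 2D. Lane 1 starts 0.5 m past the end of lane 0,
# which mimics an endpoint shift between two connected lanes.
lanes = np.array([
    [[0.0, 0.0], [10.0, 0.0]],
    [[10.5, 0.0], [20.0, 0.0]],
    [[0.0, 5.0], [10.0, 5.0]],
])
geo = distance_to_score(endpoint_distance(lanes))

# Random vectors stand in for decoder lane queries.
queries = np.random.default_rng(0).normal(size=(3, 8))
topology = fuse(geo, query_similarity(queries))

# Geometry-only post-processing: threshold the distance score.
connected = geo > 0.5
```

In this toy setup, the pair (lane 0, lane 1) keeps a high geometric score despite the 0.5 m gap, while (lane 0, lane 2) scores near zero. Thresholding `geo` alone corresponds to the training-free, post-processing use of the geometric cue described in the abstract.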

Paper Structure

This paper contains 18 sections, 11 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Comparison of TopoNet results with and without post-processing. We apply post-processing based on geometric distance to improve the lane topology reasoning performance of TopoNet. (a) The ground truth of lane topology reasoning. (b) The endpoints of two connected lanes in the prediction (marked with a yellow circle) do not overlap as they do in the ground truth. (c) The lane topology reasoning result of TopoNet, where the arrow denotes lane topology (marked with a red arrow). (d) The lane topology reasoning result of TopoNet with post-processing, which significantly improves the precision of lane topology reasoning.
  • Figure 2: Pipeline of TopoLogic. The overall structure of TopoLogic comprises two main components: an image encoder for feature extraction and transformation, and a lane decoder responsible for end-to-end topology reasoning. The decoder computes the proposed lane geometric distance topology and lane similarity topology and fuses them into the final lane topology, which is passed through a GNN to enhance lane learning in the next decoder layer.
  • Figure 3: Comparison of various mapping functions. Compared to $f_{gau},f_{sig},f_{tan}$, our proposed function $f_{ours}$ has greater tolerance for endpoint shift.
  • Figure 4: Influence of inaccuracies in lane detection on topology reasoning. Yellow denotes incorrectly predicted lane lines, and red denotes incorrectly predicted lane topology.
  • Figure 5: Qualitative lane topology reasoning results of TopoNet and our TopoLogic. The first row shows the multi-view inputs. The second row shows the lane detection and lane topology reasoning results. The third row shows the lane topology in graph form (nodes indicate lane lines, edges indicate lane topology), where green indicates correct predictions, red indicates incorrect predictions, and blue indicates missing predictions.