Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction

Zongzheng Zhang; Xinrun Li; Sizhe Zou; Guoxuan Chi; Siqi Li; Xuchong Qiu; Guoliang Wang; Guantian Zheng; Leichen Wang; Hang Zhao; Hao Zhao

TL;DR

Chameleon addresses lane topology extraction for mapless autonomous driving by integrating fast, VLM-driven program synthesis with a slow, dense-prompting VLM for corner cases. The method defines lane-to-lane and lane-to-element adjacencies and uses a fast-slow architecture to balance efficiency and accuracy, aided by a chain-of-thought reasoning process and a suite of VQA tasks. It introduces API, expert-rule, few-shot, and VQA prompts to tailor code generation and reasoning, and demonstrates competitive performance on OpenLane-V2 with favorable latency compared to dense prompting. The work provides a practical, few-shot approach that leverages visual inputs in symbolic reasoning, and releases data, code, and models for benchmarking in autonomous driving topology understanding.

Abstract

Lane topology extraction involves detecting lanes and traffic elements and determining their relationships, a key perception task for mapless autonomous driving. This task requires complex reasoning, such as determining whether it is possible to turn left into a specific lane. To address this challenge, we introduce neuro-symbolic methods powered by vision-language foundation models (VLMs). Existing approaches have notable limitations: (1) Dense visual prompting with VLMs can achieve strong performance but is costly in terms of both financial resources and carbon footprint, making it impractical for robotics applications. (2) Neuro-symbolic reasoning methods for 3D scene understanding fail to integrate visual inputs when synthesizing programs, making them ineffective in handling complex corner cases. To this end, we propose a fast-slow neuro-symbolic lane topology extraction algorithm, named Chameleon, which alternates between a fast system that directly reasons over detected instances using synthesized programs and a slow system that utilizes a VLM with a chain-of-thought design to handle corner cases. Chameleon leverages the strengths of both approaches, providing an affordable solution while maintaining high performance. We evaluate the method on the OpenLane-V2 dataset, showing consistent improvements across various baseline detectors. Our code, data, and models are publicly available at https://github.com/XR-Lee/neural-symbolic
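The fast-slow alternation described in the abstract can be illustrated with a small sketch. This is not the released Chameleon implementation. The `Lane` type, the endpoint-distance rule in `fast_connected`, the `near`/`far` thresholds, and the `slow_system` callback are all hypothetical stand-ins chosen for illustration. In particular, the rule for deciding when a pair counts as a corner case (an ambiguous distance band) is an assumption, and the slow VLM with chain-of-thought prompting is replaced by a plain Python callable.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Lane:
    """Hypothetical detected lane: an ordered centerline from start to end."""
    lane_id: str
    points: Sequence[Point]


def fast_connected(a: Lane, b: Lane, near: float = 1.0, far: float = 3.0) -> Optional[bool]:
    """Stand-in for a synthesized program (assumed rule, not from the paper).

    Returns True or False when the endpoint gap is clear-cut, and None when
    the pair falls in the ambiguous band and should be escalated.
    """
    gap = math.dist(a.points[-1], b.points[0])
    if gap <= near:
        return True
    if gap >= far:
        return False
    return None


def extract_lane_topology(
    lanes: Sequence[Lane],
    slow_system: Callable[[Lane, Lane], bool],
) -> Tuple[List[List[bool]], int]:
    """Build a directed lane-to-lane adjacency matrix.

    The fast rule is tried first for every ordered pair. Only pairs it cannot
    decide are passed to slow_system, which stands in for the costly VLM query.
    Returns the matrix and the number of slow calls made.
    """
    n = len(lanes)
    adjacency = [[False] * n for _ in range(n)]
    slow_calls = 0
    for i, a in enumerate(lanes):
        for j, b in enumerate(lanes):
            if i == j:
                continue
            verdict = fast_connected(a, b)
            if verdict is None:
                verdict = slow_system(a, b)
                slow_calls += 1
            adjacency[i][j] = verdict
    return adjacency, slow_calls


# Toy usage with a dummy slow system that always answers "connected".
lanes = [
    Lane("a", [(0.0, 0.0), (10.0, 0.0)]),
    Lane("b", [(10.5, 0.0), (20.0, 0.0)]),
    Lane("c", [(12.0, 0.0), (20.0, 5.0)]),
]
adjacency, slow_calls = extract_lane_topology(lanes, slow_system=lambda a, b: True)
```

In this toy scene only the pair (a, c) falls in the ambiguous band, so the expensive path runs once while the other five ordered pairs are resolved by the cheap rule. That is the cost profile the abstract motivates: most pairs are handled by fast symbolic reasoning and only corner cases reach the slow system.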
