The Constant Eye: Benchmarking and Bridging Appearance Robustness in Autonomous Driving
Jiabao Wang, Hongyu Zhou, Yuanbo Yang, Jiahao Shao, Yiyi Liao
TL;DR
The Constant Eye introduces navdream, a benchmark that decouples appearance from geometry to quantify how appearance shifts affect planning in autonomous driving. The benchmark applies pixel-aligned style transfer to NAVSIM sequences, producing a visual stress test in which geometry is preserved and only appearance changes, so a drop in planning performance can be attributed to appearance alone. The authors also propose a universal perception interface: a frozen DINOv3 backbone paired with a lightweight adapter, which generalizes zero-shot across regression-based, diffusion-based, and scoring-based planners without target-domain training. In experiments on navdream and NAVSIM, this frozen-vision interface keeps planning stable under severe appearance shifts and across multiple planning paradigms, suggesting it is a practical route to robust perception in autonomous driving.
Abstract
Despite rapid progress, autonomous driving algorithms remain notoriously fragile under out-of-distribution (OOD) conditions. We identify a critical decoupling failure in current research: the lack of distinction between appearance-based shifts, such as weather and lighting, and structural scene changes. This leaves a fundamental question unanswered: Is the planner failing because of complex road geometry, or simply because it is raining? To resolve this, we establish navdream, a high-fidelity robustness benchmark that leverages generative pixel-aligned style transfer. By creating a visual stress test with negligible geometric deviation, we isolate the impact of appearance on driving performance. Our evaluation reveals that existing planning algorithms often degrade significantly under OOD appearance conditions, even when the underlying scene structure remains consistent. To bridge this gap, we propose a universal perception interface built on a frozen visual foundation model (DINOv3). By extracting appearance-invariant features as a stable interface for the planner, we achieve exceptional zero-shot generalization across diverse planning paradigms, including regression-based, diffusion-based, and scoring-based models. Our plug-and-play solution maintains consistent performance across extreme appearance shifts without requiring further fine-tuning. The benchmark and code will be made available.
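To make the frozen-interface idea concrete, the following is a minimal sketch of the general pattern: a fixed feature extractor feeds a small trainable layer, and only that layer is updated. It is not the authors' implementation. `FrozenBackbone` is a hypothetical random linear map standing in for a pretrained encoder (it is not DINOv3), and `LinearAdapter`, the dimensions, the input vector, and the target are all assumptions chosen for illustration.

```python
import random


class FrozenBackbone:
    """Hypothetical stand-in for a pretrained visual encoder (NOT DINOv3).

    Weights are stored as nested tuples and are never updated.
    """

    def __init__(self, in_dim, feat_dim, seed=0):
        rng = random.Random(seed)
        self.weights = tuple(
            tuple(rng.uniform(-1.0, 1.0) for _ in range(in_dim))
            for _ in range(feat_dim)
        )

    def __call__(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]


class LinearAdapter:
    """Hypothetical lightweight layer mapping frozen features to planner inputs."""

    def __init__(self, feat_dim, out_dim):
        self.weights = [[0.0] * feat_dim for _ in range(out_dim)]

    def __call__(self, feats):
        return [sum(w * f for w, f in zip(row, feats)) for row in self.weights]

    def sgd_step(self, feats, target, lr=0.05):
        """One gradient step on squared error; returns the loss before the update."""
        pred = self(feats)
        for i, (p, t) in enumerate(zip(pred, target)):
            err = p - t
            for j, f in enumerate(feats):
                self.weights[i][j] -= lr * err * f
        return sum((p - t) ** 2 for p, t in zip(pred, target))


backbone = FrozenBackbone(in_dim=4, feat_dim=3)
adapter = LinearAdapter(feat_dim=3, out_dim=2)
backbone_before = backbone.weights  # snapshot to confirm the backbone stays fixed

image = [0.5, -0.2, 0.1, 0.9]  # toy "image" vector (assumption)
target = [1.0, -1.0]           # toy planner-side target (assumption)

feats = backbone(image)        # features come from the frozen model only
losses = [adapter.sgd_step(feats, target) for _ in range(50)]
```

In this sketch the planner-facing loss falls while `backbone.weights` stays identical to its snapshot, which is the property the abstract relies on: the planner sees features from a fixed model rather than from an encoder trained on one appearance domain.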
