SCOUT: A Lightweight Framework for Scenario Coverage Assessment in Autonomous Driving
Anil Yildiz, Sarah M. Thornton, Carl Hildebrandt, Sreeja Roy-Singh, Mykel J. Kochenderfer
TL;DR
SCOUT is a lightweight surrogate model that estimates autonomous driving scenario coverage from precomputed latent sensor representations and is trained by distillation from a fine-tuned LVLM. Because it reuses latent features and learns from LVLM-generated labels, SCOUT delivers near-LVLM coverage accuracy with orders-of-magnitude faster inference and far lower memory requirements, enabling on-vehicle, large-scale coverage analysis. The approach shows strong agreement with human annotations, substantial efficiency gains, and robustness to class imbalance, which makes it a practical route to scalable safety evaluation in real-world driving. The work highlights the value of combining perception-stack features with LVLM-informed supervision for accurate scenario coverage oversight of autonomous systems.
Abstract
Assessing scenario coverage is crucial for evaluating the robustness of autonomous agents, yet existing methods rely on expensive human annotations or computationally intensive Large Vision-Language Models (LVLMs). These approaches are impractical for large-scale deployment due to cost and efficiency constraints. To address these shortcomings, we propose SCOUT (Scenario Coverage Oversight and Understanding Tool), a lightweight surrogate model designed to predict scenario coverage labels directly from an agent's latent sensor representations. SCOUT is trained through a distillation process, learning to approximate LVLM-generated coverage labels while eliminating the need for continuous LVLM inference or human annotation. By leveraging precomputed perception features, SCOUT avoids redundant computations and enables fast, scalable scenario coverage estimation. We evaluate our method across a large dataset of real-life autonomous navigation scenarios, demonstrating that it maintains high accuracy while significantly reducing computational cost. Our results show that SCOUT provides an effective and practical alternative for large-scale coverage analysis. While its performance depends on the quality of LVLM-generated training labels, SCOUT represents a major step toward efficient scenario coverage oversight in autonomous systems.
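The distillation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify SCOUT's architecture, so the student here is assumed to be a simple multi-label logistic regression, the latent features are synthetic, and a hidden linear rule stands in for the LVLM teacher. All names (`train_student`, `predict`, `hidden_W`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_student(features, teacher_labels, lr=0.5, epochs=300):
    """Fit a multi-label logistic regression to teacher-provided labels.

    Assumption: a linear student trained with binary cross-entropy,
    used purely for illustration.
    """
    n, d = features.shape
    k = teacher_labels.shape[1]
    W = np.zeros((d, k))
    b = np.zeros(k)
    for _ in range(epochs):
        probs = sigmoid(features @ W + b)
        # Gradient of the mean binary cross-entropy w.r.t. the logits.
        grad = (probs - teacher_labels) / n
        W -= lr * (features.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(features, W, b, threshold=0.5):
    """Return one binary coverage label per category for each sample."""
    return (sigmoid(features @ W + b) >= threshold).astype(int)


rng = np.random.default_rng(0)
latent_dim, num_labels = 16, 4

# Synthetic stand-ins: X plays the role of precomputed latent features,
# and a hidden linear rule plays the role of the LVLM teacher's labels.
hidden_W = rng.normal(size=(latent_dim, num_labels))
X = rng.normal(size=(2000, latent_dim))
Y = (X @ hidden_W > 0).astype(float)

X_train, X_test = X[:1500], X[1500:]
Y_train, Y_test = Y[:1500], Y[1500:]

# The teacher is only needed to label the training set; inference on
# new samples uses the student alone.
W, b = train_student(X_train, Y_train)
agreement = (predict(X_test, W, b) == Y_test).mean()
print(f"student/teacher agreement on held-out samples: {agreement:.3f}")
```

The point of the sketch is the division of labor: the expensive teacher labels a training set once, and the cheap student then predicts coverage labels from features the perception stack has already computed.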
