Few-Shot Testing of Autonomous Vehicles with Scenario Similarity Learning
Shu Li, Honglin He, Jingxuan Yang, Jianming Hu, Yi Zhang, Shuo Feng
TL;DR
The paper tackles the challenge of evaluating autonomous vehicles (AVs) when testing budgets are severely limited by reframing scenario generation as deterministic optimization over a fixed few-shot testing (FST) scenario set. It introduces a cross-attention similarity network that learns relations within the scenario space and fuses test results optimally, yielding an upper bound on evaluation error that can be minimized even with very few tests. The method relies on an ensemble of surrogate models to capture variation across AVs and on gradient-based optimization of the FST scenario set to maximize information gain. Experiments in cut-in scenarios show substantial improvements in accuracy and reliability over Monte Carlo sampling, uniform sampling, and prior FST approaches, and include an ideal upper-bound demonstration and ablation studies. The work provides a practical framework for rapid, explainable AV testing under budget constraints and suggests scalable paths toward larger, more complex testing domains.
Abstract
Testing and evaluation are critical to the development and deployment of autonomous vehicles (AVs). Because safety-critical events such as crashes are rare, millions of tests are typically needed to assess AV safety performance accurately. Although techniques such as importance sampling can accelerate this process, they usually still require too many tests to be practical for field testing. This severely hinders testing and evaluation, especially for third-party testers and governmental bodies with very limited testing budgets. The rapid development cycles of AV technology further exacerbate this challenge. To fill this research gap, this paper introduces the few-shot testing (FST) problem and proposes a methodological framework to tackle it. Because the testing budget is very limited, usually fewer than 100 tests, the FST method transforms the testing scenario generation problem from probabilistic sampling into deterministic optimization, reducing the uncertainty of the testing results. To optimize the selection of testing scenarios, a cross-attention similarity mechanism is proposed that learns to extract information from the AV's testing scenario space. This allows an iterative search for the scenario set with the smallest evaluation error, ensuring precise testing within budget constraints. Experimental results in cut-in scenarios demonstrate the effectiveness of the FST method, which significantly improves accuracy and enables efficient, precise AV testing.
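
To make the "millions of tests" claim concrete, the following is a minimal sketch of the standard sample-size calculation for a crude Monte Carlo estimate of a rare-event probability. It is not part of the paper's method. The function names, the crash rate of 1e-4, and the 10% relative error target are illustrative assumptions, and the calculation uses the usual normal approximation for a Bernoulli mean.

```python
import math


def mc_relative_std(crash_rate: float, n_tests: int) -> float:
    """Relative standard deviation of the crude Monte Carlo crash-rate estimate.

    For n i.i.d. Bernoulli(p) test outcomes, the sample mean has variance
    p * (1 - p) / n, so its standard deviation relative to p is
    sqrt((1 - p) / (n * p)).
    """
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must lie strictly between 0 and 1")
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return math.sqrt((1.0 - crash_rate) / (n_tests * crash_rate))


def mc_tests_needed(crash_rate: float, rel_error: float, z: float = 1.96) -> int:
    """Tests needed so that the z-sigma half-width is at most rel_error * crash_rate.

    Solves z * sqrt(p * (1 - p) / n) <= rel_error * p for n
    (normal approximation; z = 1.96 corresponds to a 95% interval).
    """
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must lie strictly between 0 and 1")
    if rel_error <= 0.0:
        raise ValueError("rel_error must be positive")
    return math.ceil(z ** 2 * (1.0 - crash_rate) / (crash_rate * rel_error ** 2))


if __name__ == "__main__":
    p = 1e-4  # hypothetical crash rate per test, chosen only for illustration
    print("tests needed for 10% relative error:", mc_tests_needed(p, 0.10))
    print("relative std with a 100-test budget:", mc_relative_std(p, 100))
```

Under these assumed numbers the required count is in the millions, while a budget of 100 random tests leaves a relative standard deviation far above 100%. This is the gap that motivates replacing probabilistic sampling with a deterministically optimized scenario set.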
