Adaptive Splitting of Reusable Temporal Monitors for Rare Traffic Violations
Craig Innes, Subramanian Ramamoorthy
TL;DR
This work targets the challenge of estimating rare safety-violation probabilities for autonomous-vehicle simulations with black-box perception and control. It introduces a hybrid framework that combines Adaptive Multi-level Splitting (AMS) with online monitoring of Signal Temporal Logic (STL) robustness, leveraging partial-trajectory caching via an online robustness metric $\mathcal{L}_n$ to reuse computations. A Perception Error Model (PEM) injects realistic sensor noise, while a Model Predictive Controller (MPC) governs highway maneuvers, forming a stochastic testbed. Empirical results on a lane-change scenario show STL-AMS yields accurate failure-probability estimates with fewer simulations than Monte Carlo and standard adaptive importance sampling baselines, demonstrating practical viability for testing AV pipelines against STL specifications.
Abstract
Autonomous Vehicles (AVs) are often tested in simulation to estimate the probability that they will violate safety specifications. Two common issues arise when using existing techniques to produce this estimate: if violations occur rarely, simple Monte Carlo sampling can fail to produce accurate estimates efficiently; if simulation horizons are too long, importance sampling techniques (which learn proposal distributions from past simulations) can fail to converge. This paper addresses both issues by interleaving rare-event sampling techniques with online specification monitoring algorithms. We use adaptive multi-level splitting to decompose simulations into partial trajectories, then calculate the distance of those partial trajectories to failure by leveraging robustness metrics from Signal Temporal Logic (STL). By caching those partial robustness values, we can efficiently reuse computations across multiple sampling stages. Our experiments on an interstate lane-change scenario show that our method is viable for testing simulated AV pipelines, efficiently estimating failure probabilities for STL specifications based on real traffic rules. We produce better estimates than Monte Carlo and importance sampling in fewer simulations.
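To make the splitting idea concrete, the following is a minimal sketch of adaptive multi-level splitting driven by an online STL-style robustness value. It is not the authors' implementation. A one-dimensional Gaussian random walk stands in for the AV simulator, the specification "always x < limit" is a hypothetical example, and the names `simulate`, `running_robustness`, and `ams_failure_probability` are illustrative assumptions. The sketch uses the variant of splitting that discards only the particles farthest from failure at each stage.

```python
import random


def simulate(prefix, horizon, rng):
    """Extend a partial trajectory of a toy Gaussian random walk (an assumed
    stand-in for the simulator) until it holds horizon + 1 states."""
    traj = list(prefix)
    while len(traj) < horizon + 1:
        traj.append(traj[-1] + rng.gauss(0.0, 1.0))
    return traj


def running_robustness(traj, limit, cached=()):
    """Prefix robustness of the hypothetical spec 'always x < limit':
    rho_t = min over s <= t of (limit - x_s).
    Values in `cached` (already computed for a shared prefix) are reused."""
    rho = list(cached)
    for x in traj[len(rho):]:
        margin = limit - x
        rho.append(margin if not rho else min(rho[-1], margin))
    return rho


def ams_failure_probability(n=200, horizon=20, limit=12.0, seed=0):
    """Estimate P(robustness < 0) by repeatedly replacing the particles
    that are farthest from violating the specification."""
    rng = random.Random(seed)
    trajs = [simulate([0.0], horizon, rng) for _ in range(n)]
    rhos = [running_robustness(t, limit) for t in trajs]
    estimate = 1.0
    while True:
        finals = [r[-1] for r in rhos]
        level = max(finals)  # robustness of the safest particle
        if level < 0.0:
            return estimate  # every particle now violates the spec
        killed = [i for i, f in enumerate(finals) if f >= level]
        survivors = [i for i, f in enumerate(finals) if f < level]
        if not survivors:
            return 0.0  # all particles tied: the population is extinct
        estimate *= len(survivors) / n
        for i in killed:
            j = rng.choice(survivors)
            # First time the survivor's prefix robustness drops below the level.
            split = next(t for t, r in enumerate(rhos[j]) if r < level)
            trajs[i] = simulate(trajs[j][:split + 1], horizon, rng)
            # Reuse the survivor's cached robustness for the shared prefix.
            rhos[i] = running_robustness(
                trajs[i], limit, cached=rhos[j][:split + 1]
            )


if __name__ == "__main__":
    print(ams_failure_probability())
```

Because the prefix robustness of an "always" formula is a running minimum, a resampled trajectory inherits the cached values of the prefix it was cloned from, and only the newly simulated suffix needs to be monitored. A richer specification would need a general online STL monitor in place of `running_robustness`.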
