Finding Unknown Unknowns using Cyber-Physical System Simulators (Extended Report)
Semaan Douglas Wehbe, Stanley Bak
TL;DR
The paper tackles finding unknown unknowns in cyber-physical systems by analyzing mode sequences produced by a black-box CPS simulator. It introduces the Convex Mode Sequence Assumption, under which the input points that produce a given mode sequence form a convex region, so inputs falling inside an already-known region can be skipped. Two accelerated testing algorithms, Convex Rejection Sampling (CRS) and Region Distance Maximization (RDM), select new inputs outside existing regions to maximize the number of distinct mode sequences $|Y_{\kappa}|$ found under a simulation budget $\kappa$. Empirical results on the Voronoi, Navigation, Gearbox Meshing, and Automatic Transmission benchmarks show substantial speedups over random sampling (tens to hundreds of times in some cases) and greater discovery of rare behaviors. The approach is specification-free and complements existing verification, falsification, and coverage-based testing methods in CPS analysis.
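The rejection idea can be illustrated with a minimal sketch. It is not the paper's CRS implementation: it assumes a one-dimensional input in [0, 1), where the convex hull of the samples sharing a mode sequence is simply an interval. The simulator `toy_simulator`, its breakpoints, and the helper `rejection_sample` are hypothetical names introduced only for this example.

```python
import random

def toy_simulator(x):
    """Hypothetical black-box simulator: maps an input in [0, 1) to a mode sequence."""
    # Breakpoints are made up for illustration; one region is deliberately tiny (rare).
    if x < 0.40:
        return ("A", "B")
    if x < 0.41:
        return ("A", "C", "B")  # rare behavior
    if x < 0.80:
        return ("A", "C")
    return ("C",)

def rejection_sample(simulate, budget, rng, max_draws=100_000):
    """Spend at most `budget` simulations, only on inputs outside every known interval."""
    hulls = {}  # mode sequence -> (lo, hi), the 1-D convex hull of its samples
    sims = 0
    draws = 0
    while sims < budget and draws < max_draws:
        draws += 1
        x = rng.random()
        # Under the convexity assumption, a point inside a known hull would
        # reproduce that hull's mode sequence, so it is rejected unsimulated.
        if any(lo <= x <= hi for lo, hi in hulls.values()):
            continue
        seq = simulate(x)
        sims += 1
        lo, hi = hulls.get(seq, (x, x))
        hulls[seq] = (min(lo, x), max(hi, x))  # grow the hull to include x
    return hulls, sims

if __name__ == "__main__":
    hulls, sims = rejection_sample(toy_simulator, budget=50, rng=random.Random(0))
    print(f"{len(hulls)} distinct mode sequences after {sims} simulations")
    for seq, (lo, hi) in sorted(hulls.items(), key=lambda kv: kv[1]):
        print(seq, f"[{lo:.4f}, {hi:.4f}]")
```

As the intervals grow, fewer random draws land outside them, so the simulation budget is concentrated on the gaps between known regions, which is where a narrow region such as the rare one above would lie. In higher dimensions the interval test would have to be replaced by a convex-hull membership test.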
Abstract
Simulation-based approaches are among the most practical means to search for safety violations, bugs, and other unexpected events in cyber-physical systems (CPS). Where existing approaches search for simulations violating a formal specification or maximizing a notion of coverage, in this work we propose a new goal for testing: to discover unknown rare behaviors by examining discrete mode sequences. We assume a CPS simulator outputs mode information, and strive to explore the sequences of modes produced by varying the initial state or time-varying uncertainties. We hypothesize that rare mode sequences are often the most interesting to a designer, and we develop two accelerated sampling algorithms that speed up the process of finding such sequences. We evaluate our approach on several benchmarks, ranging from synthetic examples to Simulink diagrams of a CPS, demonstrating in some cases a speedup of over 100x compared with a random sampling strategy.
