Repeatable and Reliable Efforts of Accelerated Risk Assessment in Robot Testing
Linda Capito, Guillermo A. Castillo, Bowen Weng
TL;DR
Risk assessment of robots in controlled environments requires repeatability across trials and reliability across diverse subjects. The paper formalizes $β$-repeatability and $γ$-reliability for sampling-based risk estimation with importance sampling (IS) and proposes a provably repeatable and reliable accelerated testing algorithm, whose theoretical guarantee is linked to the KL divergence between the nominal and importance distributions. A reproducible statistical query adaptation is incorporated to ensure stable outputs, and the sample-size bounds tie testing effort to distributional divergence and tail behavior. Empirical demonstrations on an inverted pendulum and on push-over tests of the legged robot Rabbit show near-perfect repeatability and reliability, outperforming RHW-based termination, which yields non-repeatable results. The work enables standardized, fair, and efficient robot risk testing across vendors with provable guarantees, and it has the potential to extend to performance measures beyond risk.
Abstract
Risk assessment of a robot in controlled environments, such as laboratories and proving grounds, is a common means to assess, certify, validate, verify, and characterize the robot's safety performance before, during, and even after its commercialization in the real world. A standard testing program that acquires the risk estimate is expected to be (i) repeatable, such that it obtains similar risk assessments of the same testing subject across multiple trials or attempts made with similar testing effort by different stakeholders, and (ii) reliable against a variety of testing subjects produced by different vendors and manufacturers. Both repeatability and reliability are fundamental to a testing algorithm's validity, fairness, and practical feasibility, especially for standardization. However, these properties are rarely satisfied or ensured, especially as the subject robots become more complex, uncertain, and varied. This issue was present in traditional risk assessments through Monte-Carlo sampling, and it remains a bottleneck for recent accelerated risk assessment methods, primarily those using importance sampling. This study aims to enhance existing accelerated testing frameworks by proposing a new algorithm that provably integrates repeatability and reliability with the already established formality and efficiency. It also features demonstrations that assess the risk of instability from frontal impacts, initiated by push-over disturbances, on a controlled inverted pendulum and on the 7-DoF planar bipedal robot Rabbit controlled by various control algorithms.
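
To make the contrast between Monte-Carlo and importance sampling concrete, the following is a minimal sketch of a generic importance-sampling risk estimate. It is not the algorithm proposed in this paper. The toy setting is an assumption made purely for illustration: the disturbance is a standard normal variable, "failure" means exceeding a threshold, and the importance distribution is the same Gaussian shifted toward the failure region. All function names are hypothetical.

```python
import math
import random


def true_tail_risk(threshold):
    """Exact P(X > threshold) for X ~ N(0, 1), used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


def monte_carlo_risk(threshold, n, seed):
    """Plain Monte-Carlo estimate: sample from the nominal N(0, 1)."""
    rng = random.Random(seed)
    failures = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return failures / n


def importance_sampling_risk(threshold, shift, n, seed):
    """IS estimate: sample from N(shift, 1) and reweight each failure.

    The likelihood ratio of N(0, 1) to N(shift, 1) at x is
    exp(-shift * x + shift**2 / 2).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n


def kl_nominal_to_importance(shift):
    """KL divergence between N(0, 1) and N(shift, 1), equal to shift^2 / 2."""
    return 0.5 * shift * shift


if __name__ == "__main__":
    threshold, shift, n = 3.0, 3.0, 20000
    print("reference risk:", true_tail_risk(threshold))
    print("KL(nominal || importance):", kl_nominal_to_importance(shift))
    # Repeat the test with different seeds, mimicking independent trials.
    for seed in range(5):
        mc = monte_carlo_risk(threshold, n, seed)
        est = importance_sampling_risk(threshold, shift, n, seed)
        print(f"seed {seed}: MC = {mc:.5f}, IS = {est:.5f}")
```

In this toy case, repeated trials with the same sample budget give importance-sampling estimates that cluster more tightly around the reference value than the Monte-Carlo ones, because most importance samples land in the failure region. How tightly such estimates agree across trials, and across different test subjects, is the kind of property the repeatability and reliability notions above are meant to capture.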
