Bayesian Safety Validation for Failure Probability Estimation of Black-Box Systems

Robert J. Moss; Mykel J. Kochenderfer; Maxime Gariel; Arthur Dubois

TL;DR

This work addresses the challenge of reliably estimating failure probabilities for safety-critical, black-box systems with expensive simulators. It introduces Bayesian Safety Validation (BSV), which uses a Gaussian process surrogate and three acquisition functions (uncertainty exploration, boundary refinement, and failure region sampling) to efficiently locate failures and estimate $p_{\text{fail}}$ via importance sampling. The method is demonstrated on toy test problems, a stochastic decision-making system, and a real-world runway detection system, where it requires orders of magnitude fewer evaluations than Monte Carlo or PMC baselines and gives actionable insight into the failure regions and the most-likely failure. The open-source Julia implementation facilitates adoption in aviation safety workflows and other safety-critical domains, supporting more rigorous certification and faster development cycles.

Abstract

Estimating the probability of failure is an important step in the certification of safety-critical systems. Efficient estimation methods are often needed due to the challenges posed by high-dimensional input spaces, risky test scenarios, and computationally expensive simulators. This work frames the problem of black-box safety validation as a Bayesian optimization problem and introduces a method that iteratively fits a probabilistic surrogate model to efficiently predict failures. The algorithm is designed to search for failures, compute the most-likely failure, and estimate the failure probability over an operating domain using importance sampling. We introduce three acquisition functions that aim to reduce uncertainty by covering the design space, optimize the analytically derived failure boundaries, and sample the predicted failure regions. Results show that this Bayesian safety validation approach provides a more accurate estimate of failure probability with orders of magnitude fewer samples and performs well across various safety validation metrics. We demonstrate this approach on three test problems, a stochastic decision-making system, and a neural network-based runway detection system. This work is open source (https://github.com/sisl/BayesianSafetyValidation.jl) and is currently being used to supplement the FAA certification process of the machine learning components for an autonomous cargo aircraft.
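
To make the importance sampling idea in the abstract concrete, the following Python sketch estimates a small failure probability with a self-normalized importance sampling estimator and compares it to plain Monte Carlo. It is only an illustration under stated assumptions, not the BSV algorithm and not the API of BayesianSafetyValidation.jl: the one-dimensional threshold system, the standard normal operating model, the 50/50 mixture proposal, and every function name are hypothetical choices made for this example. BSV itself relies on a Gaussian process surrogate, which is omitted here.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def is_failure(x, threshold=3.0):
    """Hypothetical black-box system: fails when the input exceeds a threshold."""
    return x > threshold


def monte_carlo_estimate(n, rng):
    """Plain Monte Carlo: draw inputs from the operating model p = N(0, 1)."""
    failures = sum(is_failure(rng.gauss(0.0, 1.0)) for _ in range(n))
    return failures / n


def proposal_pdf(x, shift):
    """Assumed proposal q: equal mixture of N(0, 1) and N(shift, 1)."""
    return 0.5 * normal_pdf(x) + 0.5 * normal_pdf(x, shift, 1.0)


def sample_proposal(rng, shift):
    """Draw one sample from the mixture proposal q."""
    if rng.random() < 0.5:
        return rng.gauss(0.0, 1.0)
    return rng.gauss(shift, 1.0)


def snis_estimate(n, rng, shift=3.0):
    """Self-normalized importance sampling estimate of the failure probability."""
    weighted_failures = 0.0
    weight_total = 0.0
    for _ in range(n):
        x = sample_proposal(rng, shift)
        w = normal_pdf(x) / proposal_pdf(x, shift)  # importance weight p(x) / q(x)
        weight_total += w
        if is_failure(x):
            weighted_failures += w
    return weighted_failures / weight_total


def true_failure_probability(threshold=3.0):
    """Closed-form tail probability of N(0, 1); available only in this toy case."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    print("exact:", true_failure_probability())
    print("Monte Carlo:", monte_carlo_estimate(n, rng))
    print("self-normalized IS:", snis_estimate(n, rng))
```

The mixture proposal keeps every weight bounded by 2, which keeps the normalizing sum stable while still placing about half of the samples near the failure region. A proposal shifted entirely onto the failure region would make the self-normalized estimate much noisier, because few samples would land where the operating model has most of its mass.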

Paper Structure (31 sections, 24 equations, 12 figures, 3 tables, 2 algorithms)

Introduction
Background
Safety Validation
Bayesian Optimization and Probabilistic Surrogate Models
Gaussian Processes
Predicting a probability with a Gaussian process.
Problem Formulation
Uncertainty exploration.
Boundary refinement.
Failure region sampling.
Importance Sampling Estimate of Failure Probability
Discrete proposal.
Self-normalizing importance sampling.
Proposed Algorithm: Bayesian Safety Validation
Experiments and Results
...and 16 more sections

Figures (12)

Figure 1: The three tasks of safety validation.
Figure 2: An example maximization problem using Gaussian process Bayesian optimization with UCB exploration.
Figure 3: Illustrating 90 steps of the failure search and refinement acquisition functions.
Figure 4: The proposed Bayesian safety validation (BSV) algorithm used for all three safety validation tasks.
Figure 5: Failure regions and operational models for the three test problems.
...and 7 more figures
