Failure Probability Estimation for Black-Box Autonomous Systems using State-Dependent Importance Sampling Proposals

Harrison Delecki, Sydney M. Katz, Mykel J. Kochenderfer

TL;DR

This work tackles the problem of estimating rare failure probabilities in black-box, sequential autonomous systems. It introduces SPAIS, a state-dependent adaptive importance sampling method that decomposes trajectory sampling into per-timestep disturbances and optimizes the proposal by minimizing the forward KL divergence to a smoothed failure distribution, with gradients estimated via Markov score ascent. The approach demonstrates improved accuracy and reduced variance over Monte Carlo and existing AIS baselines across four validation problems, highlighting its scalability to long horizons and continuous state spaces. The findings support SPAIS as a practical, open-source tool for safety validation in safety-critical autonomous systems with complex dynamics.

Abstract

Estimating the probability of failure is a critical step in developing safety-critical autonomous systems. Direct estimation methods such as Monte Carlo sampling are often impractical due to the rarity of failures in these systems. Existing importance sampling approaches do not scale to sequential decision-making systems with large state spaces and long horizons. We propose an adaptive importance sampling algorithm to address these limitations. Our method minimizes the forward Kullback-Leibler divergence between a state-dependent proposal distribution and a relaxed form of the optimal importance sampling distribution. Our method uses Markov score ascent methods to estimate this objective. We evaluate our approach on four sequential systems and show that it provides more accurate failure probability estimates than baseline Monte Carlo and importance sampling techniques. This work is open sourced.
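To make the motivation concrete, the sketch below contrasts plain Monte Carlo with importance sampling on a toy rare event: the probability that a standard normal variable exceeds 4. This is not the paper's algorithm. The one-dimensional problem, the fixed mean-shifted Gaussian proposal, and the names `mc_estimate` and `is_estimate` are all assumptions chosen for illustration; an adaptive method would instead learn the proposal.

```python
import math
import numpy as np

def mc_estimate(threshold, n, rng):
    """Plain Monte Carlo: fraction of nominal samples that exceed the threshold."""
    x = rng.standard_normal(n)
    return float(np.mean(x > threshold))

def is_estimate(threshold, n, rng, shift):
    """Importance sampling with proposal q = N(shift, 1) in place of p = N(0, 1)."""
    x = rng.normal(loc=shift, scale=1.0, size=n)
    # log p(x) - log q(x); the Gaussian normalizing constants cancel.
    log_w = -0.5 * x**2 + 0.5 * (x - shift) ** 2
    # Weighted failure indicator, averaged over proposal samples.
    return float(np.mean(np.exp(log_w) * (x > threshold)))

rng = np.random.default_rng(0)
threshold = 4.0
# Exact Gaussian tail probability, used only as a reference value.
p_true = 0.5 * math.erfc(threshold / math.sqrt(2.0))
p_mc = mc_estimate(threshold, 10_000, rng)
p_is = is_estimate(threshold, 10_000, rng, shift=threshold)
print(f"exact={p_true:.3e}  MC={p_mc:.3e}  IS={p_is:.3e}")
```

With 10,000 samples, plain Monte Carlo expects to see fewer than one failure, so its estimate is usually zero or far too coarse, whereas the shifted proposal places most samples in the failure region and corrects for the shift through the weights.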

Paper Structure

This paper contains 27 sections, 18 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Comparison of the proposed approach and Monte Carlo (MC) sampling for estimating failure probabilities in an inverted pendulum with a rule-based controller under torque disturbances. Failures occur when the magnitude of the pendulum's angle from the vertical exceeds a threshold. The left figure shows the failure probability estimated by our method during training, compared to MC. Our method is more accurate than MC while using fewer samples. The middle two figures compare 200 pendulum angle trajectories over time, sampled from MC and from our method after training. Our method discovers failure trajectories (red) across modes, while all MC samples are safe (blue). The two right figures show the torque disturbances applied at each state in the sampled trajectories. Our method learns to add positive torque disturbances for positive angles (left of vertical) and negative torque disturbances for negative angles, enabling efficient sampling for failure probability estimation.
  • Figure 2: Illustration of $\tilde{p}_{\beta}(\tau)$.
  • Figure 3: The environments used in failure probability estimation experiments.
  • Figure 4: States visited and disturbances applied by the cross-entropy method (CEM, above) and SPAIS (below) after training.
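The idea of a state-dependent proposal with per-timestep disturbances, as described in the TL;DR and in the Figure 1 caption, can be sketched on a toy system. Everything in this example is an assumption made for illustration: the scalar linear dynamics, the failure threshold, the function name `rollout`, and in particular the hand-picked linear `gain`, which stands in for a learned proposal. It shows only how the trajectory importance weight is built up as a sum of per-step log ratios; it does not implement SPAIS or its training procedure.

```python
import numpy as np

def rollout(rng, n_traj, horizon, gain, a=0.9, sigma=0.1, threshold=1.0):
    """Simulate x_{t+1} = a * x_t + w_t and accumulate per-step log weights.

    Nominal disturbance:  w_t ~ N(0, sigma^2)
    Proposal disturbance: w_t ~ N(gain * x_t, sigma^2)  (state-dependent mean)
    A trajectory fails the first time |x_t| exceeds the threshold.
    """
    x = np.zeros(n_traj)
    log_w = np.zeros(n_traj)
    failed = np.zeros(n_traj, dtype=bool)
    for _ in range(horizon):
        active = ~failed  # trajectories stop at their first failure
        mean = gain * x[active]  # push the state further from zero
        w = mean + sigma * rng.standard_normal(int(active.sum()))
        # log p(w) - log q(w | x) for two Gaussians sharing the same sigma
        log_w[active] += (-(w**2) + (w - mean) ** 2) / (2.0 * sigma**2)
        x[active] = a * x[active] + w
        failed |= np.abs(x) > threshold
    return failed, log_w

# gain = 0 recovers nominal (Monte Carlo) sampling with unit weights.
failed_mc, logw_mc = rollout(np.random.default_rng(0), 2_000, 50, gain=0.0)
# A positive gain adds positive disturbances at positive states and vice versa.
failed_is, logw_is = rollout(np.random.default_rng(1), 2_000, 50, gain=0.2)
# Importance sampling estimate: weighted failure indicator averaged over trajectories.
p_hat = float(np.mean(np.exp(logw_is) * failed_is))
print(f"failures under nominal: {failed_mc.mean():.4f}")
print(f"failures under proposal: {failed_is.mean():.4f}, weighted estimate: {p_hat:.3e}")
```

Because the proposal mean depends on the current state, the same rule produces failures in both directions, loosely mirroring the behavior the Figure 1 caption describes. A fixed gain like this one can still give a high-variance estimate, which is the problem that adapting the proposal is meant to address.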