Accurately Predicting Probabilities of Safety-Critical Rare Events for Intelligent Systems

Ruoxuan Bai, Jingxuan Yang, Weiduo Gong, Yi Zhang, Qiujing Lu, Shuo Feng

TL;DR

The paper addresses the challenge of accurately predicting safety-critical rare events in high-dimensional intelligent systems. It defines criticality as $\mathbb{P}(A|\mathbf{X})$, the probability of a safety-critical event $A$ given the current state variables $\mathbf{X}$, and tackles extreme data imbalance (imbalance ratio, IR, $>10^4$) with a three-stage, data-densifying framework. The framework combines an unsupervised reward-based stage, an enhanced bilateral-branch supervised stage, and a dense reinforcement-learning stage to progressively improve precision and recall. Empirical evaluation on Lunar Lander and Bipedal Walker demonstrates substantial gains in AUC and detection rates, with dense DQN yielding near-perfect performance on Lunar Lander and meaningful improvements on the walker task. The approach provides a practical path toward reliable criticality assessment in real-world autonomous systems, with potential extensions in theoretical analysis and applicability to higher IR regimes.

Abstract

Intelligent systems are increasingly integral to our daily lives, yet rare safety-critical events present significant latent threats to their practical deployment. Addressing this challenge hinges on accurately predicting the probability of safety-critical events occurring within a given time step from the current state, a metric we define as 'criticality'. Predicting criticality is difficult because rare events cause extreme data imbalance in the high-dimensional variables associated with them, a challenge we refer to as the curse of rarity. Existing methods tend to be either overly conservative or prone to overlooking safety-critical events, so they struggle to achieve both high precision and high recall, which severely limits their applicability. This study aims to develop a criticality prediction model that achieves both high precision and high recall when evaluating the criticality of safety-critical autonomous systems. We propose a multi-stage learning framework designed to progressively densify the dataset, mitigating the curse of rarity across stages. To validate our approach, we evaluate it in two cases: lunar lander and bipedal walker scenarios. The results demonstrate that our method surpasses traditional approaches, providing a more accurate and dependable assessment of criticality in intelligent systems.
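
To make the precision/recall tension concrete, the following minimal sketch works through the arithmetic for a hypothetical dataset at IR $=10^4$. It assumes the imbalance ratio is the number of negative samples divided by the number of positive samples; the sample counts and the two predictors are invented for illustration and are not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical dataset (assumption): 100 critical samples, 1,000,000 non-critical ones.
n_pos, n_neg = 100, 1_000_000
imbalance_ratio = n_neg / n_pos  # 10^4

# "Overly conservative" predictor: flags every critical sample,
# but also flags 1% of the non-critical samples as false alarms.
conservative = precision_recall(tp=100, fp=10_000, fn=0)

# "Overlooking" predictor: raises no false alarms,
# but detects only 10 of the 100 critical samples.
overlooking = precision_recall(tp=10, fp=0, fn=90)

print(f"IR = {imbalance_ratio:.0f}")
print(f"conservative: precision={conservative[0]:.4f}, recall={conservative[1]:.2f}")
print(f"overlooking:  precision={overlooking[0]:.4f}, recall={overlooking[1]:.2f}")
```

In this toy setting, a false-alarm rate of only 1% already pushes precision below 1% for the conservative predictor, which is why the negatives need to be thinned out before a classifier can do well on both metrics.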

Paper Structure

This paper contains 10 sections, 15 equations, and 6 figures.

Figures (6)

  • Figure 1: Overview of the multi-stage learning framework. Our approach consists of three stages. In the first stage, unsupervised learning removes obviously non-critical samples and reduces the imbalance ratio. In the second stage, labeled data is used to train a supervised classification model that further categorizes the samples the unsupervised model could not distinguish. In the third stage, the focus shifts from the remaining unclassified samples to improving the accuracy of the predicted criticality: a dense DQN method is developed to fine-tune the last layers of the classification model.
  • Figure 2: Structure of the reward model in stage one. The reward model is composed of a backbone and a linear layer that maps features to a scalar $r$. Pairs of positive and negative samples are taken as input, and the model is trained to output a higher $r$ for positive samples than for negative samples.
  • Figure 3: Structure of the enhanced BBN in stage two. The upper branch focuses on representation learning for positive samples using class-balanced sampling and focal loss, while the lower branch focuses on representation learning for negative samples using uniform sampling and cross-entropy loss. The features are then mixed using the adaptive parameter $a$. Finally, a normalized classifier is adopted to mitigate the model's preference for negative samples.
  • Figure 4: Overview of cases
  • Figure 5: Outputs of reward model
  • ...and 1 more figures
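
The captions for Figures 2 and 3 name several training ingredients: a reward model that scores positive samples above negative ones, focal and cross-entropy losses for the two branches, and feature mixing with the parameter $a$. The sketch below illustrates one common form of each in plain Python. It is a sketch under assumptions, not the paper's implementation: the logistic pairwise loss, the focusing exponent `gamma=2.0`, the convex-combination mixing rule, and all function names are hypothetical choices made for illustration.

```python
import math


def pairwise_ranking_loss(r_pos, r_neg):
    """Logistic pairwise loss -log(sigmoid(r_pos - r_neg)) (assumed form).

    The loss is small when the positive sample receives the higher score r.
    """
    diff = r_pos - r_neg
    # Numerically stable softplus(-diff).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def focal_loss(p, y, gamma=2.0, eps=1e-12):
    """Binary focal loss for predicted positive-class probability p and label y in {0, 1}.

    Well-classified samples (p_t close to 1) are down-weighted by (1 - p_t)^gamma.
    """
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(p, y):
    """Binary cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(p, y, gamma=0.0)


def mix_features(f_upper, f_lower, a):
    """Mix the two branch feature vectors as a * f_upper + (1 - a) * f_lower (assumed rule)."""
    return [a * u + (1.0 - a) * l for u, l in zip(f_upper, f_lower)]


# Usage with made-up numbers.
print(pairwise_ranking_loss(2.0, -1.0))   # positive ranked higher: small loss
print(pairwise_ranking_loss(-1.0, 2.0))   # positive ranked lower: large loss
print(focal_loss(0.9, 1), cross_entropy(0.9, 1))
print(mix_features([1.0, 2.0], [3.0, 4.0], a=0.25))
```

With these definitions, an easy positive sample (`p = 0.9`) contributes far less under focal loss than under cross-entropy, which is the usual motivation for using focal loss on the branch that concentrates on the rare class.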