Accurately Predicting Probabilities of Safety-Critical Rare Events for Intelligent Systems
Ruoxuan Bai, Jingxuan Yang, Weiduo Gong, Yi Zhang, Qiujing Lu, Shuo Feng
TL;DR
The paper addresses the challenge of accurately predicting safety-critical rare events in high-dimensional intelligent systems. It defines criticality as the conditional probability $\mathbb{P}(A|\mathbf{X})$ of a safety-critical event $A$ given the current state $\mathbf{X}$, and tackles extreme data imbalance (imbalance ratio, IR, $>10^4$) with a three-stage framework that progressively densifies the data. The framework combines an unsupervised reward-based stage, an enhanced bilateral-branch supervised stage, and a dense reinforcement-learning stage to improve precision and recall step by step. Empirical evaluation on Lunar Lander and Bipedal Walker shows substantial gains in AUC and detection rates, with dense DQN yielding near-perfect performance on Lunar Lander and meaningful improvements on the walker task. The approach offers a practical path toward reliable criticality assessment in real-world autonomous systems, with theoretical analysis and higher-IR regimes left as possible extensions.
Abstract
Intelligent systems are increasingly integral to our daily lives, yet rare safety-critical events pose significant latent threats to their practical deployment. Addressing this challenge hinges on accurately predicting the probability that a safety-critical event occurs within a given time step from the current state, a metric we define as 'criticality'. Predicting criticality is difficult because of the extreme data imbalance that arises when the events are rare and the variables associated with them are high-dimensional, a challenge we refer to as the curse of rarity. Existing methods tend to be either overly conservative or prone to overlooking safety-critical events, so they struggle to achieve high precision and high recall at the same time, which severely limits their applicability. This study aims to develop a criticality prediction model that achieves both high precision and high recall when evaluating the criticality of safety-critical autonomous systems. We propose a multi-stage learning framework designed to progressively densify the dataset, mitigating the curse of rarity across stages. To validate our approach, we evaluate it in two cases: the lunar lander and bipedal walker scenarios. The results demonstrate that our method surpasses traditional approaches, providing a more accurate and dependable assessment of criticality in intelligent systems.
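To make the precision-recall tension described in the abstract concrete, the following minimal Python sketch works through the arithmetic on a hypothetical dataset. All counts, the helper names `imbalance_ratio` and `precision_recall`, and the two toy predictors are illustrative assumptions, not data or methods from the paper.

```python
def imbalance_ratio(n_negative, n_positive):
    """Number of non-critical samples per critical sample."""
    return n_negative / n_positive


def precision_recall(true_pos, false_pos, false_neg):
    """Precision and recall computed from confusion-matrix counts."""
    flagged = true_pos + false_pos   # states the predictor marks as critical
    actual = true_pos + false_neg    # states that are truly critical
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical dataset: 10 critical states among 200,000 non-critical ones.
N_POS, N_NEG = 10, 200_000
ir = imbalance_ratio(N_NEG, N_POS)

# Overly conservative predictor (assumed): flags 1,000 states and
# catches all 10 critical ones, at the cost of 990 false alarms.
conservative = precision_recall(true_pos=10, false_pos=990, false_neg=0)

# Predictor that overlooks events (assumed): flags only 2 states,
# both truly critical, and misses the other 8.
overlooking = precision_recall(true_pos=2, false_pos=0, false_neg=8)

print(f"IR = {ir:.0f}")
print(f"conservative: precision={conservative[0]:.2f}, recall={conservative[1]:.2f}")
print(f"overlooking:  precision={overlooking[0]:.2f}, recall={overlooking[1]:.2f}")
```

Under these assumed counts, the conservative predictor reaches full recall with very low precision, while the other reaches full precision with low recall. Neither is useful alone, which is the gap a criticality model with both high precision and high recall is meant to close.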
