Detection of Unknown Errors in Human-Centered Systems
Aranyak Maity, Ayan Banerjee, Sandeep Gupta
TL;DR
This paper addresses the challenge of detecting unknown errors in safety-critical, human-centered AI systems for which no predefined error signatures exist. It introduces a two-stage, model-agnostic framework: first, a dynamics-induced hybrid RNN (DiH-RNN) learns physics-guided surrogate model coefficients from operational data; second, conformal inference on the coefficient vector $\omega$ establishes a conformal range $d$, which enables signal temporal logic (STL) safety checks on the coefficient model. Because conformance is assessed on model coefficients rather than on outputs, the method aims to detect unknown-unknown errors earlier than traditional runtime monitors, with the degree of violation captured by the robustness of the STL properties. Applied to automated insulin delivery, aircraft pitch control, and autonomous driving, the approach achieves early, high-precision detection, with reported positive predictive values ($PPV$) of up to $100\%$ across multiple unknown-error scenarios. The work highlights the value of coefficient-centric, physics-guided monitoring for real-time safety and suggests directions for improving data efficiency and broadening validation.
Abstract
Artificial intelligence-enabled systems are increasingly deployed in real-world, safety-critical settings involving human participants. It is vital to ensure the safety of such systems and to halt an erroneous system before it harms human participants. We propose a model-agnostic approach to detecting unknown errors in such human-centered systems that requires no knowledge of the error signatures. Our approach employs dynamics-induced hybrid recurrent neural networks (DiH-RNN) to construct physics-based models from operational data, coupled with conformal inference to assess errors in the underlying model caused by violations of physical laws. This facilitates early detection of unknown errors, before unsafe shifts in the operational data distribution occur. We evaluate our framework on multiple real-world safety-critical systems and show that our technique outperforms the existing state of the art in detecting unknown errors.
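
To make the idea of conformal inference on model coefficients more concrete, the following is a minimal Python sketch of a generic split-conformal check on coefficient vectors. It is not the authors' implementation: the nonconformity score (Euclidean distance to a reference coefficient vector), the function names `conformal_range` and `is_nonconformal`, the miscoverage level `alpha`, and the synthetic coefficient data are all assumptions made for illustration. In the actual framework the coefficient vectors $\omega$ would come from the DiH-RNN rather than from a random generator.

```python
import math
import numpy as np


def conformal_range(calib_omega, center, alpha=0.05):
    """Split-conformal threshold d on coefficient vectors (illustrative).

    calib_omega: (n, k) coefficient vectors from error-free operation.
    center: (k,) reference coefficients, estimated on data that is
            separate from calib_omega.
    alpha: target miscoverage level (an assumed parameter).
    """
    calib_omega = np.asarray(calib_omega, dtype=float)
    center = np.asarray(center, dtype=float)
    n = calib_omega.shape[0]
    # Nonconformity score: distance of each coefficient vector to the center.
    scores = np.sort(np.linalg.norm(calib_omega - center, axis=1))
    # Finite-sample-corrected rank used in split conformal prediction.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return float(scores[rank - 1])


def is_nonconformal(omega, center, d):
    """True if omega lies outside the conformal range d around center."""
    diff = np.asarray(omega, dtype=float) - np.asarray(center, dtype=float)
    return float(np.linalg.norm(diff)) > d


# Synthetic stand-in for coefficients learned from error-free operation.
rng = np.random.default_rng(0)
nominal = rng.normal(loc=[1.0, -0.5], scale=0.01, size=(200, 2))
center = nominal[:100].mean(axis=0)            # reference from first half
d = conformal_range(nominal[100:], center)     # calibrate on second half

nominal_flag = is_nonconformal([1.0, -0.5], center, d)   # near the center
shifted_flag = is_nonconformal([1.3, -0.5], center, d)   # large coefficient shift
```

Under these assumptions, a coefficient vector close to the nominal values stays inside the range, while a large shift in one coefficient is flagged. The reference coefficients are estimated on data separate from the calibration set so that the calibration scores remain exchangeable with the score of a new error-free sample, which is what the finite-sample rank correction relies on.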
