Using Surprise Index for Competency Assessment in Autonomous Decision-Making
Akash Ratheesh, Ofer Dagan, Nisar R. Ahmed, Jay McMahon
TL;DR
The paper addresses how to evaluate the competency of autonomous decision-making under uncertainty by introducing a Surprise Index (SI) that quantifies how surprising observed evidence is under a probabilistic model. It derives a closed-form SI for jointly Gaussian evidence via the Mahalanobis distance, connects it to a chi-square limit, and extends it to dynamic Gauss-Markov filtering with sigma-point uncertainty propagation. The authors validate the approach on a 2-D GPS localization task and apply it to a nonlinear spacecraft maneuver problem with an RL policy, showing that SI tracks expected behavior under nominal conditions and flags deviations with a lower SI; they also discuss computational considerations. The work provides a probabilistic, interpretable complement to NEES/NIS tests for model validity and RL policy evaluation, with potential utility for tuning filters and planners in dynamic, uncertain environments.
Abstract
This paper considers the problem of evaluating an autonomous system's competency in performing a task, particularly when working in dynamic and uncertain environments. The inherent opacity of machine learning models from the user's perspective, often described as a "black box", poses a challenge. To overcome this, we propose using a measure called the surprise index, which leverages available measurement data to quantify whether the dynamic system performs as expected. We show that the surprise index can be computed in closed form for dynamic systems, provided that the joint (marginal) distribution of the observed evidence in the probabilistic model is multivariate Gaussian. We then apply it to a nonlinear spacecraft maneuver problem, where actions are chosen by a reinforcement learning agent, and show that it can indicate how well the trajectory follows the required orbit.
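To make the closed-form Gaussian case concrete, the following is a minimal sketch, not the authors' implementation. It assumes the surprise index is defined as the probability mass of all outcomes that are no more likely than the observed one, so that for Gaussian evidence SI = P(D^2 >= d^2), where d^2 is the squared Mahalanobis distance of the observation and D^2 is chi-square distributed. The sketch is restricted to 2-D evidence (as in a planar position fix), where the chi-square tail probability reduces to exp(-d^2 / 2). The function name `surprise_index_2d` and the numbers in the usage lines are hypothetical.

```python
import math


def surprise_index_2d(z, mean, cov):
    """Surprise index of a 2-D observation z under a Gaussian N(mean, cov).

    Assumption: SI is the probability of drawing an outcome that is no more
    likely than z, i.e. SI = P(D^2 >= d^2) with d^2 the squared Mahalanobis
    distance. For 2 degrees of freedom this tail probability is exp(-d^2 / 2).
    """
    dx, dy = z[0] - mean[0], z[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Squared Mahalanobis distance r^T cov^{-1} r, using the explicit 2x2 inverse.
    d2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * d2)


# Hypothetical example: predicted position at the origin, 2 m standard deviation per axis.
predicted = (0.0, 0.0)
cov = ((4.0, 0.0), (0.0, 4.0))

si_nominal = surprise_index_2d((0.5, -0.5), predicted, cov)  # close to the prediction
si_outlier = surprise_index_2d((6.0, 0.0), predicted, cov)   # 3 standard deviations away
print(si_nominal, si_outlier)
```

Under this definition an observation at the predicted mean gives SI = 1, and SI shrinks toward 0 as the observation moves into the tails, which matches the summary's statement that deviations are flagged by a lower SI. For evidence of other dimensions, the same idea would use the chi-square survival function with the matching number of degrees of freedom.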
