World Models for Anomaly Detection during Model-Based Reinforcement Learning Inference
Fabian Domberg, Georg Schildbach
TL;DR
The paper addresses safety for learning-based controllers by using world models during inference to compare forecasted state trajectories against actual observations, triggering safety actions when discrepancies exceed a threshold. It extends DreamerV3-style world models to inference-time anomaly detection, showing that both global and local environmental or actuator changes induce measurable prediction errors across simulated and real robotic platforms. Key contributions include a horizon-based error formulation, visualization of prediction gaps in image space (which aids debugging and interpretability), and demonstrations of sim2real applicability. The work suggests that, while they do not guarantee safety, inference-time world-model discrepancies offer a universal, task-agnostic mechanism to detect unfamiliar situations and guide corrective actions in real-world deployments.
Abstract
Learning-based controllers are often purposefully kept out of real-world applications due to concerns about their safety and reliability. We explore how state-of-the-art world models in Model-Based Reinforcement Learning can be utilized beyond the training phase to ensure a deployed policy only operates within regions of the state-space it is sufficiently familiar with. This is achieved by continuously monitoring discrepancies between a world model's predictions and observed system behavior during inference. It allows for triggering appropriate measures, such as an emergency stop, once an error threshold is surpassed. This does not require any task-specific knowledge and is thus universally applicable. Simulated experiments on established robot control tasks show the effectiveness of this method, recognizing changes in local robot geometry and global gravitational magnitude. Real-world experiments using an agile quadcopter further demonstrate the benefits of this approach by detecting unexpected forces acting on the vehicle. These results indicate how even in new and adverse conditions, safe and reliable operation of otherwise unpredictable learning-based controllers can be achieved.
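The monitoring loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `horizon_error` and `AnomalyMonitor`, the use of mean Euclidean distance as the discrepancy measure, and the threshold value are all assumptions made for illustration, and the hard-coded predicted trajectory stands in for a rollout that would in practice come from the learned world model.

```python
import math
from typing import Sequence

State = Sequence[float]


def horizon_error(predicted: Sequence[State], observed: Sequence[State]) -> float:
    """Mean Euclidean distance between predicted and observed states.

    Assumption: the paper's exact error formulation is not given here;
    this averages the per-step distance over the prediction horizon.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(p, o) for p, o in zip(predicted, observed))
    return total / len(predicted)


class AnomalyMonitor:
    """Hypothetical monitor: flags an anomaly above a fixed error threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def check(self, predicted: Sequence[State], observed: Sequence[State]) -> bool:
        # True means "trigger a safety measure", e.g. an emergency stop.
        return horizon_error(predicted, observed) > self.threshold


# Toy data: a three-step forecast and two possible observed trajectories.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
nominal = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]    # small, expected deviation
disturbed = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # drifts away from the forecast

monitor = AnomalyMonitor(threshold=0.5)  # threshold chosen arbitrarily
nominal_flag = monitor.check(predicted, nominal)
disturbed_flag = monitor.check(predicted, disturbed)
print(nominal_flag, disturbed_flag)
```

With these toy values the nominal trajectory stays below the threshold and the disturbed one exceeds it. In a deployment the threshold would have to be calibrated against prediction errors seen under familiar conditions, since the check uses no task-specific knowledge beyond the model's own forecasts.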
